Surya Saha
Sol Genomics Network (SGN)
Boyce Thompson Institute, Ithaca, NY
ss2489@cornell.edu // @SahaSurya
BTI PGRP Internship Program 2015
http://www.acgt.me/blog/2015/3/7/next-generation-sequencing-must-die
Hello Experiment!
• Experimental design for survey
Sample size
Locations
Phenotypes
6/11/2015 BTI PGRP SummerInternshipProgram2015 2
Early Blight infected tomato plants
http://www.longislandhort.cornell.edu/vegpath/photos/early_blight.htm
• Experimental design to identify
genetic differences
PCR-based
• Simple Sequence Repeats
• Other markers
Sequencing-based
• Genes of interest
• Single Nucleotide Polymorphisms
• Gene expression
• Genotyping by Sequencing
Why Sequencing?
• Targeted interrogation of genome
• Economical
• Technological developments
• High-throughput assays
• But requires subsequent validation
Sequencing: Then and Now
[Timeline figure, 1953-2014]
1953: DNA structure discovery
1977: Sanger DNA sequencing by chain-terminating inhibitors
1984: Epstein-Barr virus (170 Kb)
1987: ABI 370 sequencer
1995: Haemophilus influenzae (1.83 Mb)
2001: Homo sapiens (3.0 Gb)
2005-2014 (ticks at 2005, 2007, 2011, 2012, 2013, 2014): 454, Solexa, SOLiD, Illumina, Ion Torrent, PacBio, Illumina HiSeq X, Nanopore MinION
2014: Pinus taeda (24 Gb)
Slide design credit: Aureliano Bombarely
First generation sequencing
Sanger. Annu Rev Biochem. 1988;57:1-28.
Thanks to Nick Loman for the mention
Maxam-Gilbert method (1973)
http://en.wikipedia.org/wiki/File:Maxam-Gilbert_sequencing_en.svg
https://www.nationaldiagnostics.com/electrophoresis/article/maxam-gilbert-sequencing
Sanger method (1977)
Frederick Sanger
13 Aug 1918 – 19 Nov 2013
Won the Nobel Prize in Chemistry in 1958 and 1980. Published the dideoxy chain-termination method, or “Sanger method”, in 1977.
http://dailym.ai/1f1XeTB
http://en.wikipedia.org/wiki/File:Sanger-sequencing.svg
http://en.wikipedia.org/wiki/File:Radioactive_Fluorescent_Seq.jpg
First generation sequencing
• Very high quality sequences (99.999% or Q50)
• Very low throughput
Platform                          Run Time  Read Length  Reads / Run  Total nucleotides sequenced  Cost / MB
Capillary Sequencing (ABI3730xl)  20m-3h    400-900 bp   96 or 384    1.9-84 Kb                    $2400
http://www.hindawi.com/journals/bmri/2012/251364/tab1/
Next generation sequencing
https://twitter.com/kbradnam/status/443153578429923328
• Second generation
• Third generation
• Fourth generation
• Next-next-generation
• Next-next-next generation
http://www.acgt.me/blog/2015/3/10/next-generation-sequencing-must-diepart-2
Mention the specific technology used to generate the data
– Illumina HiSeq/MiSeq/NextSeq
– Pacific Biosciences RS/RS II
– Ion Torrent Proton/PGM
– SOLiD
– Oxford Nanopore MinION
454 Pyrosequencing
One purified DNA fragment, to one bead, to one read.
http://www.genengnews.com/
GS FLX Titanium
https://mariamuir.com/wp-content/uploads/2013/04/rip.gif
Illumina
                             MiSeq       NextSeq 500      HiSeq 2500/3000/4000       HiSeq X Ten
Output                       0.3-15 Gb   20-120 Gb        10-1500 Gb                 900-1800 Gb
Number of reads / flow cell  25 million  130-400 million  300 million - 2.5 billion  3 billion
Read length                  2x300 bp    2x150 bp         2x250 - 2x125 bp           2x150 bp
Cost                         $99K        $250K            $740K                      $10M (10 units)
Source: Illumina
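The output figures on this slide follow roughly from the number of reads per flow cell multiplied by the bases in each read pair. A minimal sketch of that arithmetic, using the upper-bound values quoted above; the function name `output_gb` is illustrative and not from the slides:

```python
def output_gb(reads, read_length_bp, paired=True):
    """Approximate run output in gigabases (Gb)."""
    # A paired-end fragment contributes two reads of read_length_bp each.
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return reads * bases_per_fragment / 1e9

print(output_gb(25_000_000, 300))     # 25 million reads, 2x300 bp -> 15.0
print(output_gb(3_000_000_000, 150))  # 3 billion reads, 2x150 bp -> 900.0
```

The results match the 15 Gb and 900 Gb figures in the Output row, which suggests the read counts refer to read pairs.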
$1000 human genome??
Illumina
Mardis 2008. Annu. Rev. Genomics Hum. Genet. 2008. 9:387–402
Illumina: TruSeq Long Read
Voskoboynik et al., eLife 2013;2:e00569
Pacific Biosciences SMRT sequencing
Single Molecule Real Time sequencing
http://smrt.med.cornell.edu/images/pacbio_library_prep-1.gif
Pacific Biosciences SMRT sequencing
Error correction methods
Hierarchical genome-assembly process (HGAP)
English et al., PLoS ONE, 2012
PBJelly
PBcR pipeline
Pacific Biosciences SMRT sequencing
Read Lengths
http://www.igs.umaryland.edu/labs/grc/
Mean Read Length: 8391 bp
Maximum Subread Length: 24585 bp
Genome Assembly with Long Reads
Oxford Nanopore
https://www.nanoporetech.com/
http://erlichya.tumblr.com/post/66376172948/hands-on-experience-with-oxford-nanopore-minion
http://halegrafx.com/vector-art/free-vector-despicable-me-minions/
https://theconversation.com/how-a-small-backpack-for-fast-genomic-sequencing-is-helping-combat-ebola-41863
Sequencing Trends
https://www.google.com/trends/
[Charts: "Number of Publications" (2008-2014), "Increase in Number of Publications" (2009-2014), and "% Increase in Number of Publications" (2009-2014). Series: Illumina, Pacific Biosciences, Roche 454, Ion Torrent; the % increase chart omits Illumina.]
Hi-C Crosslinking
Others
• Ion Torrent Proton/PGM
• SOLiD
• Helicos
• Supporting technologies
– BioNano
– Nabsys
– OpGen
– 10X Genomics
– Fluidigm
Comparison
Next generation sequencing
Platform             Run Time     Read Length   Quality                      Total nucleotides sequenced  Cost/MB
454 Pyrosequencing   24h          700 bp        Q20-Q30                      1 GB                         $10
Illumina MiSeq       27h          2x300 bp      >Q30                         15 GB                        $0.15
Illumina HiSeq 2500  1 - 10 days  2x250 bp      >Q30                         3000 GB                      $0.05
Ion Torrent          2h           400 bp        >Q20                         50 MB - 1 GB                 $1
Pacific Biosciences  30m - 4h     10kb - >40kb  >Q50 consensus, >Q10 single  500 - 1000 MB / SMRT cell    $0.13 - $0.60
http://www.hindawi.com/journals/bmri/2012/251364/
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431227
http://omicsmaps.com/
Next Generation Genomics:
World Map of High-throughput Sequencers
https://flxlexblog.wordpress.com/2014/06/11/developments-in-next-generation-sequencing-june-2014-edition/
Real cost of Sequencing!!
Sboner et al., Genome Biology, 2011
Sequencing Data and Concepts
Library Types
Single end
Paired end (PE, 150-800 bp, Fwd: /1, Rev: /2)
Mate pair (MP, 2Kb to 20 Kb)
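The /1 and /2 suffixes mark the forward and reverse read of the same fragment, so mates can be matched by stripping the suffix. A minimal sketch, assuming read IDs end in exactly "/1" or "/2"; the function names and the read IDs are hypothetical:

```python
def split_pair_suffix(read_id):
    """Return (fragment_name, mate), where mate is 1, 2, or None."""
    if read_id.endswith("/1"):
        return read_id[:-2], 1
    if read_id.endswith("/2"):
        return read_id[:-2], 2
    return read_id, None  # single-end read or unknown naming scheme

def pair_reads(read_ids):
    """Group read IDs by fragment name: {name: {mate: read_id}}."""
    pairs = {}
    for rid in read_ids:
        name, mate = split_pair_suffix(rid)
        pairs.setdefault(name, {})[mate] = rid
    return pairs

ids = ["frag7/1", "frag7/2", "frag9/1"]  # hypothetical read IDs
pairs = pair_reads(ids)
# Keep only fragments for which both mates are present.
complete = [name for name, mates in pairs.items() if 1 in mates and 2 in mates]
print(complete)  # ['frag7']
```

Newer Illumina headers encode the mate number differently, so this suffix check is only an illustration of the convention named on the slide.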
[Diagram: forward (F) and reverse (R) read orientations for each library type; platform labels: 454/Roche, Illumina]
Slide credit: Aureliano Bombarely
Implications of Choice of Library
Slide credit: Aureliano Bombarely
[Diagram: assembly hierarchy]
Reads -> Consensus sequence (Contig)
Contigs + paired read information -> Scaffold (or Supercontig), gaps shown as NNNNN
Scaffolds + genetic information (markers) or optical maps -> Pseudomolecule (or ultracontig)
Multiplexing Libraries
Use of different tags (4-6 nucleotides) to identify
different samples in the same lane/sector.
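Sorting pooled reads back into samples by their tag (demultiplexing) can be sketched as below. This is a simplified illustration: it assumes the 6-nucleotide tag sits at the start of each read and must match exactly. The two tags are the ones drawn on this slide; the sample names and read sequences are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample sheet: tag -> sample name
TAGS = {"AGTCGT": "sample_A", "TGAGCA": "sample_B"}
TAG_LEN = 6

def demultiplex(reads, tags=TAGS, tag_len=TAG_LEN):
    """Bin reads by leading tag and strip the tag from each read."""
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        # Reads whose tag is not in the sample sheet go to "unassigned".
        bins[tags.get(tag, "unassigned")].append(insert)
    return dict(bins)

reads = ["AGTCGTACGTACGT", "TGAGCATTTTGGGG", "AGTCGTCCCCAAAA", "NNNNNNACGT"]
bins = demultiplex(reads)
print(bins)
```

Production demultiplexers usually tolerate a mismatch or two in the tag; exact matching keeps the sketch short.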
Slide credit: Aureliano Bombarely
[Diagram: reads tagged AGTCGT and reads tagged TGAGCA are pooled and sequenced together]
Fasta files:
It is a text-based format for representing either nucleotide sequences or peptide
sequences, in which nucleotides or amino acids are represented using single-letter codes.
-Wikipedia
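A minimal sketch of reading FASTA records in plain Python. It relies on the standard convention that each record starts with a ">" header line followed by one or more sequence lines; the function name and the example records are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)  # sequence may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)

example = """>seq1 hypothetical nucleotide record
ACGTACGT
ACGT
>pep1 hypothetical peptide record
MKVL
""".splitlines()

records = list(read_fasta(example))
for header, seq in records:
    print(header, len(seq))
```

For real projects, a maintained parser such as the one in Biopython is usually a safer choice than a hand-written loop.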
File Formats
Slide credit: Aureliano Bombarely
Fastq files:
FASTQ format is a text-based format for storing both a biological sequence (usually
nucleotide sequence) and its corresponding quality scores.
-Wikipedia
• Single-line ID with the at symbol (“@”) in the first column.
• The sequence can span multiple lines after the ID line.
• Single line with the plus symbol (“+”) in the first column, marking the start of the quality values.
• The “+” line may repeat the ID.
• Quality values can span multiple lines after the “+” line, but their total length must equal the sequence length.
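The rules above can be sketched as a small reader. This is a simplified illustration that assumes the common four-line layout (one sequence line and one quality line per record) rather than the multi-line variant the format also allows; the function name and the example read are hypothetical.

```python
def read_fastq(lines):
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if len(rows) % 4 != 0:
        raise ValueError("expected a multiple of four non-blank lines")
    for i in range(0, len(rows), 4):
        id_line, seq, plus, qual = rows[i:i + 4]
        if not id_line.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record: " + id_line)
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch: " + id_line)
        yield id_line[1:], seq, qual

example = ["@read1/1", "ACGTN", "+", "IIII#"]  # hypothetical record
fastq_records = list(read_fastq(example))
print(fastq_records)
```

Because “@” and “+” are also valid quality characters, line position is a more reliable record delimiter than the leading symbol, which is one reason the four-line layout is preferred in practice.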
Slide credit: Aureliano Bombarely
Quality control: Encoding
Fastq files:
!"#$%&'()*+,-./0123456789 Offset by 33 (Phred+33)
KLMNOPQRSTUVWXYZ[]^_`abcdefgh Offset by 64 (Phred+64)
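Each quality character stores a Phred score as its ASCII code minus the offset. A minimal sketch of decoding a quality string; the function name and the example strings are illustrative:

```python
def decode_quality(qual, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores."""
    return [ord(ch) - offset for ch in qual]

print(decode_quality("!+5I"))            # Phred+33 -> [0, 10, 20, 40]
print(decode_quality("Kh", offset=64))   # Phred+64 -> [11, 40]
```

Decoding with the wrong offset shifts every score by 31, so a run of negative or implausibly high values is a quick hint that the other encoding applies.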
http://en.wikipedia.org/wiki/Phred_quality_score
Phred score of a base is:
Qphred = -10 × log10(e)
where e is the estimated probability of a base being wrong
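The formula and its inverse can be sketched in a few lines of Python; the function names are illustrative:

```python
import math

def phred_to_error(q):
    """Error probability e for a Phred score q: e = 10^(-q/10)."""
    return 10 ** (-q / 10)

def error_to_phred(e):
    """Phred score for an error probability e: q = -10 * log10(e)."""
    return -10 * math.log10(e)

for q in (10, 20, 30, 50):
    e = phred_to_error(q)
    print(f"Q{q}: error probability {e:g}, accuracy {100 * (1 - e):.3f}%")
```

Q20 corresponds to one error in 100 bases, Q30 to one in 1,000, and Q50 to one in 100,000, which is the 99.999% accuracy quoted for first generation sequencing.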
Pre-processing: Tools
Trimming
• FastQC
• FASTX toolkit
• Trimmomatic
• Scythe
Joining paired-end reads
• fastq-join
• FLASH
• PANDAseq
Sequencing done! Now What??
• 1 HiSeq run can produce up to 1500 GB (1.5 TB) of data
• How much is 250GB of data?
– 250,000,000,000 characters
– 3000 characters per sheet
– 100 sheets / cm
– Stack of ~8000m
Mount Everest - 8848m
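The paper-stack estimate can be checked with a few lines of arithmetic. A minimal sketch, assuming one byte per character and the sheet figures from the slide; the function name is illustrative:

```python
def stack_height_m(n_chars, chars_per_sheet=3000, sheets_per_cm=100):
    """Height in metres of a paper stack holding n_chars characters."""
    sheets = n_chars / chars_per_sheet
    height_cm = sheets / sheets_per_cm
    return height_cm / 100  # centimetres to metres

print(round(stack_height_m(250_000_000_000)))  # 250 GB -> 8333
print(stack_height_m(1_500_000_000_000))       # 1.5 TB -> 50000.0
```

So 250 GB comes to roughly 8.3 km of paper, a little short of Everest, while a full 1.5 TB run would stack to about 50 km.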
Increase in Sequencing Data
L. Stein, Genome Biology, 2010
Slide credit: Lukas Mueller
Big Data
Slide credit: Lukas Mueller
High Performance Computing
Powerful servers with large amounts of memory, compute cores, and disk
What is bioinformatics?
• Bioinformatics /baɪ.oʊˌɪnfərˈmætɪks/ is the application of computer science and information technology to the field of biology and medicine.
Slide credit: Lukas Mueller
Bioinformatics deals with
• Algorithms, databases and information systems, web technologies, artificial intelligence and soft computing, information and computation theory, software engineering, data mining, image processing, modeling and simulation, signal processing, discrete mathematics, control and system theory, circuit theory, and statistics.
• Generation of new knowledge in biology and medicine, and improving & discovering new models of computation (e.g. DNA computing, neural computing, evolutionary computing, immuno-computing, swarm-computing, cellular-computing).
Slide credit: Lukas Mueller
Bioinformatics can...
• Identify similar sequences
• Provide a putative function for a sequence
• Assemble sequences (genomes, transcriptomes)
• Annotate genomes
• Identify differentially expressed genes
• Build networks of genes or metabolites
• Determine phylogenetic relationships
• Mine literature for biological information
• Uncover differences between two genomes
• Calculate how a protein folds
Slide credit: Lukas Mueller
What can bioinformatics do for me?
• Majority of projects involve large datasets
• Speed up your research
• Enable you to ask new questions
• Basic knowledge of bioinformatics needed
• Extract information
• Transform information
• Run analyses
• Build hypotheses, etc.
Slide credit: Lukas Mueller
Linux
• UNIX-like, free and open source operating system
• Very stable, easy to use
• Created by Linus Torvalds in the 1990s as a student
• Adopted for most bioinformatics work
• Also: installed on cell phones, laptops, desktops, clusters, supercomputers
• Can run on your computer!
• Virtualized or native
http://www.linux-netbook.com/linux/distributions/
Slide credit: Lukas Mueller
Further Reading
Plant Bioinformatics Course
• Virtual machine setup instructions
• Slides for Linux, Sequencing, RNAseq, NGS Read Mapping and R graphics
• http://btiplantbioinfocourse.wordpress.com
Slide credit: Lukas Mueller
Scripting
• Scripts: Small programs written by the end-user that control the execution of other programs or perform a simple algorithm
• Extremely flexible
• Written in Shell, Perl, Python
• You can write them yourself!!!
Slide credit: Lukas Mueller
Perl
• Developed since the 1980s by Larry Wall
• Useful for bioinformatics and web development
• Support for objects
• Excellent integration of regular expressions (text handling language)
• Vast open source code library (http://cpan.org/)
• BioPerl (http://bioperl.org/)
• Easy to learn
• http://www.perl.org/
Slide credit: Lukas Mueller
Python
• Created by Guido van Rossum in 1989
• Very elegant language
• Biopython libraries
• The “new” popular language
• Many frameworks (Django for web, etc.)
Slide credit: Lukas Mueller
R
• Language designed for statistics
• Support for matrix calculations, graphics
• Expression analysis, Next-Gen sequence analysis, graphics, genome annotation statistics, phylogeny
• Interactive
Slide credit: Lukas Mueller
Databases
• Need to store and query data
• Biological data is highly structured
• Relational database systems
• Non-relational systems
Slide credit: Lukas Mueller
Thank you!!

More Related Content

What's hot

Next-generation sequencing from 2005 to 2020
Next-generation sequencing from 2005 to 2020Next-generation sequencing from 2005 to 2020
Next-generation sequencing from 2005 to 2020Christian Frech
 
Sequencing: The Next Generation
Sequencing: The Next GenerationSequencing: The Next Generation
Sequencing: The Next GenerationSurya Saha
 
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...Joseph Hughes
 
Next-generation sequencing and quality control: An Introduction (2016)
Next-generation sequencing and quality control: An Introduction (2016)Next-generation sequencing and quality control: An Introduction (2016)
Next-generation sequencing and quality control: An Introduction (2016)Sebastian Schmeier
 
Next-generation genomics: an integrative approach
Next-generation genomics: an integrative approachNext-generation genomics: an integrative approach
Next-generation genomics: an integrative approachHong ChangBum
 
Genome walking – a new strategy for identification of nucleotide sequence in ...
Genome walking – a new strategy for identification of nucleotide sequence in ...Genome walking – a new strategy for identification of nucleotide sequence in ...
Genome walking – a new strategy for identification of nucleotide sequence in ...Dr. Mukesh Chavan
 
Toolbox for bacterial population analysis using NGS
Toolbox for bacterial population analysis using NGSToolbox for bacterial population analysis using NGS
Toolbox for bacterial population analysis using NGSMirko Rossi
 
Thoughts on the recent announcements by Oxford Nanopore Technologies
Thoughts on the recent announcements by Oxford Nanopore TechnologiesThoughts on the recent announcements by Oxford Nanopore Technologies
Thoughts on the recent announcements by Oxford Nanopore TechnologiesKeith Bradnam
 
New High Throughput Sequencing technologies at the Norwegian Sequencing Centr...
New High Throughput Sequencing technologies at the Norwegian Sequencing Centr...New High Throughput Sequencing technologies at the Norwegian Sequencing Centr...
New High Throughput Sequencing technologies at the Norwegian Sequencing Centr...Lex Nederbragt
 
Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2Li Shen
 
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...EMC
 
CSU Next Generation Sequencing Core 06/09/2015
CSU Next Generation Sequencing Core 06/09/2015CSU Next Generation Sequencing Core 06/09/2015
CSU Next Generation Sequencing Core 06/09/2015Richard Casey
 
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016Prof. Wim Van Criekinge
 
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...Thermo Fisher Scientific
 
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...Thermo Fisher Scientific
 
BioChain Next Generation Sequencing Products
BioChain Next Generation Sequencing ProductsBioChain Next Generation Sequencing Products
BioChain Next Generation Sequencing Productsbiochain
 
The Next, Next Generation of Sequencing - From Semiconductor to Single Molecule
The Next, Next Generation of Sequencing - From Semiconductor to Single MoleculeThe Next, Next Generation of Sequencing - From Semiconductor to Single Molecule
The Next, Next Generation of Sequencing - From Semiconductor to Single MoleculeJustin Johnson
 

What's hot (20)

Next-generation sequencing from 2005 to 2020
Next-generation sequencing from 2005 to 2020Next-generation sequencing from 2005 to 2020
Next-generation sequencing from 2005 to 2020
 
Sequencing: The Next Generation
Sequencing: The Next GenerationSequencing: The Next Generation
Sequencing: The Next Generation
 
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
How to Standardise and Assemble Raw Data into Sequences: What Does it Mean fo...
 
Next-generation sequencing and quality control: An Introduction (2016)
Next-generation sequencing and quality control: An Introduction (2016)Next-generation sequencing and quality control: An Introduction (2016)
Next-generation sequencing and quality control: An Introduction (2016)
 
Next-generation genomics: an integrative approach
Next-generation genomics: an integrative approachNext-generation genomics: an integrative approach
Next-generation genomics: an integrative approach
 
Genome walking – a new strategy for identification of nucleotide sequence in ...
Genome walking – a new strategy for identification of nucleotide sequence in ...Genome walking – a new strategy for identification of nucleotide sequence in ...
Genome walking – a new strategy for identification of nucleotide sequence in ...
 
Ngs part i 2013
Ngs part i 2013Ngs part i 2013
Ngs part i 2013
 
Toolbox for bacterial population analysis using NGS
Toolbox for bacterial population analysis using NGSToolbox for bacterial population analysis using NGS
Toolbox for bacterial population analysis using NGS
 
Thoughts on the recent announcements by Oxford Nanopore Technologies
Thoughts on the recent announcements by Oxford Nanopore TechnologiesThoughts on the recent announcements by Oxford Nanopore Technologies
Thoughts on the recent announcements by Oxford Nanopore Technologies
 
Hamas 1
Hamas 1Hamas 1
Hamas 1
 
New High Throughput Sequencing technologies at the Norwegian Sequencing Centr...
New High Throughput Sequencing technologies at the Norwegian Sequencing Centr...New High Throughput Sequencing technologies at the Norwegian Sequencing Centr...
New High Throughput Sequencing technologies at the Norwegian Sequencing Centr...
 
Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2Bioinfo ngs data format visualization v2
Bioinfo ngs data format visualization v2
 
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
 
CSU Next Generation Sequencing Core 06/09/2015
CSU Next Generation Sequencing Core 06/09/2015CSU Next Generation Sequencing Core 06/09/2015
CSU Next Generation Sequencing Core 06/09/2015
 
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
 
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
 
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
 
BioChain Next Generation Sequencing Products
BioChain Next Generation Sequencing ProductsBioChain Next Generation Sequencing Products
BioChain Next Generation Sequencing Products
 
The Next, Next Generation of Sequencing - From Semiconductor to Single Molecule
The Next, Next Generation of Sequencing - From Semiconductor to Single MoleculeThe Next, Next Generation of Sequencing - From Semiconductor to Single Molecule
The Next, Next Generation of Sequencing - From Semiconductor to Single Molecule
 
Ngs intro_v6_public
 Ngs intro_v6_public Ngs intro_v6_public
Ngs intro_v6_public
 

Viewers also liked

الفيزياء الحيوية2
الفيزياء الحيوية2الفيزياء الحيوية2
الفيزياء الحيوية2Biophysics2014
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsDuncan Hull
 
ANALYTICAL TECHNIQUES IN BIOCHEMISTRY AND BIOPHYSICS FOR MACRO MOLECULES
ANALYTICAL TECHNIQUES IN BIOCHEMISTRY AND BIOPHYSICS FOR MACRO  MOLECULES ANALYTICAL TECHNIQUES IN BIOCHEMISTRY AND BIOPHYSICS FOR MACRO  MOLECULES
ANALYTICAL TECHNIQUES IN BIOCHEMISTRY AND BIOPHYSICS FOR MACRO MOLECULES Arunima Sur
 
B.sc biochem i bobi u 3.1 sequence alignment
B.sc biochem i bobi u 3.1 sequence alignmentB.sc biochem i bobi u 3.1 sequence alignment
B.sc biochem i bobi u 3.1 sequence alignmentRai University
 
Biophysics -diffusion,osmosis,osmotic pressure,dialysis
Biophysics -diffusion,osmosis,osmotic pressure,dialysisBiophysics -diffusion,osmosis,osmotic pressure,dialysis
Biophysics -diffusion,osmosis,osmotic pressure,dialysisDr.Rittu Chandel MBBS, MD
 
Sequence Alignment In Bioinformatics
Sequence Alignment In BioinformaticsSequence Alignment In Bioinformatics
Sequence Alignment In BioinformaticsNikesh Narayanan
 
Bioinformatics
BioinformaticsBioinformatics
BioinformaticsJTADrexel
 

Viewers also liked (8)

Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
الفيزياء الحيوية2
الفيزياء الحيوية2الفيزياء الحيوية2
الفيزياء الحيوية2
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of Bioinformatics
 
ANALYTICAL TECHNIQUES IN BIOCHEMISTRY AND BIOPHYSICS FOR MACRO MOLECULES
ANALYTICAL TECHNIQUES IN BIOCHEMISTRY AND BIOPHYSICS FOR MACRO  MOLECULES ANALYTICAL TECHNIQUES IN BIOCHEMISTRY AND BIOPHYSICS FOR MACRO  MOLECULES
ANALYTICAL TECHNIQUES IN BIOCHEMISTRY AND BIOPHYSICS FOR MACRO MOLECULES
 
B.sc biochem i bobi u 3.1 sequence alignment
B.sc biochem i bobi u 3.1 sequence alignmentB.sc biochem i bobi u 3.1 sequence alignment
B.sc biochem i bobi u 3.1 sequence alignment
 
Biophysics -diffusion,osmosis,osmotic pressure,dialysis
Biophysics -diffusion,osmosis,osmotic pressure,dialysisBiophysics -diffusion,osmosis,osmotic pressure,dialysis
Biophysics -diffusion,osmosis,osmotic pressure,dialysis
 
Sequence Alignment In Bioinformatics
Sequence Alignment In BioinformaticsSequence Alignment In Bioinformatics
Sequence Alignment In Bioinformatics
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 

Similar to Sequencing and Bioinformatics PGRP Summer 2015

ScilabTEC 2015 - Bavarian Center for Agriculture
ScilabTEC 2015 - Bavarian Center for AgricultureScilabTEC 2015 - Bavarian Center for Agriculture
ScilabTEC 2015 - Bavarian Center for AgricultureScilab
 
Cool Genes: The Search for a Cure Using Genomics, Big Data, and Docker - Jame...
Cool Genes: The Search for a Cure Using Genomics, Big Data, and Docker - Jame...Cool Genes: The Search for a Cure Using Genomics, Big Data, and Docker - Jame...
Cool Genes: The Search for a Cure Using Genomics, Big Data, and Docker - Jame...Docker, Inc.
 
Protein-Protein Interaction Prediction Based on Template-Based and de Novo Do...
Protein-Protein Interaction Prediction Based on Template-Based and de Novo Do...Protein-Protein Interaction Prediction Based on Template-Based and de Novo Do...
Protein-Protein Interaction Prediction Based on Template-Based and de Novo Do...Masahito Ohue
 
The Life of a Packet through Istio - DevExperience Romania, April 2019
The Life of a Packet through Istio - DevExperience Romania, April 2019The Life of a Packet through Istio - DevExperience Romania, April 2019
The Life of a Packet through Istio - DevExperience Romania, April 2019Matt Turner
 
How we've made a global search engine for genetic data
How we've made a global search engine for genetic dataHow we've made a global search engine for genetic data
How we've made a global search engine for genetic dataMiro Cupak
 
ICAR 2015 Workshop - Nick Provart
ICAR 2015 Workshop - Nick ProvartICAR 2015 Workshop - Nick Provart
ICAR 2015 Workshop - Nick ProvartAraport
 
Oscillation Compensating Dynamic Adaptive Streaming over HTTP
Oscillation Compensating Dynamic Adaptive Streaming over HTTPOscillation Compensating Dynamic Adaptive Streaming over HTTP
Oscillation Compensating Dynamic Adaptive Streaming over HTTPAlpen-Adria-Universität
 
fastp: the FASTQ pre-processor
fastp: the FASTQ pre-processorfastp: the FASTQ pre-processor
fastp: the FASTQ pre-processorHoffman Lab
 
Workflows supporting drug discovery against malaria
Workflows supporting drug discovery against malariaWorkflows supporting drug discovery against malaria
Workflows supporting drug discovery against malariaBarry Hardy
 
Platforms for mAb Commercialization
Platforms for mAb Commercialization Platforms for mAb Commercialization
Platforms for mAb Commercialization KBI Biopharma
 
An open source framework for processing daily satellite images (AVHRR) over l...
An open source framework for processing daily satellite images (AVHRR) over l...An open source framework for processing daily satellite images (AVHRR) over l...
An open source framework for processing daily satellite images (AVHRR) over l...Sajid Pareeth
 
CIP Genebank IT Platform
CIP Genebank IT PlatformCIP Genebank IT Platform
CIP Genebank IT PlatformEdwin Rojas
 

Sequencing and Bioinformatics PGRP Summer 2015

  • 1. Surya Saha, Sol Genomics Network (SGN), Boyce Thompson Institute, Ithaca, NY. ss2489@cornell.edu // @SahaSurya. BTI PGRP Internship Program 2015. http://www.acgt.me/blog/2015/3/7/next-generation-sequencing-must-die
  • 2. Hello Experiment! • Experimental design for survey: sample size, locations, phenotypes. Image: Early Blight infected tomato plants, http://www.longislandhort.cornell.edu/vegpath/photos/early_blight.htm. 6/11/2015 BTI PGRP Summer Internship Program 2015
  • 3. Hello Experiment! • Experimental design for survey: sample size, locations, phenotypes • Experimental design to identify genetic differences: PCR-based (Simple Sequence Repeats, other markers); sequencing-based (genes of interest, Single Nucleotide Polymorphisms, gene expression, Genotyping by Sequencing). Image: Early Blight infected tomato plants, http://www.longislandhort.cornell.edu/vegpath/photos/early_blight.htm
  • 4. Why Sequencing? • Targeted interrogation of genome • Economical • Technological developments • High-throughput assays • But requires subsequent validation
  • 6. Sequencing: Then and Now. Timeline: 1953 DNA structure discovery; 1977 Sanger DNA sequencing by chain-terminating inhibitors; 1984 Epstein-Barr virus (170 Kb); 1987 ABI 370 sequencer; 1995 Haemophilus influenzae (1.83 Mb); 2001 Homo sapiens (3.0 Gb); 2005 to 2014 (year markers 2005, 2007, 2011, 2012, 2013, 2014): 454, Solexa, SOLiD, Illumina, Ion Torrent, PacBio, Illumina HiSeq X, Nanopore MinION, Pinus taeda (24 Gb). Slide design credit: Aureliano Bombarely
  • 7. First generation sequencing. Sanger. Annu Rev Biochem. 1988;57:1-28. Thanks to Nick Loman for the mention
  • 8. Maxam-Gilbert method (1973)
  • 9. Maxam-Gilbert method (1973). http://en.wikipedia.org/wiki/File:Maxam-Gilbert_sequencing_en.svg https://www.nationaldiagnostics.com/electrophoresis/article/maxam-gilbert-sequencing
  • 10. Sanger method (1977). Frederick Sanger, 13 Aug 1918 – 19 Nov 2013. Won the Nobel Prize for Chemistry in 1958 and 1980. Published the dideoxy chain termination method or “Sanger method” in 1977. http://dailym.ai/1f1XeTB
  • 11. Sanger method (1977). http://en.wikipedia.org/wiki/File:Sanger-sequencing.svg http://en.wikipedia.org/wiki/File:Radioactive_Fluorescent_Seq.jpg
  • 12. First generation sequencing • Very high quality sequences (99.999% or Q50) • Very low throughput. Capillary Sequencing (ABI3730xl): run time 20m-3h; read length 400-900 bp; reads per run 96 or 384; total nucleotides sequenced 1.9-84 Kb; cost per MB $2400. http://www.hindawi.com/journals/bmri/2012/251364/tab1/
  • 13. Next generation sequencing
  • 14. https://twitter.com/kbradnam/status/443153578429923328 • Second generation • Third generation • Fourth generation • Next-next-generation • Next-next-next generation. http://www.acgt.me/blog/2015/3/10/next-generation-sequencing-must-diepart-2
  • 15. Mention the specific technology used to generate the data – Illumina HiSeq/MiSeq/NextSeq – Pacific Biosciences RS1/RSII – Ion Torrent Proton/PGM – SOLiD – Oxford Nanopore MinION. http://www.acgt.me/blog/2015/3/10/next-generation-sequencing-must-diepart-2
  • 16. 454 Pyrosequencing: One purified DNA fragment, to one bead, to one read. GS FLX Titanium. http://www.genengnews.com/ https://mariamuir.com/wp-content/uploads/2013/04/rip.gif
  • 17. Illumina. Columns: four Illumina instruments (labels surviving from the slide: 500; 2500, 3000, 4000). Output: 0.3-15 Gb | 20-120 GB | 10-1500 GB | 900-1800 GB. Number of reads per flow cell: 25 million | 130-400 million | 300 million to 2.5 billion | 3 billion. Read length: 2x300 bp | 2x150 bp | 2x250 - 2x125 bp | 2x150 bp. Cost: $99K | $250K | $740K | $10M (10 units). Source: Illumina
  • 18. Illumina: same comparison table as slide 17, with the added question: $1000 human genome??
  • 19. Illumina. Mardis. Annu. Rev. Genomics Hum. Genet. 2008;9:387–402
  • 20. Illumina. Mardis. Annu. Rev. Genomics Hum. Genet. 2008;9:387–402
  • 21. Illumina: TruSeq Long Read. Voskoboynik. eLife 2013;2:e00569
  • 22. Pacific Biosciences SMRT sequencing: Single Molecule Real Time sequencing. http://smrt.med.cornell.edu/images/pacbio_library_prep-1.gif
  • 23. Pacific Biosciences SMRT sequencing: Error correction methods. Hierarchical genome-assembly process (HGAP); PBJelly (English et al., PLOS One, 2012)
  • 24. Pacific Biosciences SMRT sequencing: Error correction methods. PBcR pipeline
  • 25. Pacific Biosciences SMRT sequencing: Read Lengths. Mean Read Length: 8391 bp. Maximum Subread Length: 24585 bp. http://www.igs.umaryland.edu/labs/grc/
  • 26. Pacific Biosciences SMRT sequencing: Read Lengths
  • 27. Genome Assembly with Long Reads
  • 28. Oxford Nanopore. https://www.nanoporetech.com/ http://erlichya.tumblr.com/post/66376172948/hands-on-experience-with-oxford-nanopore-minion http://halegrafx.com/vector-art/free-vector-despicable-me-minions/
  • 29. Oxford Nanopore
  • 30. Oxford Nanopore. https://theconversation.com/how-a-small-backpack-for-fast-genomic-sequencing-is-helping-combat-ebola-41863
  • 32. Sequencing Trends. https://www.google.com/trends/
  • 33. Charts by platform (Illumina, Pacific Biosciences, Roche 454, Ion Torrent): Number of Publications, 2008-2014; Increase in Number of Publications, 2009-2014; % Increase in Number of Publications, 2009-2014 (Pacific Biosciences, Roche 454, Ion Torrent only)
  • 34. Hi-C Crosslinking
  • 35. Others • Ion Torrent Proton/PGM • SOLiD • Helicos • Supporting technologies – BioNano – Nabsys – OpGen – 10X Genomics – Fluidigm
  • 36. Comparison
  • 37. Next generation sequencing (run time; read length; quality; total nucleotides sequenced; cost/MB). 454 Pyrosequencing: 24h; 700 bp; Q20-Q30; 1 GB; $10. Illumina MiSeq: 27h; 2x300 bp; >Q30; 15 GB; $0.15. Illumina HiSeq 2500: 1-10 days; 2x250 bp; >Q30; 3000 GB; $0.05. Ion Torrent: 2h; 400 bp; >Q20; 50 MB-1 GB; $1. Pacific Biosciences: 30m-4h; 10 kb to >40 kb; >Q50 consensus, >Q10 single; 500-1000 MB per SMRT cell; $0.13-$0.60. http://www.hindawi.com/journals/bmri/2012/251364/ http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431227
  • 38. Next Generation Genomics: World Map of High-throughput Sequencers. http://omicsmaps.com/
  • 39. https://flxlexblog.wordpress.com/2014/06/11/developments-in-next-generation-sequencing-june-2014-edition/
  • 40. https://flxlexblog.wordpress.com/2014/06/11/developments-in-next-generation-sequencing-june-2014-edition/
  • 41. Real cost of Sequencing!! Sboner, Genome Biology, 2011
  • 42. Sequencing Data and Concepts
  • 43. Library Types • Single end • Paired end (PE, 150-800 bp, Fwd: /1, Rev: /2) • Mate pair (MP, 2 Kb to 20 Kb). Diagram labels: forward (F) and reverse (R) read orientations; 454/Roche, Illumina. Slide credit: Aureliano Bombarely
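The /1 and /2 read-name suffixes on slide 43 can be used to bring mates back together. The following is a minimal sketch, assuming read names end in exactly "/1" (forward) or "/2" (reverse); the function name `pair_reads` and the example read names are hypothetical, and real pipelines usually rely on file order or on tools such as those listed later in the deck.

```python
def pair_reads(read_ids):
    """Group paired-end read names by fragment, using the /1 and /2 suffixes.

    Returns {fragment_name: (forward_id_or_None, reverse_id_or_None)}.
    """
    pairs = {}
    for rid in read_ids:
        if rid.endswith("/1"):
            slot = 0  # forward mate
        elif rid.endswith("/2"):
            slot = 1  # reverse mate
        else:
            continue  # not a paired-end style name; skipped in this sketch
        mates = pairs.setdefault(rid[:-2], [None, None])
        mates[slot] = rid
    return {frag: tuple(mates) for frag, mates in pairs.items()}


if __name__ == "__main__":
    # "frag2" has lost its reverse mate, so its second slot stays None.
    print(pair_reads(["frag1/1", "frag2/1", "frag1/2"]))
```

Fragments with a missing mate are kept with a `None` slot so that orphaned reads stay visible rather than being dropped silently.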
  • 44. Implications of Choice of Library. Diagram: reads are assembled into a consensus sequence (contig); pair read information links contigs into a scaffold (or supercontig), with gaps shown as NNNNN; genetic information (markers) or optical maps order scaffolds into a pseudomolecule (or ultracontig). Slide credit: Aureliano Bombarely
  • 45. Multiplexing Libraries: Use of different tags (4-6 nucleotides) to identify different samples in the same lane/sector. Diagram: fragments carrying the tags AGTCGT and TGAGCA are pooled for sequencing. Slide credit: Aureliano Bombarely
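After sequencing, the pooled reads have to be sorted back into samples by their tags. Here is a minimal sketch of that step, assuming the tag sits at the very start of each read, all tags have the same length, and only exact matches count. The sample names and the `demultiplex` function are hypothetical; the two tags are the ones shown on the slide.

```python
from collections import defaultdict

# Hypothetical sample sheet: tag -> sample name.
BARCODES = {"AGTCGT": "sample_A", "TGAGCA": "sample_B"}


def demultiplex(reads, barcodes=BARCODES):
    """Split reads into per-sample lists by their leading tag.

    The tag is trimmed from assigned reads; reads with an unknown tag are
    kept untrimmed under the key "undetermined".
    """
    tag_len = len(next(iter(barcodes)))  # assumes equal-length tags
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        if tag in barcodes:
            bins[barcodes[tag]].append(insert)
        else:
            bins["undetermined"].append(read)
    return dict(bins)


if __name__ == "__main__":
    pooled = ["AGTCGTACGTAC", "TGAGCATTTTGG", "NNNNNNAAAACC"]
    for sample, seqs in demultiplex(pooled).items():
        print(sample, seqs)
```

Exact matching keeps the sketch short; production demultiplexers typically tolerate one or more mismatches in the tag.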
  • 46. File Formats. Fasta files: It is a text-based format for representing either nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. -Wikipedia. Slide credit: Aureliano Bombarely
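To make the FASTA description concrete, here is a minimal reader sketch. It assumes the usual convention that each record starts with a ">" header line followed by one or more sequence lines; the function name `parse_fasta` and the example records are illustrative only.

```python
def parse_fasta(lines):
    """Return {record_id: sequence} from an iterable of FASTA text lines.

    The record id is the first whitespace-delimited word after ">".
    Sequence lines belonging to one record are concatenated.
    """
    records = {}
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


if __name__ == "__main__":
    example = [">gene1 example nucleotide record", "ATGGCC", "TTAG",
               ">pep1 example peptide record", "MKV"]
    print(parse_fasta(example))
```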
  • 47. File Formats. Fastq files: FASTQ format is a text-based format for storing both a biological sequence (usually nucleotide sequence) and its corresponding quality scores. -Wikipedia • Single line ID with at symbol (“@”) in the first column. • Sequences can be in multiple lines after the ID line • Single line with plus symbol (“+”) in the first column to represent the quality line. • Quality ID line may contain ID • Quality values are in multiple lines after the + line but length is identical to sequence. Slide credit: Aureliano Bombarely
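The FASTQ rules listed on slide 47 can be sketched as a small reader. This is an illustration under those stated rules, not a reference parser: the function name `parse_fastq` is hypothetical, and the quality block is read by length (until it matches the sequence length) because quality strings may themselves begin with "@" or "+".

```python
def parse_fastq(lines):
    """Return a list of (read_id, sequence, quality) tuples.

    Handles sequences and qualities wrapped over multiple lines.
    """
    it = (ln.rstrip("\n") for ln in lines)
    records = []
    for header in it:
        if not header:
            continue  # skip blank lines between records
        if not header.startswith("@"):
            raise ValueError("expected '@' ID line, got: " + header)
        # Sequence lines run until the '+' separator line.
        seq_parts = []
        for line in it:
            if line.startswith("+"):
                break
            seq_parts.append(line)
        else:
            raise ValueError("missing '+' line for " + header)
        seq = "".join(seq_parts)
        # Quality lines run until their total length equals the sequence length.
        qual_parts, n = [], 0
        while n < len(seq):
            line = next(it, None)
            if line is None:
                raise ValueError("truncated quality for " + header)
            qual_parts.append(line)
            n += len(line)
        qual = "".join(qual_parts)
        if len(qual) != len(seq):
            raise ValueError("quality length differs from sequence length")
        records.append((header[1:], seq, qual))
    return records


if __name__ == "__main__":
    example = ["@read1", "ACGT", "AC", "+", "IIII", "@I",
               "@read2", "GG", "+read2", "!!"]
    for rec in parse_fastq(example):
        print(rec)
```

In the example, the second quality line of `read1` starts with "@", which a line-prefix based parser would mistake for a new record.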
  • 48. Quality control: Encoding. Fastq files: !"#$%&'()*+,-./0123456789 Offset by 33 (Phred+33) KLMNOPQRSTUVWXYZ[]^_`abcdefgh Offset by 64 (Phred+64)
  • 49. Quality control: Encoding. !"#$%&'()*+,-./0123456789 Offset by 33 (Phred+33) KLMNOPQRSTUVWXYZ[]^_`abcdefgh Offset by 64 (Phred+64)
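The two offsets mean that each quality character stores a score as its ASCII code minus 33 or minus 64. A minimal decoding sketch follows; the function name `decode_quality` is hypothetical, and the caller is assumed to know which encoding the file uses.

```python
def decode_quality(qual, offset=33):
    """Convert a FASTQ quality string to a list of integer Phred scores.

    offset=33 for Phred+33 files, offset=64 for Phred+64 files.
    """
    scores = [ord(ch) - offset for ch in qual]
    if any(s < 0 for s in scores):
        raise ValueError("character below the chosen offset; wrong encoding?")
    return scores


if __name__ == "__main__":
    print(decode_quality("!5I", offset=33))  # Phred+33 characters
    print(decode_quality("@Th", offset=64))  # Phred+64 characters
```

Decoding a Phred+33 string with offset 64 produces negative scores, which is why the sketch raises an error instead of returning them.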
  • 50. Quality control: Encoding. Phred score of a base is: Q_phred = -10 × log10(e), where e is the estimated probability of a base being wrong. http://en.wikipedia.org/wiki/Phred_quality_score
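The Phred formula can be applied in both directions. The sketch below converts a score to an error probability and back; the function names are illustrative. It also ties back to the earlier capillary sequencing slide, where 99.999% accuracy (an error probability of 0.00001) is quoted as Q50.

```python
import math


def phred_to_error(q):
    """Error probability e for a Phred score q, from q = -10 * log10(e)."""
    return 10 ** (-q / 10)


def error_to_phred(e):
    """Phred score for an error probability e (0 < e <= 1)."""
    return -10 * math.log10(e)


if __name__ == "__main__":
    for q in (10, 20, 30, 50):
        e = phred_to_error(q)
        print(f"Q{q}: error probability {e:g}, accuracy {100 * (1 - e):.3f}%")
```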
  • 51. Pre-processing: Tools Trimming • FastQC • FASTX toolkit • Trimmomatic • Scythe Joining paired-end reads • fastq-join • FLASH • PANDAseq 6/11/2015 51BTI PGRP SummerInternshipProgram2015
  • 52. Sequencing done! Now What??
  • 53. Sequencing done! Now What?? • One HiSeq run can produce up to 1,500 GB (1.5 TB) of data • How much is 250 GB of data? – 250,000,000,000 characters – 3,000 characters per sheet – 100 sheets/cm – Stack of ~8,000 m (Mount Everest: 8,848 m)
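The paper-stack estimate can be reproduced with a few lines of arithmetic. The sketch assumes one character per byte and uses decimal gigabytes (10^9 bytes), which matches the character count on the slide.

```python
DATA_CHARS = 250 * 10**9     # 250 GB, one character per byte
CHARS_PER_SHEET = 3000
SHEETS_PER_CM = 100

sheets = DATA_CHARS / CHARS_PER_SHEET
stack_height_cm = sheets / SHEETS_PER_CM
stack_height_m = stack_height_cm / 100   # 100 cm per metre

print(f"{sheets:,.0f} sheets, stack of about {stack_height_m:,.0f} m")
```

The result is roughly 8,300 m, which the slide rounds to ~8,000 m, a little short of Mount Everest's 8,848 m.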
  • 54. Increase in Sequencing Data. L. Stein, Genome Biology, 2010. Slide credit: Lukas Mueller
  • 55. Big Data. Slide credit: Lukas Mueller
  • 56. High Performance Computing: Powerful servers with large amounts of memory, compute cores, and disk
  • 57. What is bioinformatics? • Bioinformatics /baɪ.oʊˌɪnfərˈmætɪks/ is the application of computer science and information technology to the field of biology and medicine. Slide credit: Lukas Mueller
  • 58. Bioinformatics deals with • Algorithms, databases and information systems, web technologies, artificial intelligence and soft computing, information and computation theory, software engineering, data mining, image processing, modeling and simulation, signal processing, discrete mathematics, control and system theory, circuit theory, and statistics. • Generation of new knowledge in biology and medicine, and improving and discovering new models of computation (e.g. DNA computing, neural computing, evolutionary computing, immuno-computing, swarm-computing, cellular-computing). Slide credit: Lukas Mueller
  • 59. Bioinformatics can... • Identify similar sequences • Provide a putative function for a sequence • Assemble sequences (genomes, transcriptomes) • Annotate genomes • Identify differentially expressed genes • Build networks of genes or metabolites • Determine phylogenetic relationships • Mine literature for biological information • Uncover differences between two genomes • Calculate how a protein folds. Slide credit: Lukas Mueller
  • 60. What can bioinformatics do for me? • Majority of projects involve large datasets • Speed up your research • Enable you to ask new questions • Basic knowledge of bioinformatics needed • Extract information • Transform information • Run analyses • Build hypotheses, etc. Slide credit: Lukas Mueller
  • 62. Linux • Unix-like, free and open source operating system • Very stable, easy to use • Created by Linus Torvalds in 1991 as a student • Adopted for most bioinformatics work • Also installed on cell phones, laptops, desktops, clusters, supercomputers • Can run on your computer! • Virtualized or native. http://www.linux-netbook.com/linux/distributions/ Slide credit: Lukas Mueller
  • 64. Further Reading: Plant Bioinformatics Course • Virtual machine setup instructions • Slides for Linux, Sequencing, RNAseq, NGS Read Mapping and R graphics • http://btiplantbioinfocourse.wordpress.com Slide credit: Lukas Mueller
  • 65. Scripting • Scripts: Small programs written by the end user that control the execution of other programs or perform a simple algorithm • Extremely flexible • Written in Shell, Perl, Python • You can write them yourself! Slide credit: Lukas Mueller
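As an illustration of a script that performs a simple algorithm, here is a short Python example that computes the GC content of a DNA sequence. The function name and the test sequence are made up for this sketch.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0  # avoid dividing by zero on empty input
    gc_count = sequence.count("G") + sequence.count("C")
    return gc_count / len(sequence)

fraction = gc_content("ATGCGC")
print(f"GC content: {fraction:.1%}")
```

Saved to a file, a script like this can be run from the Linux command line and chained with other tools, which is the typical way such small programs are used.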
  • 66. Perl • Developed by Larry Wall since 1987 • Useful for bioinformatics and web development • Support for objects • Excellent integration of regular expressions (text handling language) • Vast open source code library (http://cpan.org/) • BioPerl (http://bioperl.org/) • Easy to learn • http://www.perl.org/ Slide credit: Lukas Mueller
  • 67. Python • Created by Guido van Rossum in 1989 • Very elegant language • Biopython libraries • The "new" popular language • Many frameworks (Django for web, etc.) Slide credit: Lukas Mueller
  • 68. R • Language designed for statistics • Support for matrix calculations, graphics • Expression analysis, Next-Gen sequence analysis, graphics, genome annotation statistics, phylogeny • Interactive. Slide credit: Lukas Mueller
  • 69. Databases • Need to store and query data • Biological data is highly structured • Relational database systems • Non-relational systems. Slide credit: Lukas Mueller
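Storing and querying structured data in a relational system can be sketched with Python's built-in `sqlite3` module. The `gene` table, its columns, and the rows below are invented purely for illustration.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene (gene_id TEXT PRIMARY KEY, species TEXT, length_bp INTEGER)"
)
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?)",
    [
        ("geneA", "tomato", 1500),
        ("geneB", "tomato", 800),
        ("geneC", "potato", 2000),
        ("geneD", "tomato", 3200),
    ],
)

# Query: tomato genes longer than 1,000 bp, sorted by ID.
rows = conn.execute(
    "SELECT gene_id FROM gene WHERE species = ? AND length_bp > ? ORDER BY gene_id",
    ("tomato", 1000),
).fetchall()
conn.close()
```

The `?` placeholders let the database driver handle quoting, which is safer than building the SQL string by hand.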
  • 70. Thank you!!