De Bruijn graphs
What are de Bruijn graphs?
A de Bruijn graph is a directed graph representing overlaps between sequences of symbols.
History
• Nicolaas de Bruijn was the Dutch mathematician who introduced these graphs.
• He was interested in purely abstract problems.
• He tried to solve the so-called universal string problem.
• Universal string problem:
Finding a circular (cyclic) string that contains each binary k-mer exactly once.
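A small sketch of the universal string condition. The function name is_universal_circular and the example string 00011101 (k = 3) are illustrative choices, not taken from the slides.

```python
from itertools import product


def is_universal_circular(s, k, alphabet="01"):
    """Return True if the circular string s contains every k-mer exactly once."""
    n = len(s)
    if n != len(alphabet) ** k:
        return False
    # Wrap around the circle by appending the first k-1 symbols to the end.
    wrapped = s + s[: k - 1]
    seen = [wrapped[i : i + k] for i in range(n)]
    expected = {"".join(p) for p in product(alphabet, repeat=k)}
    # n windows that cover all n possible k-mers means each occurs exactly once.
    return set(seen) == expected


print(is_universal_circular("00011101", 3))  # True
```

Read in a circle, 00011101 yields 000, 001, 011, 111, 110, 101, 010, 100, which are all eight binary 3-mers.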
Ways of traversing de Bruijn graphs
• Eulerian path: visits every edge exactly once
• Hamiltonian path: visits every node exactly once
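A minimal sketch of finding an Eulerian path with Hierholzer's algorithm, which is one standard method and not necessarily the one the slides intend. The function name eulerian_path is hypothetical, and the code assumes an Eulerian path exists. The example edges come from the 3-mers of TAATGCCATGGGA.

```python
from collections import defaultdict


def eulerian_path(edges):
    """Return a list of nodes that uses every directed edge exactly once.

    Assumption: such a path exists in the graph.
    """
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree
    for src, dst in edges:
        graph[src].append(dst)
        balance[src] += 1
        balance[dst] -= 1

    # Start at the node with one extra outgoing edge, else at the first edge.
    start = next((n for n, b in balance.items() if b == 1), edges[0][0])

    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is final in the path
    return path[::-1]


kmers = ["TAA", "ATG", "ATG", "CAT", "CCA", "GCC", "TGC", "AAT", "TGG", "GGG", "GGA"]
edges = [(kmer[:-1], kmer[1:]) for kmer in kmers]
path = eulerian_path(edges)
print(path[0] + "".join(node[-1] for node in path[1:]))  # TAATGCCATGGGA
```

For this small graph the Eulerian path is unique, so the printed string matches the original sequence. In general a graph can have several Eulerian paths.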
[Diagram: information, 3 steps, de Bruijn graph, directed graph]
INPUT/OUTPUT
Input:
A set of k-mer patterns.
Output:
The de Bruijn graph.
K-mers
• All possible substrings of length k from a read obtained through DNA sequencing
Example: the string 001110000101 split into small chunks: 001, 110, 000, 101
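A short sketch of the difference between all k-mers of a read (overlapping windows) and the non-overlapping chunks shown in the example. The function names all_kmers and chunks are illustrative.

```python
def all_kmers(read, k):
    """Return every length-k substring of read, in order of position."""
    return [read[i : i + k] for i in range(len(read) - k + 1)]


def chunks(read, k):
    """Return consecutive non-overlapping pieces of length k."""
    return [read[i : i + k] for i in range(0, len(read), k)]


read = "001110000101"
print(chunks(read, 3))          # ['001', '110', '000', '101']
print(len(all_kmers(read, 3)))  # 10
```

A read of length n has n - k + 1 overlapping k-mers, so the 12-symbol string gives ten 3-mers but only four chunks.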
INPUT
Whole sequence:
If the whole sequence is given, we have to divide it into k-mers.
K-mers
1st Step
The first step is to choose a k-mer size and, if the whole sequence is given, split it into its k-mer components.
Whole sequence: TAATGCCATGGGA
K-mers (k = 3): TAA, ATG, ATG, CAT, CCA, GCC, TGC, AAT, TGG, GGG, GGA
Construction of graph
• Create nodes.
• Build edges.
• Create directions.
Nodes
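• We treat each k-mer as an edge.
• Each edge connects 2 nodes.
• First node: the prefix of the edge (its first k-1 symbols)
• Second node: the suffix of the edge (its last k-1 symbols)
Example: the edge TAA connects the nodes TA → AA.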
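The prefix and suffix rule can be written as a one-line helper. The name edge_nodes is illustrative.

```python
def edge_nodes(kmer):
    """Return the (prefix, suffix) nodes of the edge labelled by kmer."""
    return kmer[:-1], kmer[1:]


for kmer in ["TAA", "TGC", "ATG", "AAT"]:
    prefix, suffix = edge_nodes(kmer)
    print(f"{kmer}: {prefix} -> {suffix}")
```

Both nodes have length k-1, so a 3-mer always gives two 2-letter nodes.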
N-1
K-mers of TAATGCCATGGGA: TAA, ATG, ATG, CAT, CCA, GCC, TGC, AAT, TGG, GGG, GGA
Edges with their nodes:
TAA: TA → AA
TGC: TG → GC
ATG: AT → TG
AAT: AA → AT
Highlight the similar nodes (nodes with the same label), such as TG in TG → GC and AT → TG.
Bring the similar nodes closer.
Glue the similar nodes into one.
After gluing, the edges TAA, AAT, ATG and TGC form the chain TA → AA → AT → TG → GC.
2nd Step
Then a directed graph is constructed by connecting pairs of k-mers in which the last k-1 nucleotides of one k-mer overlap the first k-1 nucleotides of the other.
For all k-mers:
Glue the similar nodes again.
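A minimal sketch of the gluing step for all k-mers, assuming the graph is stored as a dictionary from node label to a list of successor nodes. The function name de_bruijn_graph is illustrative. Using the (k-1)-mer label as the dictionary key is what glues identical nodes together.

```python
from collections import defaultdict


def de_bruijn_graph(kmers):
    """Build an adjacency list: prefix node -> list of suffix nodes.

    Nodes with the same label share one dictionary key, so they are glued.
    Repeated k-mers are kept as repeated edges.
    """
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


kmers = ["TAA", "ATG", "ATG", "CAT", "CCA", "GCC", "TGC", "AAT", "TGG", "GGG", "GGA"]
graph = de_bruijn_graph(kmers)
for node, successors in graph.items():
    print(node, "->", ", ".join(successors))
```

The k-mer ATG occurs twice, so the node AT has two edges to TG, and GGG becomes a loop from GG to itself.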
Construction of genome
• Use the whole sequence of the first edge.
• Use only the last letter of each following edge.
• Build the genome.
TAATGCCATGGGA
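The rule above can be sketched as follows, assuming the edges are already given in path order. The function name spell_genome and the list ordered_edges are illustrative.

```python
def spell_genome(path_edges):
    """Spell a sequence from k-mer edges listed in path order."""
    genome = path_edges[0]    # whole sequence of the first edge
    for edge in path_edges[1:]:
        genome += edge[-1]    # only the last letter of each following edge
    return genome


ordered_edges = ["TAA", "AAT", "ATG", "TGC", "GCC", "CCA",
                 "CAT", "ATG", "TGG", "GGG", "GGA"]
print(spell_genome(ordered_edges))  # TAATGCCATGGGA
```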
De Bruijn graph and genome assembly
• Genome assembly: the genome sequence produced after chromosomes have been fragmented, those fragments have been sequenced, and the resulting sequences have been put back together.
De Bruijn graph and NGS
• Overlap-based assembly was mostly used for Sanger reads
• NGS instruments rapidly and inexpensively decode the sequences of millions of small fragments
• Read size remained short
• With shorter reads, it became more common to find spurious overlaps between read pairs
• As reads get shorter and shorter, the characteristics of the genome are collectively lost
De Bruijn graph based genome assembly algorithm
Drawbacks
• De Bruijn graphs do not preserve positional information.
• The longer the reads, the more information is lost when they are split into k-mers.
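A small illustration of the lost positional information: two different sequences can have exactly the same k-mers and therefore the same de Bruijn graph. The sequences ATGATCAT and ATCATGAT are made up for this example, and the function name kmer_counts is illustrative.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every length-k substring of seq."""
    return Counter(seq[i : i + k] for i in range(len(seq) - k + 1))


a = "ATGATCAT"
b = "ATCATGAT"
print(a == b)                                # False
print(kmer_counts(a, 3) == kmer_counts(b, 3))  # True
```

Both sequences break into ATG, TGA, GAT, ATC, TCA and CAT, so the 3-mers alone cannot tell which order is correct.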
Any questions?
