SlideShare a Scribd company logo
Gene Duplication
and Read Mapping
Lecture – 7
Department of CSE, DIU
CONTENTS
1. Mutation
2. Gene Duplication
3. Read Mapping
- Keyword Tree
- Suffix Tree
- Suffix Array
- Burrows Wheeler Transform
1. DNA Mutation
What and how mutation occurs, common forms
Mutation
DNA Mutation refers to sudden, random changes
in DNA sequences which leads to different
phenotypic expressions.
A T C C G A
A T G C C G A
Insertion
Common Mutation Types
Substitution
AATTCGCA
AATGCGCA
Deletion
AATTCGCA
AATCGCA
Insertion
AATCGCA
AATTCGCA
Duplication
AATCGCA
AATCATCGCA
Inversion
AATCGCA AACGGCA
AGCATCG ACTATCG
2. Gene Duplication
Duplication of Genes, Homolog, Ortholog, Paralogs
Gene Duplication
Gene duplication (or chromosomal duplication or
gene amplification) is a major mechanism
through which new genetic material is generated
during molecular evolution. It can be defined as
any duplication of a region of DNA that contains a
gene.
Homolog, Ortholog, Paralog and Speciation
• Homolog - A gene related to a second
gene by descent from a common
ancestral DNA sequence
• Ortholog - Orthologs are genes in
different species that evolved from a
common ancestral gene by speciation*
• Paralog - Paralogs are genes related by
duplication within a genome
• Speciation* - Speciation is the origin of a
new species capable of making a living in
a new way from the species from which
it arose
3. Read Mapping
Short Read Mapping, Genome Indexing
Read Mapping
Mapping refers to the process of
aligning short reads to and finding
the starting position in a reference
sequence (typically Genome).
Short read generally are reads with a
length of 30-350 base pairs.
Genome Indexing (Keyword Tree)
▹ Stores a set of keywords in a rooted labeled
tree.
▹ Each edge is labeled with a letter from an
alphabet.
▹ Any two edges coming out of the same
vertex have distinct labels.
▹ Every keyword stored can be spelled on a
path from root to some leaf.
▹ Furthermore, every path from root to leaf
gives a keyword.
Keywords
▹ Apple
▹ Apropos
▹ Banana
▹ Bandana
▹ Orange
Genome Indexing (Suffix Tree)
▹ Similar to Keyword Tree
▹ Suffixes of the text are keywords
▹ Edges that form paths are collapsed
▹ Each edge is labeled with a substring of the
text
▹ All internal edges have at least two outgoing
edges.
▹ Leaves are labeled by the index of the
pattern.
Suffix tree of ATCATG
Genome Indexing (Suffix Array)
▹ More space efficient than suffix tree
▹ Suffix tree index for human genome is
about 47 GB
▹ Lexicographically sort all the suffixes
▹ Store the starting indices of the suffixes
along with the original string
Generate Suffix Array of
ATCATG
1 ATCATG$
2 TCATG$
3 CATG$
4 ATG$
5 TG$
6 G$
7 $
Sort the suffixes
lexicographically
7 $
1 ATCATG$
4 ATG$
3 CATG$
6 G$
2 TCATG$
5 TG$
Genome Indexing (Burrows Wheeler Transform)
▹ Given Sequence – abaaba
▹ Add $ as ending notation –
abaaba$
▹ By Shifting each alphabet to the
right once, generate all the
rotations
▹ Lexicographically Sort all the
rotations
▹ The very last column will be
denoted as BWT (T)
Genome Indexing (Burrows Wheeler Transform)
▹ Given Sequence – abaaba
▹ Add $ as ending notation –
abaaba$
▹ Lexicographically sorted all
rotations will generate BWT
Matrix which will be denoted as
BWM (T)
▹ Suffix Array generated from all
the rotations will be called SA (T)
▹ BWM can be derived from any
given BWT (T)
Genome Indexing (Burrows Wheeler Transform)
LF (Last to First) Mapping
▹ Generate Burrows Wheeler
Matrix for a given sequence
▹ Assign numbers to distinguish
same characters
▹ Assign the numbers in a
ascending manner for each
character
Genome Indexing (Burrows Wheeler Transform)
Find out the row starting with b1 using LF Mapping
1. Start from the row containing $ in the First Column
2. Find out what’s in Last Column of that row (here its a0)
3. Compare it with query (b1)
4. If MATCH, then
- Find b1 in First Column
- Print row number
- Terminate
5. If No MATCH, then
- Find the row with that element in the First column
- Go to Step 2 and Repeat
Start
Genome Indexing (Burrows Wheeler Transform)
Find Original Gene using LF Mapping if
BWT (T) is Given
1. Original Gene = abaaba (Not Given)
2. Given BWT (T) = abba$aa
3. Store it as Last Column
4. Draw the First Column by sorting
the elements of Last Column
Lexicographically
5. Assign numbers to distinguish
characters in an ascending manner
6. Start LF Mapping from Starting
Element ($)
7. For each element found in the
LAST column, write it from right to
left
$ a0
a0 b0
a1 b1
a2 a1
a3 $
b0 a2
b1 a3
F L
$
a
b
a
a
b
a
FINISH
Start
Whales and Dolphins
Their ancestors had back legs once, they could walk
Humans have tails
While they are inside the womb! It dissolves
eventually.
Birds came from Dinosaurs
And they both descended from Reptiles
Bacterium
All livings beings can be traced back to a
bacterium

More Related Content

What's hot

Transposones
TransposonesTransposones
Transposones
microbiology Notes
 
Molecular clock, Neutral hypothesis
Molecular clock, Neutral hypothesisMolecular clock, Neutral hypothesis
Molecular clock, Neutral hypothesis
Freya Cardozo
 
Regulation of gene expression in prokaryotes
Regulation of gene expression in prokaryotesRegulation of gene expression in prokaryotes
Regulation of gene expression in prokaryotes
Biswajit Sahoo
 
Tryptophan operon
Tryptophan operonTryptophan operon
Tryptophan operon
devadevi666
 
Entrez databases
Entrez databasesEntrez databases
Entrez databases
Hafiz Muhammad Zeeshan Raza
 
Linkage mapping
Linkage mappingLinkage mapping
Linkage mapping
SnehaSahu20
 
Transposable elements in Maize And Drosophila
Transposable elements in Maize And DrosophilaTransposable elements in Maize And Drosophila
Transposable elements in Maize And Drosophila
Subhradeep sarkar
 
Transposons in drosophila - P element
Transposons in drosophila - P elementTransposons in drosophila - P element
Transposons in drosophila - P element
Jaserah Syed
 
Eukaryotic Chromosome Organisation
Eukaryotic Chromosome OrganisationEukaryotic Chromosome Organisation
Eukaryotic Chromosome Organisation
Meera C R
 
5’ capping
5’ capping5’ capping
5’ capping
EmaSushan
 
Gene regulation in prokaryotes
Gene regulation in prokaryotesGene regulation in prokaryotes
Gene regulation in prokaryotes
Neha Agarwal
 
Express sequence tags
Express sequence tagsExpress sequence tags
Express sequence tags
Dhananjay Desai
 
Dna methylation
Dna methylationDna methylation
Dna methylation
AnuKiruthika
 
Second genetic code overlapping and split genes
Second genetic code overlapping and split genesSecond genetic code overlapping and split genes
Second genetic code overlapping and split genes
gohil sanjay bhagvanji
 
Operon
Operon Operon
Operon
sijiskariah
 
Transcription Regulation
Transcription Regulation Transcription Regulation
Transcription Regulation IshaqueAbdulla
 
Lac operon
Lac operonLac operon
Lac operon
Tapeshwar Yadav
 
Transcription in eukaryotes
Transcription in eukaryotesTranscription in eukaryotes
Transcription in eukaryotes
Hemantkrdu
 
Transposons ppt
Transposons pptTransposons ppt
Transposons ppt
Zeeshan Ahmed
 

What's hot (20)

Transposones
TransposonesTransposones
Transposones
 
Molecular clock, Neutral hypothesis
Molecular clock, Neutral hypothesisMolecular clock, Neutral hypothesis
Molecular clock, Neutral hypothesis
 
Dna methylation
Dna methylationDna methylation
Dna methylation
 
Regulation of gene expression in prokaryotes
Regulation of gene expression in prokaryotesRegulation of gene expression in prokaryotes
Regulation of gene expression in prokaryotes
 
Tryptophan operon
Tryptophan operonTryptophan operon
Tryptophan operon
 
Entrez databases
Entrez databasesEntrez databases
Entrez databases
 
Linkage mapping
Linkage mappingLinkage mapping
Linkage mapping
 
Transposable elements in Maize And Drosophila
Transposable elements in Maize And DrosophilaTransposable elements in Maize And Drosophila
Transposable elements in Maize And Drosophila
 
Transposons in drosophila - P element
Transposons in drosophila - P elementTransposons in drosophila - P element
Transposons in drosophila - P element
 
Eukaryotic Chromosome Organisation
Eukaryotic Chromosome OrganisationEukaryotic Chromosome Organisation
Eukaryotic Chromosome Organisation
 
5’ capping
5’ capping5’ capping
5’ capping
 
Gene regulation in prokaryotes
Gene regulation in prokaryotesGene regulation in prokaryotes
Gene regulation in prokaryotes
 
Express sequence tags
Express sequence tagsExpress sequence tags
Express sequence tags
 
Dna methylation
Dna methylationDna methylation
Dna methylation
 
Second genetic code overlapping and split genes
Second genetic code overlapping and split genesSecond genetic code overlapping and split genes
Second genetic code overlapping and split genes
 
Operon
Operon Operon
Operon
 
Transcription Regulation
Transcription Regulation Transcription Regulation
Transcription Regulation
 
Lac operon
Lac operonLac operon
Lac operon
 
Transcription in eukaryotes
Transcription in eukaryotesTranscription in eukaryotes
Transcription in eukaryotes
 
Transposons ppt
Transposons pptTransposons ppt
Transposons ppt
 

Similar to Lecture-7-Gene Duplication (1).pptx

Lecture 5
Lecture 5Lecture 5
Lecture 5
Owali Shawon
 
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation OverviewPathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema
 
phylogenetics.pdf
phylogenetics.pdfphylogenetics.pdf
phylogenetics.pdf
SrimathideviJ
 
RNA-Seq_Presentation
RNA-Seq_PresentationRNA-Seq_Presentation
RNA-Seq_PresentationToyin23
 
Cg7 trees
Cg7 treesCg7 trees
Cg7 trees
Anasua Sarkar
 
PHYLOGENETIC ANALYSIS_CSS2.pptx
PHYLOGENETIC ANALYSIS_CSS2.pptxPHYLOGENETIC ANALYSIS_CSS2.pptx
PHYLOGENETIC ANALYSIS_CSS2.pptx
Silpa87
 
Bioinformatica 08-12-2011-t8-go-hmm
Bioinformatica 08-12-2011-t8-go-hmmBioinformatica 08-12-2011-t8-go-hmm
Bioinformatica 08-12-2011-t8-go-hmm
Prof. Wim Van Criekinge
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
Nawfal Aldujaily
 
Maximum parsimony
Maximum parsimonyMaximum parsimony
Maximum parsimony
Shruthi Krishnaswamy
 
Marker devt. workshop 27022012
Marker devt. workshop 27022012Marker devt. workshop 27022012
Marker devt. workshop 27022012
Koppolu Ravi
 
Molecular phylogenetics
Molecular phylogeneticsMolecular phylogenetics
Molecular phylogenetics
Ajay Kumar Chandra
 
Genome rearrangement
Genome rearrangementGenome rearrangement
Genome rearrangement
Pinky Vincent
 
Rna seq pipeline
Rna seq pipelineRna seq pipeline
Rna seq pipeline
Karan Veer Singh
 
SyMAP Master's Thesis Presentation
SyMAP Master's Thesis PresentationSyMAP Master's Thesis Presentation
SyMAP Master's Thesis Presentation
austinps
 
Exploring the Solution Space of Sorting by Reversals: A New Approach
Exploring the Solution Space of Sorting by Reversals: A New ApproachExploring the Solution Space of Sorting by Reversals: A New Approach
Exploring the Solution Space of Sorting by Reversals: A New Approach
IDES Editor
 
Apollo Introduction for the Chestnut Research Community
Apollo Introduction for the Chestnut Research CommunityApollo Introduction for the Chestnut Research Community
Apollo Introduction for the Chestnut Research Community
Monica Munoz-Torres
 
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
Torsten Seemann
 
Bioinformatica 20-10-2011-t3-scoring matrices
Bioinformatica 20-10-2011-t3-scoring matricesBioinformatica 20-10-2011-t3-scoring matrices
Bioinformatica 20-10-2011-t3-scoring matrices
Prof. Wim Van Criekinge
 
Practicum Pressentation PDF
Practicum Pressentation PDFPracticum Pressentation PDF
Practicum Pressentation PDFGui Chen
 
RNASeq Experiment Design
RNASeq Experiment DesignRNASeq Experiment Design
RNASeq Experiment Design
Yaoyu Wang
 

Similar to Lecture-7-Gene Duplication (1).pptx (20)

Lecture 5
Lecture 5Lecture 5
Lecture 5
 
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation OverviewPathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
 
phylogenetics.pdf
phylogenetics.pdfphylogenetics.pdf
phylogenetics.pdf
 
RNA-Seq_Presentation
RNA-Seq_PresentationRNA-Seq_Presentation
RNA-Seq_Presentation
 
Cg7 trees
Cg7 treesCg7 trees
Cg7 trees
 
PHYLOGENETIC ANALYSIS_CSS2.pptx
PHYLOGENETIC ANALYSIS_CSS2.pptxPHYLOGENETIC ANALYSIS_CSS2.pptx
PHYLOGENETIC ANALYSIS_CSS2.pptx
 
Bioinformatica 08-12-2011-t8-go-hmm
Bioinformatica 08-12-2011-t8-go-hmmBioinformatica 08-12-2011-t8-go-hmm
Bioinformatica 08-12-2011-t8-go-hmm
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Maximum parsimony
Maximum parsimonyMaximum parsimony
Maximum parsimony
 
Marker devt. workshop 27022012
Marker devt. workshop 27022012Marker devt. workshop 27022012
Marker devt. workshop 27022012
 
Molecular phylogenetics
Molecular phylogeneticsMolecular phylogenetics
Molecular phylogenetics
 
Genome rearrangement
Genome rearrangementGenome rearrangement
Genome rearrangement
 
Rna seq pipeline
Rna seq pipelineRna seq pipeline
Rna seq pipeline
 
SyMAP Master's Thesis Presentation
SyMAP Master's Thesis PresentationSyMAP Master's Thesis Presentation
SyMAP Master's Thesis Presentation
 
Exploring the Solution Space of Sorting by Reversals: A New Approach
Exploring the Solution Space of Sorting by Reversals: A New ApproachExploring the Solution Space of Sorting by Reversals: A New Approach
Exploring the Solution Space of Sorting by Reversals: A New Approach
 
Apollo Introduction for the Chestnut Research Community
Apollo Introduction for the Chestnut Research CommunityApollo Introduction for the Chestnut Research Community
Apollo Introduction for the Chestnut Research Community
 
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
 
Bioinformatica 20-10-2011-t3-scoring matrices
Bioinformatica 20-10-2011-t3-scoring matricesBioinformatica 20-10-2011-t3-scoring matrices
Bioinformatica 20-10-2011-t3-scoring matrices
 
Practicum Pressentation PDF
Practicum Pressentation PDFPracticum Pressentation PDF
Practicum Pressentation PDF
 
RNASeq Experiment Design
RNASeq Experiment DesignRNASeq Experiment Design
RNASeq Experiment Design
 

Recently uploaded

Daan Park Hydrangea flower season I like it
Daan Park Hydrangea flower season I like itDaan Park Hydrangea flower season I like it
Daan Park Hydrangea flower season I like it
a0966109726
 
Drip Irrigation technology with solar power
Drip Irrigation technology with solar powerDrip Irrigation technology with solar power
Drip Irrigation technology with solar power
anikchanda4
 
Wildlife-AnIntroduction.pdf so that you know more about our environment
Wildlife-AnIntroduction.pdf so that you know more about our environmentWildlife-AnIntroduction.pdf so that you know more about our environment
Wildlife-AnIntroduction.pdf so that you know more about our environment
amishajha2407
 
Characterization and the Kinetics of drying at the drying oven and with micro...
Characterization and the Kinetics of drying at the drying oven and with micro...Characterization and the Kinetics of drying at the drying oven and with micro...
Characterization and the Kinetics of drying at the drying oven and with micro...
Open Access Research Paper
 
UNDERSTANDING WHAT GREEN WASHING IS!.pdf
UNDERSTANDING WHAT GREEN WASHING IS!.pdfUNDERSTANDING WHAT GREEN WASHING IS!.pdf
UNDERSTANDING WHAT GREEN WASHING IS!.pdf
JulietMogola
 
Improving the viability of probiotics by encapsulation methods for developmen...
Improving the viability of probiotics by encapsulation methods for developmen...Improving the viability of probiotics by encapsulation methods for developmen...
Improving the viability of probiotics by encapsulation methods for developmen...
Open Access Research Paper
 
Peatland Management in Indonesia, Science to Policy and Knowledge Education
Peatland Management in Indonesia, Science to Policy and Knowledge EducationPeatland Management in Indonesia, Science to Policy and Knowledge Education
Peatland Management in Indonesia, Science to Policy and Knowledge Education
Global Landscapes Forum (GLF)
 
How about Huawei mobile phone-www.cfye-commerce.shop
How about Huawei mobile phone-www.cfye-commerce.shopHow about Huawei mobile phone-www.cfye-commerce.shop
How about Huawei mobile phone-www.cfye-commerce.shop
laozhuseo02
 
Celebrating World-environment-day-2024.pdf
Celebrating  World-environment-day-2024.pdfCelebrating  World-environment-day-2024.pdf
Celebrating World-environment-day-2024.pdf
rohankumarsinghrore1
 
一比一原版(UMTC毕业证书)明尼苏达大学双城分校毕业证如何办理
一比一原版(UMTC毕业证书)明尼苏达大学双城分校毕业证如何办理一比一原版(UMTC毕业证书)明尼苏达大学双城分校毕业证如何办理
一比一原版(UMTC毕业证书)明尼苏达大学双城分校毕业证如何办理
zm9ajxup
 
Top 8 Strategies for Effective Sustainable Waste Management.pdf
Top 8 Strategies for Effective Sustainable Waste Management.pdfTop 8 Strategies for Effective Sustainable Waste Management.pdf
Top 8 Strategies for Effective Sustainable Waste Management.pdf
Jhon Wick
 
Promoting Multilateral Cooperation for Sustainable Peatland management
Promoting Multilateral Cooperation for Sustainable Peatland managementPromoting Multilateral Cooperation for Sustainable Peatland management
Promoting Multilateral Cooperation for Sustainable Peatland management
Global Landscapes Forum (GLF)
 
Epcon is One of the World's leading Manufacturing Companies.
Epcon is One of the World's leading Manufacturing Companies.Epcon is One of the World's leading Manufacturing Companies.
Epcon is One of the World's leading Manufacturing Companies.
EpconLP
 
different Modes of Insect Plant Interaction
different Modes of Insect Plant Interactiondifferent Modes of Insect Plant Interaction
different Modes of Insect Plant Interaction
Archita Das
 
Peatlands of Latin America and the Caribbean
Peatlands of Latin America and the CaribbeanPeatlands of Latin America and the Caribbean
Peatlands of Latin America and the Caribbean
Global Landscapes Forum (GLF)
 
Enhanced action and stakeholder engagement for sustainable peatland management
Enhanced action and stakeholder engagement for sustainable peatland managementEnhanced action and stakeholder engagement for sustainable peatland management
Enhanced action and stakeholder engagement for sustainable peatland management
Global Landscapes Forum (GLF)
 
"Understanding the Carbon Cycle: Processes, Human Impacts, and Strategies for...
"Understanding the Carbon Cycle: Processes, Human Impacts, and Strategies for..."Understanding the Carbon Cycle: Processes, Human Impacts, and Strategies for...
"Understanding the Carbon Cycle: Processes, Human Impacts, and Strategies for...
MMariSelvam4
 
AGRICULTURE Hydrophonic FERTILISER PPT.pptx
AGRICULTURE Hydrophonic FERTILISER PPT.pptxAGRICULTURE Hydrophonic FERTILISER PPT.pptx
AGRICULTURE Hydrophonic FERTILISER PPT.pptx
BanitaDsouza
 
Overview of the Global Peatlands Assessment
Overview of the Global Peatlands AssessmentOverview of the Global Peatlands Assessment
Overview of the Global Peatlands Assessment
Global Landscapes Forum (GLF)
 
Improving the Management of Peatlands and the Capacities of Stakeholders in I...
Improving the Management of Peatlands and the Capacities of Stakeholders in I...Improving the Management of Peatlands and the Capacities of Stakeholders in I...
Improving the Management of Peatlands and the Capacities of Stakeholders in I...
Global Landscapes Forum (GLF)
 

Recently uploaded (20)

Daan Park Hydrangea flower season I like it
Daan Park Hydrangea flower season I like itDaan Park Hydrangea flower season I like it
Daan Park Hydrangea flower season I like it
 
Drip Irrigation technology with solar power
Drip Irrigation technology with solar powerDrip Irrigation technology with solar power
Drip Irrigation technology with solar power
 
Wildlife-AnIntroduction.pdf so that you know more about our environment
Wildlife-AnIntroduction.pdf so that you know more about our environmentWildlife-AnIntroduction.pdf so that you know more about our environment
Wildlife-AnIntroduction.pdf so that you know more about our environment
 
Characterization and the Kinetics of drying at the drying oven and with micro...
Characterization and the Kinetics of drying at the drying oven and with micro...Characterization and the Kinetics of drying at the drying oven and with micro...
Characterization and the Kinetics of drying at the drying oven and with micro...
 
UNDERSTANDING WHAT GREEN WASHING IS!.pdf
UNDERSTANDING WHAT GREEN WASHING IS!.pdfUNDERSTANDING WHAT GREEN WASHING IS!.pdf
UNDERSTANDING WHAT GREEN WASHING IS!.pdf
 
Improving the viability of probiotics by encapsulation methods for developmen...
Improving the viability of probiotics by encapsulation methods for developmen...Improving the viability of probiotics by encapsulation methods for developmen...
Improving the viability of probiotics by encapsulation methods for developmen...
 
Peatland Management in Indonesia, Science to Policy and Knowledge Education
Peatland Management in Indonesia, Science to Policy and Knowledge EducationPeatland Management in Indonesia, Science to Policy and Knowledge Education
Peatland Management in Indonesia, Science to Policy and Knowledge Education
 
How about Huawei mobile phone-www.cfye-commerce.shop
How about Huawei mobile phone-www.cfye-commerce.shopHow about Huawei mobile phone-www.cfye-commerce.shop
How about Huawei mobile phone-www.cfye-commerce.shop
 
Celebrating World-environment-day-2024.pdf
Celebrating  World-environment-day-2024.pdfCelebrating  World-environment-day-2024.pdf
Celebrating World-environment-day-2024.pdf
 
一比一原版(UMTC毕业证书)明尼苏达大学双城分校毕业证如何办理
一比一原版(UMTC毕业证书)明尼苏达大学双城分校毕业证如何办理一比一原版(UMTC毕业证书)明尼苏达大学双城分校毕业证如何办理
一比一原版(UMTC毕业证书)明尼苏达大学双城分校毕业证如何办理
 
Top 8 Strategies for Effective Sustainable Waste Management.pdf
Top 8 Strategies for Effective Sustainable Waste Management.pdfTop 8 Strategies for Effective Sustainable Waste Management.pdf
Top 8 Strategies for Effective Sustainable Waste Management.pdf
 
Promoting Multilateral Cooperation for Sustainable Peatland management
Promoting Multilateral Cooperation for Sustainable Peatland managementPromoting Multilateral Cooperation for Sustainable Peatland management
Promoting Multilateral Cooperation for Sustainable Peatland management
 
Epcon is One of the World's leading Manufacturing Companies.
Epcon is One of the World's leading Manufacturing Companies.Epcon is One of the World's leading Manufacturing Companies.
Epcon is One of the World's leading Manufacturing Companies.
 
different Modes of Insect Plant Interaction
different Modes of Insect Plant Interactiondifferent Modes of Insect Plant Interaction
different Modes of Insect Plant Interaction
 
Peatlands of Latin America and the Caribbean
Peatlands of Latin America and the CaribbeanPeatlands of Latin America and the Caribbean
Peatlands of Latin America and the Caribbean
 
Enhanced action and stakeholder engagement for sustainable peatland management
Enhanced action and stakeholder engagement for sustainable peatland managementEnhanced action and stakeholder engagement for sustainable peatland management
Enhanced action and stakeholder engagement for sustainable peatland management
 
"Understanding the Carbon Cycle: Processes, Human Impacts, and Strategies for...
"Understanding the Carbon Cycle: Processes, Human Impacts, and Strategies for..."Understanding the Carbon Cycle: Processes, Human Impacts, and Strategies for...
"Understanding the Carbon Cycle: Processes, Human Impacts, and Strategies for...
 
AGRICULTURE Hydrophonic FERTILISER PPT.pptx
AGRICULTURE Hydrophonic FERTILISER PPT.pptxAGRICULTURE Hydrophonic FERTILISER PPT.pptx
AGRICULTURE Hydrophonic FERTILISER PPT.pptx
 
Overview of the Global Peatlands Assessment
Overview of the Global Peatlands AssessmentOverview of the Global Peatlands Assessment
Overview of the Global Peatlands Assessment
 
Improving the Management of Peatlands and the Capacities of Stakeholders in I...
Improving the Management of Peatlands and the Capacities of Stakeholders in I...Improving the Management of Peatlands and the Capacities of Stakeholders in I...
Improving the Management of Peatlands and the Capacities of Stakeholders in I...
 

Lecture-7-Gene Duplication (1).pptx

  • 1. Gene Duplication and Read Mapping Lecture – 7 Department of CSE, DIU
  • 2. CONTENTS 1. Mutation 2. Gene Duplication 3. Read Mapping - Keyword Tree - Suffix Tree - Suffix Array - Burrows Wheeler Transform
  • 3. 1. DNA Mutation What and how mutation occurs, common forms
  • 4. Mutation DNA Mutation refers to sudden, random changes in DNA sequences which leads to different phenotypic expressions. A T C C G A A T G C C G A Insertion
  • 6. 2. Gene Duplication Duplication of Genes, Homolog, Ortholog, Paralogs
  • 7. Gene Duplication Gene duplication (or chromosomal duplication or gene amplification) is a major mechanism through which new genetic material is generated during molecular evolution. It can be defined as any duplication of a region of DNA that contains a gene.
  • 8. Homolog, Ortholog, Paralog and Speciation • Homolog - A gene related to a second gene by descent from a common ancestral DNA sequence • Ortholog - Orthologs are genes in different species that evolved from a common ancestral gene by speciation* • Paralog - Paralogs are genes related by duplication within a genome • Speciation* - Speciation is the origin of a new species capable of making a living in a new way from the species from which it arose
  • 9. 3. Read Mapping Short Read Mapping, Genome Indexing
  • 10. Read Mapping Mapping refers to the process of aligning short reads to and finding the starting position in a reference sequence (typically Genome). Short read generally are reads with a length of 30-350 base pairs.
  • 11. Genome Indexing (Keyword Tree) ▹ Stores a set of keywords in a rooted labeled tree. ▹ Each edge is labeled with a letter from an alphabet. ▹ Any two edges coming out of the same vertex have distinct labels. ▹ Every keyword stored can be spelled on a path from root to some leaf. ▹ Furthermore, every path from root to leaf gives a keyword. Keywords ▹ Apple ▹ Apropos ▹ Banana ▹ Bandana ▹ Orange
  • 12. Genome Indexing (Suffix Tree) ▹ Similar to Keyword Tree ▹ Suffixes of the text are keywords ▹ Edges that form paths are collapsed ▹ Each edge is labeled with a substring of the text ▹ All internal edges have at least two outgoing edges. ▹ Leaves are labeled by the index of the pattern. Suffix tree of ATCATG
  • 13. Genome Indexing (Suffix Array) ▹ More space efficient than suffix tree ▹ Suffix tree index for human genome is about 47 GB ▹ Lexicographically sort all the suffixes ▹ Store the starting indices of the suffixes along with the original string Generate Suffix Array of ATCATG 1 ATCATG$ 2 TCATG$ 3 CATG$ 4 ATG$ 5 TG$ 6 G$ 7 $ Sort the suffixes lexicographically 7 $ 1 ATCATG$ 4 ATG$ 3 CATG$ 6 G$ 2 TCATG$ 5 TG$
  • 14. Genome Indexing (Burrows Wheeler Transform) ▹ Given Sequence – abaaba ▹ Add $ as ending notation – abaaba$ ▹ By Shifting each alphabet to the right once, generate all the rotations ▹ Lexicographically Sort all the rotations ▹ The very last column will be denoted as BWT (T)
  • 15. Genome Indexing (Burrows Wheeler Transform) ▹ Given Sequence – abaaba ▹ Add $ as ending notation – abaaba$ ▹ Lexicographically sorted all rotations will generate BWT Matrix which will be denoted as BWM (T) ▹ Suffix Array generated from all the rotations will be called SA (T) ▹ BWM can be derived from any given BWT (T)
  • 16. Genome Indexing (Burrows Wheeler Transform) LF (Last to First) Mapping ▹ Generate Burrows Wheeler Matrix for a given sequence ▹ Assign numbers to distinguish same characters ▹ Assign the numbers in a ascending manner for each character
  • 17. Genome Indexing (Burrows Wheeler Transform) Find out the row starting with b1 using LF Mapping 1. Start from the row containing $ in the First Column 2. Find out what’s in Last Column of that row (here its a0) 3. Compare it with query (b1) 4. If MATCH, then - Find b1 in First Column - Print row number - Terminate 5. If No MATCH, then - Find the row with that element in the First column - Go to Step 2 and Repeat Start
  • 18. Genome Indexing (Burrows Wheeler Transform) Find Original Gene using LF Mapping if BWT (T) is Given 1. Original Gene = abaaba (Not Given) 2. Given BWT (T) = abba$aa 3. Store it as Last Column 4. Draw the First Column by sorting the elements of Last Column Lexicographically 5. Assign numbers to distinguish characters in an ascending manner 6. Start LF Mapping from Starting Element ($) 7. For each element found in the LAST column, write it from right to left $ a0 a0 b0 a1 b1 a2 a1 a3 $ b0 a2 b1 a3 F L $ a b a a b a FINISH Start
  • 19. Whales and Dolphins Their ancestors had back legs once, they could walk Humans have tails While they are inside the womb! It dissolves eventually. Birds came from Dinosaurs And they both descended from Reptiles Bacterium All livings beings can be traced back to a bacterium