LTR-Retrotransposons of the Chimpanzee Genome
Abhishek Dabral
Introduction
► The chimpanzee genome was downloaded from the Ensembl database.
► The chimp genome was mined for long terminal repeat (LTR) retrotransposons using a data mining program, LTR_STRUC, in conjunction with conventional techniques.
Flow chart For Identification Of LTR Retrotransposons
[Figure: flow chart with the following elements: Genome Sequence; Structure based element prediction (e.g. LTR_STRUC); Element Set; Similarity searches (e.g. BLAST); Exhaustive similarity searches (e.g. BLAST); Sequence Analysis (e.g. ClustalX); Phylogenetic Analysis (e.g. MEGA); Families; Genome Mapping (e.g. BLAT); Gene-element Associations; Multiple Alignments (e.g. ClustalW); Consensus; Specialized Databases (e.g. Repbase); General Databases (e.g. NCBI).]
Fig 2: Workflow for identification of LTR retrotransposons in a genome.
Rectangles represent data. Diamonds represent steps involving the usage of bioinformatics tools. Solid arrows represent the general flow of information. Gray arrows represent refinements. Black arrows indicate the alternate flow of information. Numbers indicate the order of steps. FB: feedback.
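The "Multiple Alignments" to "Consensus" step of the workflow can be illustrated with a small sketch. It assumes the sequences are already aligned (for example by ClustalW) and builds a simple majority-rule consensus. The function name `majority_consensus` and the tie-breaking rule are illustrative assumptions, not the method used in the slides.

```python
from collections import Counter


def majority_consensus(aligned, gap="-"):
    """Return a majority-rule consensus of equal-length aligned sequences.

    Assumption (not from the slides): each column contributes its most
    common symbol; columns where the gap symbol is most common are skipped.
    Ties go to the symbol seen first in the column.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")

    consensus = []
    for column in zip(*aligned):
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != gap:
            consensus.append(symbol)
    return "".join(consensus)


if __name__ == "__main__":
    # Toy alignment, invented purely for illustration.
    print(majority_consensus(["ACG-T", "ACGAT", "ATG-T"]))
```

A plain majority vote is the simplest choice; real consensus builders usually weight sequences or apply a minimum-frequency threshold per column.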
LTR_STRUC
► Searches for the presence of long terminal repeats on either side of an element within a certain range of base pairs.
► If the LTRs are found, it then searches for the presence of other characteristic features of LTR retrotransposons between the two putative LTRs.
► Assigns a score from 0 to 2.0 to the hits depending on the presence of the characteristic features.
► Reports all hits with a score above 0.3.
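To make the idea of a structure-based search concrete, here is a minimal sketch of the first step only: looking for a pair of matching sequence stretches separated by a distance within a given range, and measuring the identity of two equal-length candidate LTRs. This is not the LTR_STRUC algorithm; the seed length, the distance limits, and the function names are assumptions chosen for a toy example.

```python
from collections import defaultdict


def find_repeat_pairs(seq, seed_len=8, min_gap=20, max_gap=200):
    """Return sorted (i, j) start positions of identical seeds whose
    spacing j - i lies within [min_gap, max_gap].

    Toy stand-in for "repeats on either side of an element within a
    certain range of base pairs"; all parameter values are hypothetical.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - seed_len + 1):
        positions[seq[i:i + seed_len]].append(i)

    pairs = []
    for starts in positions.values():
        for k, left in enumerate(starts):
            for right in starts[k + 1:]:
                if min_gap <= right - left <= max_gap:
                    pairs.append((left, right))
    return sorted(pairs)


def percent_identity(a, b):
    """Percentage of matching positions between two equal-length strings."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Invented toy element: the same 12 bp "LTR" on both sides of a 30 bp body.
    ltr = "ACGTTGCAAGCT"
    body = "TTAGGCATCCGATAGGCTAACTGGATCCTA"
    element = ltr + body + ltr
    print(find_repeat_pairs(element))
    print(percent_identity(ltr, ltr))
```

Exact seed matching keeps the sketch short; diverged LTRs would require an alignment-based comparison instead.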
LTR_STRUC on Chimp Genome
• Total number of hits above score of 0.3: 2056 (LTR identity: 43.5–99.6%)
    Without RT | With RT
    1959 | 97 (LTR identity: 71.4–99.4%)
        Score < 0.7 | Score > 0.7
        42 | 55 (LTR identity: 71.4–99.4%)
            32 | 23 (LTR identity:
            RT conserved motifs present | RT conserved motifs absent
• No correlation between LTR identity and score
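The successive splits on this slide, and the check for a relationship between LTR identity and score, can be sketched as follows. The `Hit` record, its field names, and the handling of a score of exactly 0.7 are assumptions made for illustration; only the cutoffs 0.3 and 0.7 and the counts in the final check come from the slide.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Hit:
    score: float         # LTR_STRUC score (0 to 2.0 per the slides)
    ltr_identity: float  # percent identity between the two LTRs
    has_rt: bool         # whether an RT was found in the element


def partition(hits, report_cutoff=0.3, score_cutoff=0.7):
    """Split hits the way the slide does: reported, with/without RT, then by score."""
    reported = [h for h in hits if h.score > report_cutoff]
    with_rt = [h for h in reported if h.has_rt]
    return {
        "reported": reported,
        "without_rt": [h for h in reported if not h.has_rt],
        "with_rt": with_rt,
        # Assumption: a score of exactly 0.7 goes to the lower group.
        "low_score": [h for h in with_rt if h.score <= score_cutoff],
        "high_score": [h for h in with_rt if h.score > score_cutoff],
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Consistency check of the counts reported on the slide.
assert 1959 + 97 == 2056
assert 42 + 55 == 97
assert 32 + 23 == 55

if __name__ == "__main__":
    # Invented toy hits, not data from the study.
    hits = [Hit(0.2, 90.0, True), Hit(0.5, 80.0, False),
            Hit(0.5, 85.0, True), Hit(0.9, 95.0, True)]
    groups = partition(hits)
    print({name: len(group) for name, group in groups.items()})
    reported = groups["reported"]
    print(pearson([h.score for h in reported],
                  [h.ltr_identity for h in reported]))
```

A coefficient near zero on the real hit table would correspond to the slide's statement that LTR identity and score are not correlated; the slides do not say which statistic was actually used.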
Identification Of RT encoding ORF
• The 23 RTs identified are subjected to sequence analysis to determine the RT-encoding open reading frame (ORF) from the 3 ORFs given by LTR_STRUC.
Briefly, the sequence analysis involves:
• The amino acid sequences from the three open reading frames of each RT are aligned with previously annotated RTs using ClustalX.
• They are checked for the presence of conserved RT domains, as described by Eickbush et al., to determine the ORF encoding the RT.
• The ClustalX-predicted RT-encoding ORF is further subjected to BLASTp searches against the NCBI non-redundant database using default parameters to confirm the prediction.
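As a rough illustration of picking the RT-encoding frame, the sketch below translates the three forward reading frames of a DNA sequence and reports which frames contain a YXDD motif, a well-known conserved motif of reverse transcriptases. This is a simplification made for the example: the analysis described above relies on ClustalX alignment against annotated RTs and on BLASTp, not on a single regular expression, and the function names here are hypothetical.

```python
import re

# Standard genetic code, built in TCAG order (NCBI translation table 1).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in _BASES for b in _BASES for c in _BASES), _AMINO))

# YXDD with any amino acid (not a stop) in the second position.
RT_MOTIF = re.compile(r"Y[A-Z]DD")


def translate(dna, offset=0):
    """Translate DNA from the given offset; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(offset, len(dna) - 2, 3))


def frames_with_rt_motif(dna):
    """Return the forward frame offsets (0, 1, 2) whose translation has YXDD."""
    return [offset for offset in range(3)
            if RT_MOTIF.search(translate(dna, offset))]


if __name__ == "__main__":
    # Invented toy sequence: TAT GTT GAT GAT (Y V D D) placed in frame 1.
    toy = "G" + "TATGTTGATGAT"
    for offset in range(3):
        print(offset, translate(toy, offset))
    print(frames_with_rt_motif(toy))
```

A motif scan like this can only shortlist frames; confirming the choice still needs the alignment and database search steps listed above.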
Phylogenetic Analysis Of The Initial RTs
[Figure: phylogenetic tree of the chimpanzee RT sequences (labelled Chimp1 to Chimp29) together with reference RT sequences, including HERV families (e.g. ERV9, HERV9, HervW, HervH, HERVE, HervK, HervS, HervL, HERV16) and other labelled sequences (e.g. FELV, GALV, GYPSY, BLV, SIV, HIV, RSV). Scale bar: 0.1. Clades are labelled Class 1, Class 2 and Class 3; two groups are marked "May be Chimp specific".]
