RNA secondary structure prediction based on multiple alignments
RNA evolution
● Homologous RNAs can share a common secondary structure without sharing significant sequence similarity
● A mutation in one base of a pair can be followed by a compensatory mutation in its partner, which maintains base-pairing complementarity
Comparative sequence analysis
● In a structurally correct multiple alignment of RNAs, conserved base pairs are often revealed by frequent correlated compensatory mutations
● Sequence covariation can be measured with mutual information
  ● $f_{x_i}$ is the frequency of character $x_i$ in column $i$, where $x_i$ is one of five possible characters: the four nucleotides or a gap
  ● $f_{x_i x_j}$ is the joint frequency of the pair $(x_i, x_j)$ observed in columns $i$ and $j$
$$M_{ij} = \sum_{x_i, x_j} f_{x_i x_j} \log_2 \frac{f_{x_i x_j}}{f_{x_i} \cdot f_{x_j}}$$
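The definition of $M_{ij}$ can be sketched in a few lines of Python. This is a minimal illustration, not code from the deck: the function name `mutual_information` and the 1-based column indexing are assumptions made here, and the gap character `-` is simply counted as a fifth symbol. The three-sequence alignment is the one used in the deck's mutual information example.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information in bits between columns i and j (1-based).

    alignment is a list of equal-length strings; a gap '-' is treated
    as an ordinary fifth character.
    """
    n = len(alignment)
    col_i = [seq[i - 1] for seq in alignment]
    col_j = [seq[j - 1] for seq in alignment]
    count_i = Counter(col_i)               # single-column counts
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))  # joint counts of observed pairs

    mi = 0.0
    for (a, b), count in count_ij.items():
        f_ab = count / n
        f_a = count_i[a] / n
        f_b = count_j[b] / n
        mi += f_ab * log2(f_ab / (f_a * f_b))
    return mi


alignment = ["GUCUGGAC", "GACUGGUC", "GGCUGGCC"]
print(round(mutual_information(alignment, 2, 7), 2))  # 1.58
print(mutual_information(alignment, 1, 8))            # 0.0
```

Only pairs that actually occur in the two columns contribute to the sum, so no special handling of zero frequencies is needed.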
Mutual information
G U C U G G A C
G A C U G G U C
G G C U G G C C
$$M_{ij} = \sum_{x_i, x_j} f_{x_i x_j} \log_2 \frac{f_{x_i x_j}}{f_{x_i} \cdot f_{x_j}}$$
$$M_{2,7} = 3 \cdot \left( \frac{1}{3} \cdot \log_2 \frac{1/3}{1/9} \right) = \log_2 3 \approx 1.58$$
● $M_{ij}$ is maximal when columns $i$ and $j$ each appear completely random but are perfectly correlated with each other
● If columns $i$ and $j$ are uncorrelated, the mutual information is 0
● If either column $i$ or column $j$ is highly conserved, we also get little or no mutual information
Mutual information
G U C U G G A C
G A C U G G U C
G G C U G G C C
G C C U G G G C
$$M_{2,7} = 4 \cdot \left( \frac{1}{4} \cdot \log_2 \frac{1/4}{1/16} \right) = 2$$
$$M_{1,8} = 1 \cdot \log_2 \frac{1}{1 \cdot 1} = 0$$
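One way to see why variable but perfectly correlated columns maximise $M_{ij}$, while uncorrelated or conserved columns score 0, is the standard entropy form of mutual information. The entropies $H_i$, $H_j$ and $H_{ij}$ are notation introduced here for illustration; they do not appear in the slides.

```latex
H_i = -\sum_{x_i} f_{x_i} \log_2 f_{x_i}, \qquad
H_{ij} = -\sum_{x_i, x_j} f_{x_i x_j} \log_2 f_{x_i x_j}

M_{ij} = H_i + H_j - H_{ij}, \qquad 0 \le M_{ij} \le \min(H_i, H_j)

% Four-sequence alignment, columns 2 and 7 (perfectly correlated, all four pairs seen once):
H_2 = H_7 = H_{2,7} = \log_2 4 = 2 \;\Rightarrow\; M_{2,7} = 2 + 2 - 2 = 2

% Columns 1 and 8 (both fully conserved):
H_1 = H_8 = H_{1,8} = 0 \;\Rightarrow\; M_{1,8} = 0
```

If the two columns are independent, $H_{ij} = H_i + H_j$ and the mutual information vanishes; if one column is conserved, its entropy is 0 and the upper bound $\min(H_i, H_j)$ forces $M_{ij} = 0$.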
Comparative analysis
● Start with a multiple alignment
● Predict the secondary structure based on the alignment
● Refine the alignment based on the secondary structure
● Repeat
● The sequences to be compared must be:
  ● sufficiently similar that they can be initially aligned by primary sequence
  ● sufficiently dissimilar that a number of co-varying substitutions can be detected
Comparative analysis
● How to build a secondary structure based on an alignment?
● Greedy method
  ● choose the pair of columns with the highest $M_{ij}$
  ● make a base pair
  ● carry on with the second highest $M_{ij}$
● Problem: columns might end up in more than one base pair
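The greedy method and its problem can be illustrated with a short sketch. The function name `greedy_pairs`, the `allow_reuse` flag and the $M_{ij}$ values below are all hypothetical, chosen only to show a column being selected twice. Skipping columns that are already paired is one possible workaround assumed here, not a method given in the slides.

```python
def greedy_pairs(mi, allow_reuse=False):
    """Pick column pairs in decreasing order of mutual information.

    mi maps a column pair (i, j), i < j, to its M_ij value.
    With allow_reuse=True this is the plain greedy method; otherwise a
    pair is skipped when one of its columns is already paired.
    """
    used = set()
    pairs = []
    ranked = sorted(mi.items(), key=lambda item: item[1], reverse=True)
    for (i, j), value in ranked:
        if value <= 0:
            break  # no covariation signal left
        if not allow_reuse and (i in used or j in used):
            continue
        pairs.append((i, j))
        used.update((i, j))
    return pairs


# Hypothetical M_ij values, for illustration only.
mi = {(1, 8): 1.9, (2, 7): 1.6, (2, 6): 1.5, (3, 5): 0.0}

print(greedy_pairs(mi, allow_reuse=True))  # [(1, 8), (2, 7), (2, 6)]
print(greedy_pairs(mi))                    # [(1, 8), (2, 7)]
```

In the first call column 2 ends up in two base pairs, which is the problem named above. The second call avoids that, but it still gives no guarantee that the chosen pairs are nested.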
Nussinov and alignments
● Notations
  ● aln: the RNA alignment
  ● aln_k: the k-th sequence in the alignment
  ● aln[i, j]: the RNA alignment from position i to j
  ● str: the best secondary structure for aln (over the alphabet {(, ), .})
  ● str[i, j]: the best secondary structure for aln[i, j]
  ● score[i, j]: the number of base pairs in str[i, j]
  ● aln[i] · aln[j]: columns i and j can pair, which holds if aln_k[i] · aln_k[j] for all k
Nussinov and alignments
str[i, j] is the best of four cases:
● i unpaired and str[i+1, j]
● j unpaired and str[i, j-1]
● aln[i] · aln[j] and str[i+1, j-1]
● str[i, k] and str[k+1, j] for some i < k < j
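The four cases can be turned into a small dynamic program. The sketch below makes several assumptions that the slides do not state: columns are 0-based, the pairing relation `·` covers A-U, G-C and G-U wobble pairs, a gap never pairs, and no minimum hairpin loop length is enforced. The names `columns_pair`, `nussinov_alignment` and `traceback` are introduced here for illustration.

```python
# Assumed pairing relation: Watson-Crick pairs plus G-U wobble; gaps never pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def columns_pair(aln, i, j):
    """aln[i] · aln[j]: every sequence has a valid pair in columns i and j."""
    return all((seq[i], seq[j]) in PAIRS for seq in aln)


def nussinov_alignment(aln):
    """Fill score[i][j], the maximum number of base pairs in columns i..j."""
    n = len(aln[0])
    score = [[0] * n for _ in range(n)]  # cells with i > j stay 0 (empty range)
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = max(score[i + 1][j], score[i][j - 1])         # i or j unpaired
            if columns_pair(aln, i, j):
                best = max(best, score[i + 1][j - 1] + 1)        # i pairs with j
            for k in range(i + 1, j):
                best = max(best, score[i][k] + score[k + 1][j])  # bifurcation
            score[i][j] = best
    return score


def traceback(aln, score):
    """Recover one optimal structure as a dot-bracket string."""
    n = len(aln[0])
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if score[i][j] == score[i + 1][j]:
            stack.append((i + 1, j))
        elif score[i][j] == score[i][j - 1]:
            stack.append((i, j - 1))
        elif columns_pair(aln, i, j) and score[i][j] == score[i + 1][j - 1] + 1:
            structure[i], structure[j] = "(", ")"
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if score[i][j] == score[i][k] + score[k + 1][j]:
                    stack.extend([(i, k), (k + 1, j)])
                    break
    return "".join(structure)


aln = ["GUCUGGAC", "GACUGGUC", "GGCUGGCC", "GCCUGGGC"]
score = nussinov_alignment(aln)
print(score[0][len(aln[0]) - 1], traceback(aln, score))  # 4 (((())))
```

Because no minimum loop length is imposed, the innermost pair here joins two adjacent columns. A practical implementation would usually require a few unpaired columns inside every hairpin.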
Nussinov and alignments
● Scoring base pairs
  ● on one sequence: +1
  ● on an alignment: +1 + $M_{ij}$
● Base pairs between columns with high mutual information are favoured
● Other scoring schemes?
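Written out as a recursion, the alignment scoring could look as follows. This is a sketch under two assumptions: the bonus $M_{ij}$ is simply added in the pairing case, and score[i, j] now denotes the accumulated pair score rather than a plain count of base pairs.

```latex
\mathrm{score}[i, j] = \max
\begin{cases}
  \mathrm{score}[i+1, j] \\
  \mathrm{score}[i, j-1] \\
  \mathrm{score}[i+1, j-1] + 1 + M_{ij}
    & \text{if } \mathrm{aln}[i] \cdot \mathrm{aln}[j] \\
  \max\limits_{i < k < j} \bigl( \mathrm{score}[i, k] + \mathrm{score}[k+1, j] \bigr)
\end{cases}
```

Setting $M_{ij} = 0$ recovers the single-sequence scoring of +1 per base pair.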
[Figure: three structure panels labelled "True", "Nussinov on alignment" and "Nussinov single"]
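From alignment structure to sequence structure
Alignment row: A C G - - A A - U
Alignment structure: two nested base pairs, (1 )1 enclosing (2 )2, and five unpaired columns
Sequence: A C G A A U
Sequence structure: base pair (1 )1 only, and four unpaired positions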
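The projection from an alignment structure to the structure of one sequence can be sketched as below. The exact column positions of the brackets did not survive in the slide text, so the dot-bracket string `((.....))` is an assumed placement that matches the counts on the slide (two nested pairs and five unpaired columns). The function name `project_structure` is also hypothetical.

```python
def project_structure(aligned_seq, aln_structure, gap="-"):
    """Map a dot-bracket alignment structure onto one aligned sequence.

    Gap columns are dropped, and a base whose partner column is a gap
    in this sequence becomes unpaired.
    """
    # Find the partner of every paired column with a stack.
    partner = {}
    stack = []
    for pos, symbol in enumerate(aln_structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            open_pos = stack.pop()
            partner[open_pos] = pos
            partner[pos] = open_pos

    seq_chars, str_chars = [], []
    for pos, base in enumerate(aligned_seq):
        if base == gap:
            continue  # this column does not exist in the sequence
        symbol = aln_structure[pos]
        if pos in partner and aligned_seq[partner[pos]] == gap:
            symbol = "."  # partner was removed, so the base is unpaired
        seq_chars.append(base)
        str_chars.append(symbol)
    return "".join(seq_chars), "".join(str_chars)


# Assumed bracket placement; only the counts are taken from the slide.
seq, structure = project_structure("ACG--AA-U", "((.....))")
print(seq, structure)  # ACGAAU (....)
```

With this placement the inner pair is lost because its closing column is a gap in the sequence, leaving one base pair and four unpaired positions, as on the slide.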

AB-RNA-alignments-2011
