Homework Assignment 3: Chapter 3
St. Clair & Visick, Putting your skills into practice, problem 5
Tuesday, October 28
Homework Assignment 3 will be due Tuesday, November 4.
What changes are needed to construct a semi-global alignment like in the third homework assignment? The global alignment works pretty well on sequences that are nearly the same length. Let's try another example where the sequence lengths are more disparate.
$ ruby global.rb -d cgctatag cta
Dynamic programming table:
    |      |    c |    g |    c |    t |    a |    t |    a |    g |
----+------+------+------+------+------+------+------+------+------+
    |      |      |      |      |      |      |      |      |      |
    |    0 |<  -1 |<  -2 |<  -3 |<  -4 |<  -5 |<  -6 |<  -7 |<  -8 |
----+------+------+------+------+------+------+------+------+------+
    |    ^ |\     |      |\     |      |      |      |      |      |
  c |   -1 |    1 |<   0 |<  -1 |<  -2 |<  -3 |<  -4 |<  -5 |<  -6 |
----+------+------+------+------+------+------+------+------+------+
    |    ^ |    ^ |\     |\     |\     |      |\     |      |      |
  t |   -2 |    0 |    1 |<   0 |    0 |<  -1 |<  -2 |<  -3 |<  -4 |
----+------+------+------+------+------+------+------+------+------+
    |    ^ |    ^ |\   ^ |\     |\     |\     |      |\     |      |
  a |   -3 |   -1 |    0 |    1 |<   0 |    1 |<   0 |<  -1 |<  -2 |
----+------+------+------+------+------+------+------+------+------+
Alignment 1
cgctatag
__c__ta_
Alignment 2
cgctatag
c____ta_
Alignment 3
cgctatag
__ct__a_
Alignment 4
cgctatag
c__t__a_
Alignment 5
cgctatag
__cta___
Alignment 6
cgctatag
c__ta___
The 5th alignment really looks better here even though all 6 scored the same: -2. The problem is that terminal gaps are scored the same as internal gaps. If we are trying to see whether a short sequence lines up best with a similar-sized piece somewhere inside the longer sequence, internal gaps need a larger penalty than terminal gaps. If the terminal gap penalty were reduced to 0 while the other scoring stayed the same, that should give the desired result, where the 5th alignment is clearly the best with a score of 3 (three matches and no penalized gaps). Simply modifying how the global alignment program fills in the outside rows and columns of the dynamic programming table should be all that is required to do a semi-global alignment.
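The claim about the six alignments can be checked by hand or with a small helper. The sketch below is not part of global.rb; the name alignment_score and its keyword arguments are hypothetical, and the scoring (match +1, mismatch 0, internal gap -1) is an assumption inferred from the values in the table, since the text does not state it.

```ruby
# Score a gapped alignment given as two equal-length rows, with "_" as the gap.
# Assumed scoring (inferred from the table, not stated in the text):
# match +1, mismatch 0, internal gap -1.
# terminal_gap is the penalty per gap column that lies before the first
# or after the last residue of the gapped row.
def alignment_score(top, bottom, terminal_gap: -1, match: 1, mismatch: 0, gap: -1)
  raise ArgumentError, "rows differ in length" unless top.length == bottom.length

  # Range of columns covered by real residues; gaps outside it are terminal.
  # Assumes each row contains at least one residue.
  span = lambda do |row|
    (row.index(/[^_]/)..row.rindex(/[^_]/))
  end
  top_span = span.call(top)
  bottom_span = span.call(bottom)

  top.chars.zip(bottom.chars).each_with_index.sum do |(a, b), i|
    if a == "_"
      top_span.cover?(i) ? gap : terminal_gap
    elsif b == "_"
      bottom_span.cover?(i) ? gap : terminal_gap
    else
      a == b ? match : mismatch
    end
  end
end

alignments = %w[__c__ta_ c____ta_ __ct__a_ c__t__a_ __cta___ c__ta___]

# Terminal gaps cost the same as internal gaps: every alignment scores -2.
p alignments.map { |row| alignment_score("cgctatag", row) }

# Terminal gaps are free: the scores become 1, -1, 1, -1, 3, 1.
p alignments.map { |row| alignment_score("cgctatag", row, terminal_gap: 0) }
```

Under these assumptions only the 5th alignment keeps all three matches without paying for an internal gap, which is why it separates from the other five once terminal gaps are free.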
$ ruby semi-global.rb -d cgctatag cta
Dynamic programming table:
    |      |    C |    G |    C |    T |    A |    T |    A |    G |
----+------+------+------+------+------+------+------+------+------+
    |      |      |      |      |      |      |      |      |      |
    |    0 |<   0 |<   0 |<   0 |<   0 |<   0 |<   0 |<   0 |<   0 |
----+------+------+------+------+------+------+------+------+------+
    |    ^ |\     |\     |\     |\     |\     |\     |\     |\   ^ |
  C |    0 |    1 |<   0 |    1 |<   0 |    0 |    0 |    0 |    0 |
----+------+------+------+------+------+------+------+------+------+
    |    ^ |\   ^ |\     |\   ^ |\     |      |\     |\     |\   ^ |
  T |    0 |    0 |    1 |<   0 |    2 |<   1 |    1 |<   0 |    0 |
----+------+------+------+------+------+------+------+------+------+
    |    ^ |\     |\   ^ |\     |    ^ |\     |      |      |      |
  A |    0 |<   0 |<   0 |    1 |<   1 |    3 |<   3 |<   3 |<   3 |
----+------+------+------+------+------+------+------+------+------+
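The scores in the table above can be reproduced with a fill like the following. This is a minimal sketch, not the actual semi-global.rb: the method name semi_global_table and its parameters are hypothetical, the scoring (match +1, mismatch 0, internal gap -1, terminal gap 0) is inferred from the table values, and only scores are computed, not the traceback arrows.

```ruby
# Fill a semi-global dynamic programming table and return it as an array of rows.
# Rows follow the short sequence, columns follow the long one.
# Assumed scoring: match +1, mismatch 0, internal gap -1, terminal gap 0.
def semi_global_table(long, short, match: 1, mismatch: 0, gap: -1, terminal_gap: 0)
  rows = short.length
  cols = long.length
  table = Array.new(rows + 1) { Array.new(cols + 1, 0) }

  # First row and first column hold leading terminal gaps.
  (0..cols).each { |j| table[0][j] = j * terminal_gap }
  (0..rows).each { |i| table[i][0] = i * terminal_gap }

  (1..rows).each do |i|
    (1..cols).each do |j|
      diag = table[i - 1][j - 1] + (short[i - 1] == long[j - 1] ? match : mismatch)
      # A move along the last row or down the last column is a trailing terminal gap.
      left = table[i][j - 1] + (i == rows ? terminal_gap : gap)
      up   = table[i - 1][j] + (j == cols ? terminal_gap : gap)
      table[i][j] = [diag, left, up].max
    end
  end
  table
end

table = semi_global_table("cgctatag", "cta")
table.each { |row| puts row.map { |v| v.to_s.rjust(4) }.join }
puts "score: #{table.last.last}"
```

Passing terminal_gap: -1 makes every gap cost the same, which gives back the global table from the first example. The only difference between the two fills is how the outside rows and columns are scored.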
From global to local by comparing the recursion functions:
Comments on early submissions:
score = (0..last_row).inject([]) {|s,e| s << [(e == 0 ? e:e = 0) * Sigma]}
# This might work by accident but is very unclear.
# If what you want to say with the ternary operator is that the score is 0
# in the first column, then you don't need the ternary operator at all:
#   s << [0]
# I suppose you could use
#   s << [(e == 0 ? e : 0)*Sigma]
# but it is redundant.
# Putting an assignment inside a ternary operator is a bad idea.
# If you really need to allow a side effect like that, then perhaps
# you should use a short-circuit logical operator like &&.
# From the standpoint of clear, functional programming, even this is a kludge.
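One way to state the intent directly, as the comments suggest, is to build the first column without inject or a ternary. This is only a sketch: the method names are hypothetical, and sigma is assumed to be the gap penalty that the submission calls Sigma.

```ruby
# Semi-global: leading terminal gaps are free, so every row starts with 0.
# The block form creates a separate one-element array for each row.
def semi_global_first_column(last_row)
  Array.new(last_row + 1) { [0] }
end

# Global, for comparison: row i starts with i gaps, each costing sigma.
# sigma is assumed to be the gap penalty (Sigma in the submission).
def global_first_column(last_row, sigma)
  (0..last_row).map { |i| [i * sigma] }
end

p semi_global_first_column(3)   # [[0], [0], [0], [0]]
p global_first_column(3, -1)    # [[0], [-1], [-2], [-3]]
```

The block form of Array.new matters here: Array.new(last_row + 1, [0]) would make every row share one array, so appending scores to one row would change all of them.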