2. 5’- ATGAATACGGCTATCACC - 3’
DNA pattern recognized by a protein
Question: How can this DNA pattern be found elsewhere in the DNA sequence?
• To find such patterns, we use a consensus sequence.
• However, variations of the pattern exist.
3. Example of a consensus sequence for the binding site:
Bases observed at each position:
Position    1  2        3    4    5    6  7
Bases       A  A/T/G/C  T/C  A/C  T/G  C  A/T/G/C
Transformation into a matrix (example based on the analysis of 8 conserved sequences)
Count matrix of the consensus sequence:
Position    1  2  3  4  5  6  7
A           8  2  0  6  0  0  2
T           0  2  4  0  4  0  2
G           0  2  0  0  4  0  2
C           0  2  4  2  0  8  2
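The count matrix above can be turned into per-position frequencies and into the set of bases observed at each position. The following is a minimal Python sketch using the counts from the slide; the names `COUNTS`, `column_totals`, `frequencies`, and `observed_bases` are illustrative and are not taken from the original program.

```python
# Count matrix from the slide: one row per base, one column per position.
COUNTS = {
    "A": [8, 2, 0, 6, 0, 0, 2],
    "T": [0, 2, 4, 0, 4, 0, 2],
    "G": [0, 2, 0, 0, 4, 0, 2],
    "C": [0, 2, 4, 2, 0, 8, 2],
}


def column_totals(counts):
    """Number of sequences counted at each position (should be constant)."""
    width = len(next(iter(counts.values())))
    return [sum(counts[base][j] for base in counts) for j in range(width)]


def frequencies(counts):
    """Relative frequency of each base at each position."""
    totals = column_totals(counts)
    return {
        base: [c / t for c, t in zip(row, totals)]
        for base, row in counts.items()
    }


def observed_bases(counts):
    """Bases with a non-zero count at each position, in row order."""
    width = len(next(iter(counts.values())))
    return [
        "".join(base for base in counts if counts[base][j] > 0)
        for j in range(width)
    ]


print(column_totals(COUNTS))   # [8, 8, 8, 8, 8, 8, 8]
print(observed_bases(COUNTS))  # ['A', 'ATGC', 'TC', 'AC', 'TG', 'C', 'ATGC']
```

Every column sums to 8, which matches the 8 conserved sequences the example is based on.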
4. Two types of matrices give two answers to the problem: the entropy matrix and the log-odds matrix.
• Entropy matrix: count matrix → entropy matrix calculation → application of the entropy matrix to a DNA sequence → results and score calculation
• Log-odds matrix: count matrix → log-odds matrix calculation → application of the log-odds matrix to a DNA sequence → results and score calculation
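The slides do not give the formulas behind the two matrices, so the sketch below uses common textbook definitions as assumptions: Shannon entropy in bits for each column, and a log-odds value log2(p / background) with a uniform background of 0.25 and a pseudocount of 1. The function names and default parameters are hypothetical; the original program may compute these differently.

```python
import math

# Count matrix from slide 3 (8 conserved sequences, 7 positions).
COUNTS = {
    "A": [8, 2, 0, 6, 0, 0, 2],
    "T": [0, 2, 4, 0, 4, 0, 2],
    "G": [0, 2, 0, 0, 4, 0, 2],
    "C": [0, 2, 4, 2, 0, 8, 2],
}


def column_entropy(counts):
    """Shannon entropy (bits) per column: 0 = fully conserved, 2 = uniform."""
    width = len(next(iter(counts.values())))
    entropies = []
    for j in range(width):
        total = sum(counts[base][j] for base in counts)
        h = 0.0
        for base in counts:
            p = counts[base][j] / total
            if p > 0:  # convention: 0 * log(0) = 0
                h -= p * math.log2(p)
        entropies.append(h)
    return entropies


def log_odds(counts, background=0.25, pseudocount=1.0):
    """Log-odds matrix (assumed uniform background, assumed pseudocount)."""
    width = len(next(iter(counts.values())))
    matrix = {base: [] for base in counts}
    for j in range(width):
        total = sum(counts[base][j] for base in counts)
        total += pseudocount * len(counts)
        for base in counts:
            p = (counts[base][j] + pseudocount) / total
            matrix[base].append(math.log2(p / background))
    return matrix


def score(window, matrix):
    """Sum of the log-odds values of a window as wide as the matrix."""
    return sum(matrix[base][j] for j, base in enumerate(window))


entropies = column_entropy(COUNTS)
matrix = log_odds(COUNTS)
```

Under these assumptions, positions 1 and 6 (always A and always C) have an entropy of 0 bits, while positions 2 and 7 (all four bases equally frequent) reach the maximum of 2 bits. The pseudocount avoids taking the logarithm of zero for bases that were never observed at a position.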
5. Example of an analysed sequence:
Position    1  2  3  4  5  6  7  8  9 10 11 12
Nucleotide  A  T  A  C  G  C  T  A  C  G  A  T
Example of count matrix and entropy matrix (rows: A, C, G, T):
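Applying a matrix to a sequence amounts to sliding a window of the matrix width along the sequence and scoring each window. The sketch below scans the 12-nucleotide example sequence. Because the matrix values for this example are not shown, it reuses the slide-3 count matrix as a stand-in, with the same assumed log-odds definition (uniform background of 0.25, pseudocount of 1). The `scan` function and its `threshold` parameter are hypothetical names.

```python
import math

SEQUENCE = "ATACGCTACGAT"  # example sequence, positions 1 to 12

# Stand-in count matrix (from slide 3); an assumption for this example.
COUNTS = {
    "A": [8, 2, 0, 6, 0, 0, 2],
    "T": [0, 2, 4, 0, 4, 0, 2],
    "G": [0, 2, 0, 0, 4, 0, 2],
    "C": [0, 2, 4, 2, 0, 8, 2],
}


def log_odds(counts, background=0.25, pseudocount=1.0):
    """Log-odds matrix under an assumed uniform background and pseudocount."""
    width = len(next(iter(counts.values())))
    matrix = {base: [] for base in counts}
    for j in range(width):
        total = sum(counts[base][j] for base in counts)
        total += pseudocount * len(counts)
        for base in counts:
            p = (counts[base][j] + pseudocount) / total
            matrix[base].append(math.log2(p / background))
    return matrix


def scan(sequence, matrix, threshold):
    """Return (1-based start, window, score) for windows scoring >= threshold."""
    width = len(next(iter(matrix.values())))
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        s = sum(matrix[base][j] for j, base in enumerate(window))
        if s >= threshold:
            hits.append((start + 1, window, s))
    return hits


matrix = log_odds(COUNTS)
# With no effective threshold, every window is reported.
for position, window, s in scan(SEQUENCE, matrix, float("-inf")):
    print(position, window, round(s, 3))
```

A sequence of length 12 scanned with a matrix of width 7 yields 12 - 7 + 1 = 6 windows, starting at positions 1 to 6. Raising `threshold` keeps only the windows that resemble the pattern most closely.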
7. Limits and Improvements:
• Time cost of importing a sequence: the sequence is read twice:
  • first to convert all letters to upper case
  • then to check the quality of the sequence
• When entering the threshold value, the user can enter any type of character.
• The analysis time is long:
  • The longer the sequence, the longer the analysis.
  • The less powerful the processor, the longer the computation.
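Two of the limits listed here, the double read on import and the unchecked threshold input, could be addressed roughly as follows. This is a sketch under assumptions: it treats the quality check as "only A, C, G, T are allowed" and skips whitespace, neither of which is stated in the slides, and the function names `load_sequence` and `parse_threshold` are hypothetical.

```python
import math

ALLOWED = frozenset("ACGT")  # assumed alphabet for the quality check


def load_sequence(raw):
    """Upper-case and validate a sequence in a single pass over the input."""
    cleaned = []
    for index, char in enumerate(raw):
        if char.isspace():  # assumption: ignore line breaks and spaces
            continue
        upper = char.upper()
        if upper not in ALLOWED:
            raise ValueError(
                f"invalid character {char!r} at input offset {index}"
            )
        cleaned.append(upper)
    return "".join(cleaned)


def parse_threshold(text):
    """Convert user input to a finite float, rejecting anything else."""
    try:
        value = float(text.strip())
    except ValueError:
        raise ValueError(f"threshold must be a number, got {text!r}") from None
    if not math.isfinite(value):
        raise ValueError(f"threshold must be finite, got {text!r}")
    return value


print(load_sequence("atacgc\ntacgat"))  # ATACGCTACGAT
print(parse_threshold(" 2.5 "))         # 2.5
```

Converting and checking each character in the same loop removes the second read of the sequence. The threshold parser rejects non-numeric input, as well as `nan` and `inf`, which `float` would otherwise accept.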