Show Me the CpG Islands!

  • Better title: A Statistical Vacation to the CpG Islands in Summer 2005. New title: Show Me the CpG Islands (with Statistical Significance)

    1. 1. Show Me the CpG Islands! (With Statistical Significance) <ul><li>Alicia Laughton (Mathematics ’06) </li></ul><ul><li>Jessica Minnier (Mathematics ’07) </li></ul><ul><li>Guided by Yung-Pin Chen (Mathematics/Statistics) </li></ul>
    2. 2. This work is funded by the John S. Rogers Science Research Program
    3. 3. Outline <ul><li>DNA Overview </li></ul><ul><li>CpG Islands </li></ul><ul><li>Methods </li></ul><ul><ul><li>Traditional Method </li></ul></ul><ul><ul><li>Our Method </li></ul></ul><ul><li>Future plans </li></ul>
    4. 4. DNA <ul><li>Deoxyribonucleic acid </li></ul><ul><li>Double-helix </li></ul><ul><li>Chain of nucleotide subunits </li></ul><ul><li>Contains genetic information </li></ul>
    5. 5. Nucleotides <ul><li>Each is made up of a sugar, a phosphate group, and a base </li></ul><ul><li>Four bases </li></ul><ul><ul><li>Adenine (A) </li></ul></ul><ul><ul><li>Cytosine (C) </li></ul></ul><ul><ul><li>Guanine (G) </li></ul></ul><ul><ul><li>Thymine (T) </li></ul></ul><ul><li>CpG represents a C directly followed by a G in the DNA sequence </li></ul>
    6. 6. Methylation <ul><li>A methylated C tends to mutate (by deamination) into T </li></ul><ul><li>Accounts for the low occurrence of CpG dinucleotides in vertebrates </li></ul><ul><ul><li>Expected frequency is 6.25% (1/16) if the four bases occurred uniformly at random </li></ul></ul><ul><ul><li>Observed frequency is 1% of the total sequence (Bird 1986) </li></ul></ul>
    7. 7. Sequence AL031723 <ul><li>Human DNA sequence on chromosome 16 </li></ul><ul><li>3 known CpG Islands </li></ul><ul><li>Percentage of Content: </li></ul><ul><ul><li>A - 22.7% </li></ul></ul><ul><ul><li>C - 29.5% </li></ul></ul><ul><ul><li>G - 28.3% </li></ul></ul><ul><ul><li>T - 19.5% </li></ul></ul><ul><ul><li>CpG - 3.1% </li></ul></ul>
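As a rough check on the numbers from slides 6 and 7, the sketch below compares the CpG frequency one would expect by chance with the 3.1% reported for AL031723. It assumes that neighbouring bases are independent, which is a simplification, and all variable names are illustrative rather than taken from the talk.

```python
# Base composition of AL031723 as reported on slide 7 (as fractions).
freq = {"A": 0.227, "C": 0.295, "G": 0.283, "T": 0.195}
observed_cpg = 0.031  # CpG fraction reported on slide 7

# With four equally likely bases, any dinucleotide has probability 1/16.
uniform_expected = 0.25 * 0.25

# Using this sequence's own composition (assumption: independent neighbours),
# the expected CpG fraction is P(C) * P(G).
composition_expected = freq["C"] * freq["G"]

# How far the observed CpG fraction falls below the composition-based one.
depletion_ratio = observed_cpg / composition_expected

print(f"uniform expectation:     {uniform_expected:.4f}")
print(f"composition expectation: {composition_expected:.4f}")
print(f"observed / expected:     {depletion_ratio:.2f}")
```

Under these assumptions the sequence as a whole still shows fewer CpGs than its C and G content alone would suggest, which is the background against which islands stand out.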
    8. 8. CpG Islands <ul><li>“regions of DNA with a high G + C content and a high frequency of CpG dinucleotides relative to the bulk genome” (Gardiner-Garden and Frommer, 1987) </li></ul>
    9. 9. CpG islands & Genes [Diagram: CpG islands (CpGi) shown at the 5’ end of a gene, in the gene promoter, in the gene body, and at the 3’ end of a gene]
    10. 10. What is important about CpG Islands? <ul><li>Useful in identifying protein-coding regions (Yoon and Vaidyanathan, 2004) </li></ul><ul><ul><li>Associated with “housekeeping genes” and 40% of tissue-specific genes </li></ul></ul><ul><li>Aberrant methylation of CpG sites may cause silencing of tumor-suppressor genes (Deng, Zhou et al., 2002) </li></ul>
    11. 11. aggcccgcccgcctagatcacttgaggtcacccgttcactcagtggctgacagcatcccctaaatcagcccttcaccaattattgacagtgtgtcctcaaccaaaagtagtcctccctgctccctccctcccctgatgtaattacatctcttcccatctttatttattttttgagacggagtcttgctctgtcacccaggctggagtgtagtggtgcaatctcggctcactgcacctctgcctcccgggttcaagcgattttcctgcctcagcccccggagtagctgggattacaggtgcccgccaccacacccagctaatttttgtatttttagtagagacggggtttcgccatgttggccaggctggtctcgaactcctgacctcaggtgatccgcctgcctcagcctcccaaagtgctgggattacagacgtgagccactgcggctggcctctctccccgtctttaactgtagccctgtgaattctcatcagcctgggcctggactcagcaggccaaaaagttaccagcagagcccagcacatgtgaggaaagtcggagacgtggcggcgccggccggaggatccttcccaagaccctgggccgctgtggccccctagatcttgcaggttgccagggtgccaggccagggagggggcctttctgagattctcctcattctgacacaggagaggagggcactgacccagtcccaaggtcccgggggaatcagccgaccacagcccaggactgtcccacctgggcagagagcccattctgggtgcccagcccgggcaggcccaggcacccccagcagtgccccgggcagcacctgccagccaggtagtgcagggtgaggttgggcagggcagggcgtggtaggtcagctgagcaaacagctcggagggagagctggggagggctgggaactaggtcgatagaaacacagggactgtgttagggaggggatgccttgccagtcacgcccagccctgactcctgccctctgagggggcttcccccacccctgctgacagccccaggaccggcccctgccaggaggctgacctgccaggagtgaccgccccagacttgagcccttgggaggcaggttctgagtccccttttcctgctcagacccccagggaaacgcaggctgggccagaggcagctgcacagacccctgcagtggggtgctcggtggagagcgctggaggtgggagggaggatgtgtgaggcagcgggagagaatccaggcttcccccacaacacccaccatgagcggtgcagagtaggggtgggcggcacgggagccttcccaccccgcagaaccaggccctgggcagagctggcctacagacgataccggacaagtcctcctccgtcttggtgacagagggagctgggactccctccacccacccactgccacttcagaagcagccacagggagactgggaggggcaggggtgctggggatgagcgtggggctcagccctccctcttcccaccctggagggctgcctccttccagcccacctggaagggtggtgtcagtcccagagcccctgcactccccgccccacctcctgcagctggaacccgcgtgggagccgcacccagcgtcccagggacaaacacagaggccttgggtggtggcggtaccaaggtctgaggcctggcagctcaggggcacccccgtccctgagagaggtcaagaaggggaggcaccaccccccaccacgggacctcgctgacgatgcccatagagagaaaccaggccagtgctgggaggggaaagaccccaggcctcatgagaagtcactgcctgcttttcccctcggccaggaaggaagccccaggcccttccctcccgtctcgggcatactgaccccaggcaccaagcgagaccaggagcccacccctttcctttcccagatggcacaccagtgactctgaatat 
cggagcgcacccctgctccctgggaggcaggatatcgtgccgctgctccctggggcgcacgataccctccccaggaaggcgccggtcagggcggacgggccagggtgctcaccggtaccaggcgaggccgcgctcgtagcacctgtcgaagaagtggggctcagagcccagcgcgcggacgtcggggtgcagccgcagaaactccagcagggcgcgcgtgccgcccttcttcacgccaacgatgagcgcttgcgggaagcgccggcggccgggaccgctggccaaaggcaggccgggtgctcccgggcggtggacggagctggacggctcggagggcgcgggggccggcgcgggggcgcgggcggccggcgggcagcggccggggagggcgcagaggcagtaggcgccgagcaccagggccacgagcagcatcggcgcgcgggacgcccgcagagcggccccttgcccggcccctgcgccctggccgcccccggccccgccgcccaggccgccgctacctgccatggggtcgcgccgctccaggcccgggagcgggggcagcaggcgggcgcgcatctcggcccgcgcgccgctcagtccgtgggtgcccggcttgtgctctgcgcccggcggtcccgcagcctgggagcgggcgcggggcgggaccgggggcggggtctggacgccctcccccctccccctcccccgcccactccgcctccgaggccactgcctgggctggacccgccggcagccgccaccacccgggcgcgactcgagctgccgggaccaccaggacgctcctgctccgagatcccaggccctggctcgcttgactccggcatcttcacctctgcgcggggaggatgcggcggcggtggccgttcgggacgcagggcagggacagggcggcgcgcgggcctcgggaccctctgtttgaagaccgatccccttccccccccaccccactccgggacgtgcgcggcaggtgcataggccaagccttggcctgcaggagcgggagcctcatcgccaggccaaggggacccaggaaaagcgtcgatccgggcactcggcctgccaagggagaaagaggccgggacagcaccctagtgtgcagagagggatcccagaacgtgtggggggagtctgcggccgggaatggcgtgcg 
cctcctcttcctgcctgctggagggaccagcaccaaaacaggaaagttcaccctgccaggccttctctccaaagagtcagagggagctccgtagggggatggggttcccggaccccctgccgtggaaggggagtgggaacacagacaggcggcaagggctttcgaggccccctcttgcacaaaccagctcagagatcggagatctttgggatcaattactttccctccccaggcatccgaagcctatcctagcccaggtgtggatgagggtgggagagacgggggaggagggagaggagcaggactggacccccgtgtgacaaacatctgacaagttgctctgaggactgcccccctccttgtggagcccacctcatctggtgtgcatttccctgcggctttcatccagccctgggcgaccctccctcctccatctcagcctccctcctcctgccccacacctcaggcctgggactcgcagatgccaaaagggcctggcagatgccaaagccagaaagtgcagggggactgcatcccccacaggagaccgggttcttccccactacatactcagaccccactccctgcacccactgctcttgcaaaccaggaactaaggggttcccctacccaccccgctccttgcctcctcttgcttttcttttgttttgtttgtttttgagacagagctgcactccagctgactcttgtcgcccaggctggagtgcagtggcacaatctcagctcactacaacctctgcctcccgggttcaagcgattctcctgcctcagcctcccaagtagctgggaatacaggcacccatcaccacgcctggctaatttttgtatttttagtagagatggggtttcaccatgttagtcaggctggtctcaaactcctgacctcaggtaatctgcccacctcagcctcccaaagggctgggattacaggcgtgagccactgtgccccaccctcctcttgcttttctaaaagatgatggtcaaagtacagcccccatttgcccccagacagggcacccttcccagatcgagaccttggggagtctgcgtgacccccacacctggcagacacaggtgcttcactagtgggggaacggctgagcatgtgctgagctcgggggcactagtgggctacagtccccaagtgggaggcccctcaagagcctggatgagctgactgacggtggagaggagggaaggagggcctatggccaaagtcaatccaggacccaactgccgaggccacaggaaggccgggtcaccgcctggaactaggtcggtcacagcccagtgggagccgtggcccggagactcaactgggggccctggttactctgctcgcctccccgcgtcggcacccagaacagagcttgcaggcactgggggcccagtccagggtctcaagagcagacaatgctgccttgcagttggggaaactgagacagggtgagaactttcagaggctcattgcaggctcctagcaggctgaaaggacggaggcacaggcacctaggagcacaccagccccacgtggccacggcccctcggagagcatgaggacacttgcaatgcggaagctcagcaggcccagctctactggctctgcaccgcccagtgaggggtcagcacagttggtccaagggacaataccagattaatgaggcagaagccacgggactgaccccttggaattctccacacccacactgtgcatccttaacccaaagcttctagcttggtagcccctcctaccctcctccctgcagcagggattagggatgcattctgacccctgcctgccgtcaggggagtgaggtctctccctggagcctgagctgaggatgcccaattcagccaggtgagccccgggatggactccatgtcccctagccaccacctgacttccccagcaccccacactggcaccagcccttcagatctcagaagcgagccaccctattctcacggagccccttcct
gcctgccctccaaacccaagagtagttttagtacaaaaggcaaagttaacaaataggggtaggcgtcagggaaggaagaggatcagaggatcgggaacggagaaactggagcacctggagaagcgtctgggtcctgccacccccactgactccccaactggccttgggcagggtcctctctgcaggcgctgggtccaagcttggggatgagcagccaccagcgcgggctgcttcagctgaggctgccgcacccccacgtccatcctgggtagaggcaggacagccacagagccccatgcacggggctggactcaccctgggcactcacctaaaggcagtctcctcctttccaaagcccagactttctccggactcccaggaccaccaacaagggttcctgtgcgcagactcgggggtcttggggaggaaggacgctttctaggtggctgcctggaacctggaggcccctttctacagtacctggccagcggtcggtcacacctgagtgcccagagtgagcgggcggcagaggcatttctgacgctgccaggtaatcccacgggctggaaacgacctctgggctgggaagccaccgcctcccccagtcctgctgggtccctcagcagagagaacggaaccggggctttccccacagttttcaaagtttcagggaatcctagccaagtatcattccttcttccggagccgggaccccaggtcaagcctggggcccccacagggcggtcccaaccccactgcccggagcgcacccctgctccctgggaggcaggatatcgtgccgctgctccctggggcgcacgataccctccccaggaaggcgccggtcagggcggacgggccagggtgctcaccggtaccaggcgaggccgcgctcgtagcacctgtcgaagaagtggggctcagagcccagcgcgcggacgtcggggtgcagccgcagaaactccagcagggcgcgcgtgccgcccttcttcacgccaacgatgagcgcttgcgggaagcgccggcggccgggaccgctggcgtttccctcccaggggcccagtggtgaactgaattcaggcctgagacatactctgtctactaagtcaccccatctgcccagccttggtccacctggcactgcccagagacatcagtgatgcatttcggaagctggcaaagtggaccccactggagtacaaaggactcagggacccctgtgctggggaagagaaggagcccaggacctcccccaggggctgcctctgaggggcgtgagattcaggggcctctcgggtgggacctgcgggggccgctagacactgcgggaacttcacatccccaacgcccagcagcagcctgcagggaaggcaggggaggcgagccgggctcagagagggcgagcaacttgccccatccgaaggcaaaggtggtatgagacccgggtcctctctccacctctgccccagccttcctggccacagggctggcgccaggcaggcacggcacaggctcccggcagaggccacggtctcagccatccccacggtctcaggagtccccacggtctcagccgtccccacggtctgagtccccacggtctcagctgttcccacggtctcaggagtccccacaggttcagcagtccccacggtctcagccatccccacggtctcagccgtccccacagtctcagccatccccacggtctcagcagtccctactcaggacttgaaattccagcactggttccgtgatggctcctccagccccctgcccagcccagcatggtcatttccatctcctggcctttccgctgccgtctctctgctggatgctttatccttagtccccgctgagggcagaaggactttccaggaggaattgaccagaacgcagaacagcaggatgtggaatggactggggacagggagagagagatgcagggaccaggagtcggctcggagggttctcctggaagctgacccctccctccatca
ggcactcggctgacggtggctacacacctcggggcgcccaggatggcagcactggggctgttcattcaccagtggatccccagcacctaacagagcctggcacgcagtggacattccattaatgtcgctcagtggaagggtatacgtgggaggagaggtcgggaaggctttctggaggtgacggccaggtgaagacgaggagaacagcattccaggccaaggaaccgtgtgggtgaaggctcagcagcagagagcccgggcagtagaggatggggtggagcttaaggccctgcgggaacaggggcggggcttagagtctggcctgaggctggtccagccccgcctcctcctcaggctcccaccaactctgagccaccagaccctcctttgtaaaatgaagacctcagtcatgactcgcatgagtctctgaagagtaacagctttattgtgatgtaattcacacaccactcaatccagccatttgtcgcatgcaaatcaatggttttcagtatattcatagtcgtgcaatcacaatcaattttagaacatttctatcaccccaaaaagaaatcctgtgtccattagcaatgacgccctcttctccccttcccacagcccctggcaaccacgaatctactttctgtctctatgggtttgcctattctggacatttcacaaaaagagaatcattgcttgaagccaggagttcaagaccaacctgggcaacaaagcgagaaccccgtctgtacaaaatattttaaatttagccaggcacagtggcgcacaccagtagtcccagcactttggaagtctgaggcaggaggttcacttgaggcggggaattcaaaaccagcctgggcaacatagggagtaccagtctctacaaaaaatttcaaaatttgccaagcgtgatggtatgcacctatagtcctagcttactcaggaggctgaggtgggaggatcgcttgagcccaggagtacgaggctgcagtgagccatgatcataccactgcattccagcctgggcgacagagtgagagcccatctctaaaacagaaagaaagaaagaaagaaatatggccagtcacagtggctcatgcctgtaatcccagcattttgggaggccaaggcaggtggatcacttgaggtcaggagttcgagaccagcctggccaacatggtgaaaccctgtctctaccaaaaatacataaattagccaggtgtgggccaggcgccatggcttacacttgtaatcccagcactttgggaggccgaggtgggcagatcacctgaggttgagagttcgagaccagcctgaccaacatgaagaaaccctgtctctactaaaaatacaaaaaattagctgggtgtggtggtgcatgcctgtaatctcagctacttgggaggctgaggaaggagaatggcttgaacccgggaggcagaggttgtggtgagccgagatcgcgcgattgcactccagcctgggcaacaacagcaaaactccatctcaaataataataataataaattagccaggtgtggtggtgcacgcctgtagtcccagctactcgggaggctgaggcacaagaaacccttgaacccgggaggcagaggttgcagtgaagctgaaattgcaccattccactccagcctgggagacagagtgagacaccatctctaaaatgaaaaaaaaaaaagagaatcatacaatgttcgtccttttgtgtctgggtctcttactcagcatgttctccaggttcatcaacactgtggcatgtgccagtacctccttcctcttcctgactgagtaatactccatcgtatggatggaccaccttttgttgattccctcattcgttgatggacatctaggttgtttccactgcggggttcttagtaacggtattacagggaaccatagattaccaggtatt How do you locate the CpG island in a DNA sequence?
    12. 12. <ul><li>agaattgcttgaaccgggaggcggaggttgcaatgagctgagatcacaccactgcactccagcatggtgacagagcaagactccatctcaaatcgagtaaaa aaaaaaaaatagctgggtgcggtggctcacgcctgtaatcccagcactttgggaggctgaggcgggtagatcacgaggtcaggagatcgtagccatcctggctaacacggtgaaaccccgtctctactaaaaatacaaaaagaaattagctgggcgtggtggtgggcgcctgtagtcccagctactcgggaggctgaggcaggagaatggcgtgaacccgggaggtggagcttgcagtgagtcgagatcacgccactgcactccagcctgggcgacagagcgagactcgatctcaaaaaaaaaaaaaaaaaaaaagtgcgacacgaggcacacagtcagtgcccagtggagttcgctgatatggttaccacatccctggggacagcgcctccaccctccaacctcgaggtttgtggaaaaatctgggtccaagctttatttcttaaatattcctctctgcccagcatgtgcacgcagcccgctctggccaggcgagcgggtgtcaatcaaggtgctgagcatccccagggtgccgctcagccccagccgaagtcctggcccgtcatctggtagaacctgcggttgaagggccggtagaactcctgcaggcgccggaccagggcctggggcacgcgtgggtgtggccggcccttggacttgcccaggcagcggggacggctgccgccctgggccttcttgaggcaggggaagcccttggtggcgttgaagtagaagtgcttgtccgtgacgacccgtttcaggcccaggaagtcctgcacgcggccgacctctccggccgggtcgctgaccagacgctccccgctgacgaacaggaagtgggacagggggaagtagcgcagccagtggtccaggtgctgggcgtacaggccgatgcggacggcgctccaggctgtgtccacggggcccaggccgtggcggaaggccagggcgcggaagctgggcaggcccggggtcttggagagcgtctgggcgtagtcggagatggcccgggtcacggggttccgcaccaccacgatcagcttcgtgtccggggacatggcgtggatgcggcggggggcctctcgcgtcacgaagtagctgggggtcttctccatggtgatctgcccatccagggttcggggcatcagactcctgcgggacgggtgcaaggagagggggcctgagcctccccagccctagaccggcccccaggggcccgggaccaaggcccccttatgcccgggaagcccaggcctccagggcgagcaagtcttcctccctgctcgggcccacccctgctagcgtgcgcggctgggcagcctggaacatggactgtgagggtgcccagcccggcacctgcctgcagcccggcctgttccgccggcctgccccgcctgctgctgcactgaggattagggtgacggtcgctggtcgggaggcccaaatgctcctcaccacccacatatcttccctgtgcaatccctgccgtcctcgcttccagagccagctccctcccaccggacccacactttcctggaactaggctgcccccagctcctttctcatcccagaccaagtaccccgaggcccgcccgcctagatcacttgaggtcacccgttcactcagtggctgacagcatcccctaaatcagcccttcaccaattattgacagtgtgtcctcaaccaaaagtagtcctccctgctccctccctcccctgatgtaattacatctcttcccatctttatttattttttg </li></ul>C+G Content: 0.492 Observed/Expected: 0.548
    13. 13. <ul><li>[AL031723 excerpt repeated from slide 12, with the highlighted window shifted one base to the right] </li></ul>C+G Content: 0.501 Observed/Expected: 0.568
    14. 14. <ul><li>[Same excerpt, window shifted one more base to the right] </li></ul>C+G Content: 0.500 Observed/Expected: 0.560
    15. 15. <ul><li>[Same excerpt, 200 steps later] </li></ul>C+G Content: 0.712 Observed/Expected: 0.604
    16. 16. <ul><li>[Same excerpt, 600 steps later] </li></ul>C+G Content: 0.598 Observed/Expected: 0.421
    17. 17. Just a couple formulas… <ul><li>G+C content = ((# of C’s) + (# of G’s)) / (length of window) </li></ul><ul><li>Obs/Exp ratio = (Observed # of CpGs) / (Expected # of CpGs) = (# of CpGs in window) / [(# of C’s) × (# of G’s) / length], where the C and G counts and the length are all taken from the window </li></ul>
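The two window statistics on slide 17 can be sketched in a few lines of Python. This is only an illustration of the formulas as written on the slide; the function names are made up here, and returning 0.0 when the window contains no C or no G is an assumption, since the slide does not say how that case is handled.

```python
def gc_content(window: str) -> float:
    """Fraction of bases in the window that are C or G."""
    w = window.lower()
    return (w.count("c") + w.count("g")) / len(w)


def obs_exp_ratio(window: str) -> float:
    """Observed CpG count divided by (#C * #G / window length)."""
    w = window.lower()
    observed = w.count("cg")  # "cg" cannot overlap itself, so count() is exact
    expected = w.count("c") * w.count("g") / len(w)
    # Assumption: report 0.0 when no CpG is possible (no C or no G present).
    return observed / expected if expected > 0 else 0.0


# Tiny worked example: 3 C's, 3 G's, 2 CpGs in 8 bases.
example = "ccggcgat"
print(gc_content(example))      # 6 / 8
print(obs_exp_ratio(example))   # 2 / (3 * 3 / 8)
```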
    18. 18. Traditional Methods <ul><li>Gardiner-Garden and Frommer (1987) </li></ul><ul><ul><li>Window size 100 bp and shift size 1 bp </li></ul></ul><ul><ul><li>Criteria </li></ul></ul><ul><ul><ul><li>At least 200 base pairs </li></ul></ul></ul><ul><ul><ul><li>G + C content greater than 50% </li></ul></ul></ul><ul><ul><ul><li>Expected portion of the Obs/Exp ratio calculated over the window </li></ul></ul></ul><ul><ul><ul><li>Obs/Exp ratio greater than 0.6 </li></ul></ul></ul><ul><li>Takai and Jones (2002) </li></ul><ul><ul><li>Window size 200 bp and shift size 1 bp </li></ul></ul><ul><ul><li>Criteria </li></ul></ul><ul><ul><ul><li>At least 500 base pairs </li></ul></ul></ul><ul><ul><ul><li>At least 7 CpG dinucleotides in a 200 base pair sequence </li></ul></ul></ul><ul><ul><ul><li>G + C content greater than 55% </li></ul></ul></ul><ul><ul><ul><li>Obs/Exp ratio calculated in the same fashion as in the method above </li></ul></ul></ul><ul><ul><ul><li>Obs/Exp ratio greater than 0.65 </li></ul></ul></ul>
    19. 19. The Traditional Method [Plot for sequence AL031723: C+G content and Obs/Exp ratio versus base position]
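A minimal sketch of a sliding-window scan using the Gardiner-Garden and Frommer thresholds from slide 18 might look as follows. The rule for merging consecutive passing windows into one region is an assumption made for illustration (the slides do not spell it out, and published implementations differ), and `find_islands` is a hypothetical name.

```python
def find_islands(seq, window=100, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Return candidate regions as (start, end) pairs, 0-based, end-exclusive."""
    seq = seq.lower()
    islands, start, end = [], None, None
    for i in range(len(seq) - window + 1):      # shift size of 1 bp
        w = seq[i:i + window]
        c, g = w.count("c"), w.count("g")
        gc = (c + g) / window
        expected = c * g / window               # expectation from the window
        ratio = w.count("cg") / expected if expected else 0.0
        if gc > min_gc and ratio > min_ratio:
            if start is None:
                start = i
            end = i + window                    # assumption: region spans all passing windows
        elif start is not None:
            if end - start >= min_len:          # keep only regions of at least min_len bp
                islands.append((start, end))
            start = None
    if start is not None and end - start >= min_len:
        islands.append((start, end))
    return islands


# Synthetic example: a CpG-rich stretch flanked by AT repeats.
toy = "at" * 150 + "cg" * 150 + "at" * 150
print(find_islands(toy))
```

Recounting each window from scratch keeps the sketch short; a running count updated as the window slides would be the usual choice for long sequences.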
    20. 20. Our Method <ul><li>Modifying the traditional methods </li></ul><ul><ul><li>Window size 200 bp and shift size 1 bp </li></ul></ul><ul><ul><li>Expected portion of the Obs/Exp ratio is based on the whole sequence </li></ul></ul><ul><li>Obs/Exp ratio = (Observed # of CpGs) / (Expected # of CpGs) = (# of CpGs in window) / [(# of C’s) × (# of G’s) / length], where the C and G counts and the length are taken from the entire sequence </li></ul><ul><li>And… </li></ul>
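The modified ratio from slide 20 differs from the traditional one only in where the expected term comes from. The sketch below computes that term once from the entire sequence and reuses it for every 200 bp window; the function names are illustrative, and returning zeros when the sequence has no C or no G is an assumption.

```python
def whole_sequence_expected(seq: str) -> float:
    """(#C * #G) / length, computed over the entire sequence."""
    s = seq.lower()
    return s.count("c") * s.count("g") / len(s)


def modified_obs_exp(seq: str, window: int = 200) -> list:
    """Per-window CpG count divided by the whole-sequence expectation."""
    s = seq.lower()
    expected = whole_sequence_expected(s)
    n_windows = len(s) - window + 1
    if expected == 0:                 # assumption: no C or no G means ratio 0
        return [0.0] * n_windows
    return [s[i:i + window].count("cg") / expected for i in range(n_windows)]


# Toy input: 400 bases with a CpG every four positions.
ratios = modified_obs_exp("acgt" * 100)
print(len(ratios), ratios[0])
```

Because the denominator grows with the length of the full sequence, the resulting values are small on a real sequence, which fits the magnitudes reported on the later slides.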
    21. 21. Sequence AL031723 <ul><li>Cutoffs: values greater than the 97th percentile of the observed sequence </li></ul><ul><li>Obs/Exp ratio: mean 0.0018, standard deviation 0.0014, 97th percentile 0.0058 </li></ul><ul><li>G+C content: mean 0.5815, standard deviation 0.0818, 97th percentile 0.7350 </li></ul>[Histograms: number of observations versus Obs/Exp ratio and versus G+C content]
    22. 22. Kullback-Leibler Divergence <ul><li>KL(p) = p ln(p/0.03) + (1-p) ln((1-p)/(1-0.03)) </li></ul>[Plot: KL divergence versus p]
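One way to turn the 97th-percentile rule on slide 21 into code is sketched below. The slides do not say how the percentile was computed, so the nearest-rank definition used here is an assumption, and both function names are hypothetical.

```python
import math


def percentile_cutoff(values, pct=97.0):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def flag_windows(values, pct=97.0):
    """Indices of windows whose statistic is strictly above the cutoff."""
    cutoff = percentile_cutoff(values, pct)
    return [i for i, v in enumerate(values) if v > cutoff]


# Toy data: the statistics 1..100, one per window.
stats = list(range(1, 101))
print(percentile_cutoff(stats))   # cutoff value
print(flag_windows(stats))        # windows above the cutoff
```

The same two functions could be applied separately to the per-window Obs/Exp ratios and G+C contents, so that each statistic gets a cutoff drawn from the sequence being analysed.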
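The expression on slide 22 is the Kullback-Leibler divergence between two Bernoulli distributions with success probabilities p and 0.03. A small sketch follows; the function name is illustrative, and treating 0·ln 0 as 0 at the endpoints is the usual convention rather than something stated on the slide.

```python
import math


def kl_bernoulli(p: float, q: float = 0.03) -> float:
    """p ln(p/q) + (1-p) ln((1-p)/(1-q)), using natural logarithms."""
    if not 0.0 <= p <= 1.0 or not 0.0 < q < 1.0:
        raise ValueError("need 0 <= p <= 1 and 0 < q < 1")
    total = 0.0
    if p > 0.0:                       # convention: 0 * ln(0) = 0
        total += p * math.log(p / q)
    if p < 1.0:
        total += (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return total


for p in (0.0, 0.03, 0.1, 0.5, 1.0):
    print(p, round(kl_bernoulli(p), 4))
```

The divergence is zero when p equals 0.03 and grows as p moves away from it in either direction.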
    23. 23. <ul><li>agaattgcttgaaccgggaggcggaggttgcaatgagctgagatcacaccactgcactccagcatggtgacagagcaagactccatctcaaatcgagtaaaaaaaaaaaaatagctgggtgcggtggctcacgcctgtaatcccagcactttgggaggctgaggcgggtagatcacgaggtcaggagatcgtagccatcctggctaacacggtga aaccccgtctctactaaaaatacaaaaagaaattagctgggcgtggtggtgggcgcctgtagtcccagctactcgggaggctgaggcaggagaatggcgtgaacccgggaggtggagcttgcagtgagtcgagatcacgccactgcactccagcctgggcgacagagcgagactcgatctcaaaaaaaaaaaaaaaaaaaaagtgcgacacgaggcacacagtcagtgcccagtggagttcgctgatatggttaccacatccctggggacagcgcctccaccctccaacctcgaggtttgtggaaaaatctgggtccaagctttatttcttaaatattcctctctgcccagcatgtgcacgcagcccgctctggccaggcgagcgggtgtcaatcaaggtgctgagcatccccagggtgccgctcagccccagccgaagtcctggcccgtcatctggtagaacctgcggttgaagggccggtagaactcctgcaggcgccggaccagggcctggggcacgcgtgggtgtggccggcccttggacttgcccaggcagcggggacggctgccgccctgggccttcttgaggcaggggaagcccttggtggcgttgaagtagaagtgcttgtccgtgacgacccgtttcaggcccaggaagtcctgcacgcggccgacctctccggccgggtcgctgaccagacgctccccgctgacgaacaggaagtgggacagggggaagtagcgcagccagtggtccaggtgctgggcgtacaggccgatgcggacggcgctccaggctgtgtccacggggcccaggccgtggcggaaggccagggcgcggaagctgggcaggcccggggtcttggagagcgtctgggcgtagtcggagatggcccgggtcacggggttccgcaccaccacgatcagcttcgtgtccggggacatggcgtggatgcggcggggggcctctcgcgtcacgaagtagctgggggtcttctccatggtgatctgcccatccagggttcggggcatcagactcctgcgggacgggtgcaaggagagggggcctgagcctccccagccctagaccggcccccaggggcccgggaccaaggcccccttatgcccgggaagcccaggcctccagggcgagcaagtcttcctccctgctcgggcccacccctgctagcgtgcgcggctgggcagcctggaacatggactgtgagggtgcccagcccggcacctgcctgcagcccggcctgttccgccggcctgccccgcctgctgctgcactgaggattagggtgacggtcgctggtcgggaggcccaaatgctcctcaccacccacatatcttccctgtgcaatccctgccgtcctcgcttccagagccagctccctcccaccggacccacactttcctggaactaggctgcccccagctcctttctcatcccagaccaagtaccccgaggcccgcccgcctagatcacttgaggtcacccgttcactcagtggctgacagcatcccctaaatcagcccttcaccaattattgacagtgtgtcctcaaccaaaagtagtcctccctgctccctccctcccctgatgtaattacatctcttcccatctttatttattttttg </li></ul>Kullback-Leibler: 0.508 Our Obs/Exp: 0.0029 C+G: 0.492
    24. 24. <ul><li>[AL031723 excerpt repeated from slide 23, with the highlighted window shifted one base to the right] </li></ul>Kullback-Leibler: 0.509 Our Obs/Exp: 0.0030 C+G: 0.501
    25. 25. <ul><li>[Same excerpt, window shifted one more base to the right] </li></ul>Kullback-Leibler: 0.507 Our Obs/Exp: 0.0029 C+G: 0.500
    26. 26. <ul><li>[Same excerpt, 200 steps later] </li></ul>Kullback-Leibler: 0.520 Our Obs/Exp: 0.0033 C+G: 0.712
    27. 27. <ul><li>[Same excerpt, 600 steps later] </li></ul>Kullback-Leibler: 0.510 Our Obs/Exp: 0.0030 C+G: 0.598
    28. 28. Our Method [Plot for sequence AL031723: Kullback-Leibler divergence (×12), Obs/Exp ratio (×160), and C+G content versus base position]
    29. 29. Comparison of AL031723 [Plots: Traditional Method and Our Method]
    30. 30. Comparison of AL031723: Traditional Method <ul><li>Possible CpG Islands: 3878-4534, 5849-6136, 6541-6820, 8479-8698, 10745-11049, 18435-19580, 25131-26359, 35182-35441, 36245-36576, 36827-37606 </li></ul><ul><li>Actual CpG Islands: 18928-19547, 25201-26371, 36997-37693 </li></ul>
    31. 31. Comparison of AL031723: Our Method <ul><li>Possible CpG Islands: 19227-19435, 25197-26147, 36982-37420 </li></ul><ul><li>Actual CpG Islands: 18928-19547, 25201-26371, 36997-37693 </li></ul>
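The comparison on slides 30 and 31 can be summarised by counting how many predicted regions overlap a known island. The sketch below uses the coordinates listed on those two slides; the helper names are made up, and "overlap by at least one base" is an assumed criterion, not one stated in the talk.

```python
# Coordinates copied from slides 30 and 31 (inclusive base positions).
actual = [(18928, 19547), (25201, 26371), (36997, 37693)]
traditional = [
    (3878, 4534), (5849, 6136), (6541, 6820), (8479, 8698), (10745, 11049),
    (18435, 19580), (25131, 26359), (35182, 35441), (36245, 36576), (36827, 37606),
]
ours = [(19227, 19435), (25197, 26147), (36982, 37420)]


def overlaps(a, b):
    """True if two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def count_hits(predicted, known):
    """Number of predicted regions that overlap any known island."""
    return sum(any(overlaps(p, k) for k in known) for p in predicted)


for name, pred in (("traditional", traditional), ("ours", ours)):
    print(f"{name}: {count_hits(pred, actual)} of {len(pred)} predictions overlap a known island")
```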
    32. 32. Cons <ul><li>Traditional Method </li></ul><ul><ul><li>Criteria not stringent enough </li></ul></ul><ul><ul><li>If the expected part of the Obs/Exp ratio is unusually high, then even a high CpG count may not bring the ratio above the cutoff </li></ul></ul><ul><li>Our Method </li></ul><ul><ul><li>Criteria sometimes too stringent </li></ul></ul>
    33. 33. Future Plans <ul><li>CpG Islands </li></ul><ul><li>Linkage Disequilibrium and SNPs </li></ul><ul><ul><li>Statistical analysis of the linkage disequilibrium coefficient </li></ul></ul><ul><ul><li>Kullback-Leibler Divergence II </li></ul></ul>
    34. 34. Thank you! <ul><li>Former researchers </li></ul><ul><ul><li>Andrew Dittmore </li></ul></ul><ul><ul><li>Yasuhiro Goda </li></ul></ul><ul><ul><li>Nick Heppenstall </li></ul></ul><ul><ul><li>Michal Dvir </li></ul></ul><ul><li>Deborah Lycan </li></ul><ul><li>John S. Rogers Program </li></ul>
