Tools and challenges for ChIP-seq data analysis

Alba Jené Sanz
Biomedical Genomics Lab (UPF)
20091110 Technical Seminar  ChIP-seq Data Analysis
Transcript of "20091110 Technical Seminar ChIP-seq Data Analysis"

  1. 1. Tools and challenges for ChIP-seq data analysis Alba Jené Sanz Biomedical Genomics Lab (UPF)
  2. 2. Overview
        1. ChIP-seq – The basics
        2. Typical pipeline
        3. Challenges in ChIP-seq data analysis
        4. To take into account
        5. Available tools
        6. Analysis example
        7. Future challenges
        8. Where to look for help
  3. 3. 1. ChIP-seq – The Basics
  4. 4. 1. ChIP-seq – The Basics ChIP-on-chip ChIP-seq
  5. 5. 1. ChIP-seq – The Basics ChIP-on-chip Bioinformatics ChIP-seq
  6. 6. 1. ChIP-seq – The Basics [figure; labels: 35 bp, 500 bp, 35 bp]
  7. 7. 1. ChIP-seq – The Basics
  8. 8. 2. Typical pipeline
  9. 9. 2. Typical pipeline
  10. 10. 2. Typical pipeline Bowtie
  11. 11. 2. Typical pipeline Bowtie MACS
  12. 12. 2. Typical pipeline Bowtie (read mapping) → MACS (peak calling) → CEAS (peak annotation)
  13. 13. 2. Typical pipeline
        Mapping…
        - Unique / multiple locations
        - Allowing mismatches – seed sequence
        - Balance accuracy / performance
        Peak calling…
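The mapping choices on slide 13 (unique vs. multiple locations, mismatches allowed in a seed) can be illustrated with a toy filter. This is a minimal sketch and not Bowtie's algorithm: `count_mismatches`, `accept_alignment` and `unique_hits` are hypothetical helpers, the seed is assumed to be the first `seed_len` bases of the read, and the default values are only illustrative.

```python
def count_mismatches(read, ref):
    """Count positions where the read and the reference differ.

    Both strings are assumed to be aligned and of equal length.
    """
    return sum(1 for a, b in zip(read, ref) if a != b)


def accept_alignment(read, ref, seed_len=28, max_seed_mismatches=2):
    """Accept a candidate alignment if the seed has few enough mismatches.

    Assumption: the seed is the first `seed_len` bases of the read.
    """
    return count_mismatches(read[:seed_len], ref[:seed_len]) <= max_seed_mismatches


def unique_hits(candidates):
    """Keep only reads with exactly one accepted genomic location.

    `candidates` maps a read name to a list of accepted positions.
    """
    return {name: hits[0] for name, hits in candidates.items() if len(hits) == 1}
```

A longer seed or fewer allowed mismatches makes mapping stricter and usually faster, at the cost of discarding more reads; dropping multi-mapping reads trades coverage of repetitive regions for confidence in each position.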
  14. 14. 2. Typical pipeline
  15. 15. 3. Challenges in ChIP-seq data analysis
        - Millions of segments that need fast mapping to the genome (allowing mismatches or gaps; performance issues)
        - Peak detection – find the exact binding site
        - Data normalization – compare results, background noise
        - Visualization – thousands of enriched regions. UCSC, JBrowse…
  16. 16. 4. To take into account
        - Transcription factors vs nucleosomes / histone modifications
        - Is a control available?
        - Sequencing depth bias in control vs IP
        - Different alignment methods produce different peak-calling results, but that difference is smaller than the one caused by changing the peak caller or the replicate
        - Many differences between peak callers can be explained by the different thresholds they use
        - Some peak callers may be specific to some data types
        - If replicates are available, consistency between them can be used to set the threshold
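One simple way to handle the sequencing-depth bias between control and IP mentioned on slide 16 is to rescale control counts by the ratio of library sizes. The sketch below assumes that linear scaling is adequate; the function and argument names are hypothetical and not taken from any particular tool.

```python
def depth_scale_factor(ip_total_reads, control_total_reads):
    """Factor that brings control counts to the IP sequencing depth."""
    if control_total_reads <= 0:
        raise ValueError("control library has no reads")
    return ip_total_reads / control_total_reads


def scaled_control_count(control_window_count, ip_total_reads, control_total_reads):
    """Control reads in a window, expressed at the IP library's depth."""
    factor = depth_scale_factor(ip_total_reads, control_total_reads)
    return control_window_count * factor
```

For example, with 10 million IP reads and 20 million control reads, a window holding 8 control reads counts as 4 after scaling.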
  17. 17. 4. To take into account There are many tools for the analysis of ChIP-seq data, but no standards yet
  18. 18. 5. Available tools
  19. 19. 5. Available tools
  20. 20. 5. Available tools
  21. 21. 5. Available tools
  22. 22. 5. Available tools
        - Uses regional averaging to mitigate sample fluctuations in the control library
        - Uses the control to model the background (BG) distribution across the genome with a Poisson distribution. After identifying candidate peaks significantly enriched over the BG, a local lambda is estimated using windows around each peak to eliminate local biases
        - Open source, open to contributions (Artistic License) and actively being improved
        - Easy to use, with fast-responding developers
        - Compares very well to other methods
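The Poisson background model with a local lambda described on slide 22 can be sketched in a few lines. This is an illustration under stated assumptions, not the tool's implementation: `poisson_sf` and `peak_p_value` are hypothetical names, the window-based local rates are supplied by the caller, and taking the largest of the genome-wide and local rates is an assumption of this sketch.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), by direct summation.

    Adequate for the small counts and rates used here; very large
    rates would need a numerically more careful method.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def peak_p_value(ip_count, genome_lambda, local_lambdas):
    """P-value of a candidate peak against the most conservative rate.

    `local_lambdas` holds expected counts estimated from control reads
    in windows around the peak (window sizes are the caller's choice).
    """
    lam = max([genome_lambda, *local_lambdas])
    return poisson_sf(ip_count, lam)
```

Using the larger rate means a region with locally elevated background needs more IP reads to be called a peak, which is how local biases are damped.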
  23. 23. 6. Analysis example
        QSEQ files (Solexa's FASTQ with ASCII Phred64, the 3rd FASTQ type)
        lane5_SNAIL_F9_qseq.txt
          SOLEXA 90320 5 1 0 476 0 1 .ACGGGGGAGGG.C...CAAC..A...C............ BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
          SOLEXA 90320 5 1 0 1222 0 1 .AATTGAAAAAT.A..TTTAA..G...A............ DO[[XVX[BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
          SOLEXA 90320 5 1 0 133 0 1 .CCAGTCTATTAATT.TTGCC..GA..C............ DPXXXYYYYYYBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
          SOLEXA 90320 5 1 0 145 0 1 .ATTGTTTCTGACTA.TTGAT..GC..T............ BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
          SOLEXA 90320 5 1 0 153 0 1 .ACCGCTATCAGTAC.TAGCT..GT..A............ DMUYUVWBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
          SOLEXA 90320 5 1 0 215 0 1 .TGTTGCCATTGCTA.AGGCA..GT..T............ DOVWBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
          SOLEXA 90320 5 1 1 827 0 1 AGGAGATCGGCCGGTTGATGAGCCGAGTG........... Z__U_]PXYXTGRZ]QXBBBBBBBBBBBBBBBBBBBBBB 0
          SOLEXA 90320 5 1 1 56 0 1 AAAAATCGACGCTCAAGTCAGAGGTGGCG........... ababba`_aabaab`UbaBBBBBBBBBBBBBBBBBBB 1
          SOLEXA 90320 5 1 1 925 0 1 TGCAGCACTGGGGCCAGATGGTAAGCCCT........... _Z_T]`]M]OLP^^[`WBBBBBBBBBBBBBBBBBBBB 1
          SOLEXA 90320 5 1 2 1637 0 1 GGGCTTCTGCCCCGGTGGGTACATGAGTA........... aaa`a`a`aa`aX_`^^^[``BBBBBBBBBBBBBBBBBB 1
        FASTQ records (first style)
          @3_1_3_89
          CACAGTGTCCTCCAGGTTCATCCC................
          +3_1_3_89
          abbab^aaaaaaaVUVbBBBBBBBBBBBBBBBBBBBBBB
          @3_1_3_762
          GCAAACAAATGGCGGAAAGCGGCG................
          +3_1_3_762
          aabba`b`a`]]TXOY`BBBBBBBBBBBBBBBBBBBBBBB
          @3_1_90_512
          GCTGAGGCAGGAGAATTGCTTGAACTGGGAAGGCAGAGGT
          +3_1_90_512
          ab`a_a`X``WTGW]T]SZ]T[aXa_T^]XP]H_VXY
          @3_1_90_1028
          GTTACGGCTTATCCTGCACATTACGACCGTTTGCGTAACG
          +3_1_90_1028
          `bba`X_^ab_aWS`_b[`aa_]TZ^VYa`VW^`^b`a
          @3_1_90_1651
          TAATTTTAGATTTTATCCTTGACATTGTAAATATTACATT
          +3_1_90_1651
          aUVF[aa`VU_`aaU[__aaaYV^aP`aQUa`_^_a
          @3_1_90_1670
        FASTQ records (second style)
          @FC30C11AAXX:8:1:1649:1790
          GAAAAGTATTTGCAATTTGTTGCCTCTCATCCAAGAATGAAATTCCTATTG
          +
          <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<:::::6776666666566663
          @FC30C11AAXX:8:1:1655:1811
          GAAGCAGAAGCCTATATACCCTGTAGAACTGGGAGCCAATTAAACCTCTTT
          +
          <<<<<<<<<<<<<<<<<<5<<<<><<:>:<<;<:;3665537.6+6.33.+
          @FC30C11AAXX:8:1:1609:1848
          GATGTGGTTTCACATAAATTGACATATATAGTTCCAGGCTGTAAATGTTGT
          +
          <<;<<;;+<<7:<<<<<<7::7:<<<<<<:7,777402-4.-+*20+0-%-
          @FC30C11AAXX:8:1:1667:1880
          GTTTTATACAAATCAAAACCATAGTGAGATACCATCTCACACTAGTCAGAA
          +
          <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<:::::6767366767357566
          @FC30C11AAXX:8:1:1577:1853
          GTGATGGGAGGAAAGCTAGGGGGCTATAATGTCTATTACAAGGCTCAGTAG
          +
          <6<<<<<<<<<<<<<<<<<<<<<<<<<<<<:99:9646066,6+0604044
  24. 24. 6. Analysis example (same data as slide 23) QSEQ → Filter qualities and parse → FASTQ
  25. 25. 6. Analysis example (same data as slide 23) QSEQ → Filter qualities and parse → FASTQ → BOWTIE
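The "Filter qualities and parse" step between the QSEQ file and the FASTQ input can be sketched as follows. The slides do not show the actual script, so this is only an illustration: `qseq_to_fastq` and `phred64_to_phred33` are hypothetical helpers, dropping reads whose last column is not `1` and trimming trailing `.` bases are assumptions, and the read name is built as lane_tile_x_y to match the identifiers seen in the example files.

```python
def qseq_to_fastq(line, require_filter_pass=True):
    """Convert one tab-separated QSEQ line into a 4-line FASTQ record.

    Returns None when the read fails the instrument's filter flag
    (last column) and `require_filter_pass` is True.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError("expected 11 tab-separated QSEQ fields")
    machine, run, lane, tile, x, y, index, read_no, seq, qual, passed = fields
    if require_filter_pass and passed != "1":
        return None
    name = f"{lane}_{tile}_{x}_{y}"
    trimmed = seq.rstrip(".")            # assumption: drop trailing no-calls
    qual = qual[:len(trimmed)]           # keep qualities in step with bases
    seq = trimmed.replace(".", "N")      # remaining no-calls become N
    return f"@{name}\n{seq}\n+{name}\n{qual}\n"


def phred64_to_phred33(qual):
    """Re-encode a Phred+64 quality string as Phred+33 (shift by 31)."""
    return "".join(chr(ord(c) - 31) for c in qual)
```

Whether to convert the quality encoding up front or to tell the aligner which encoding the input uses is a design choice; the conversion helper is shown only to make the offset between the two encodings concrete.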
  26. 26. 6. Analysis example
        SNAIL_F9.bwt
          5_1_0_1409 + gi|51511750|ref|NC_000021.7|NC_000021 34604194 AGTTGCACCTTTAACAATTTCCCAT %/6::9::;;;;7279######### 0 17:G>T,24:G>T
          5_1_0_811 + gi|89161218|ref|NC_000023.9|NC_000023 77246408 TTCTGCAAGCCTCCGGAGCGCACGTG BBB@5<?=9<9@>96/:0######## 0 25:C>G
          5_1_1_1665 + gi|89161199|ref|NC_000002.10|NC_000002 201785208 GCCCAGCTGTCACTGTGGTTTTGATTTGC BBCCCBBBCBBB@BABBBACCA####### 0
          5_1_2_1637 + gi|51511731|ref|NC_000015.8|NC_000015 92942360 GGGCTTCTGCCCCGGTGGGTACATGAGTA BBBABABABBAB9@A??=?<AA####### 0
          5_1_2_1359 + gi|89161205|ref|NC_000003.10|NC_000003 101351498 CAATTCCCTCCTTGAAAGGCTCCTCCACC BCCBBBBAAAABA9@B?59@ABA###### 0
          5_1_2_730 - gi|51511721|ref|NC_000005.8|NC_000005 1314600 GGACTTCCATGCAAACAAGCTGCTTTCCA ########BB>9@B@;@B<;??ABCBABB 0
          5_1_2_1118 - gi|89161213|ref|NC_000007.12|NC_000007 157199758 CATCTTTGATGAGTTACTACCTGTGGGGT ########@B@?=B@;8@659@@BAABAB 0
          5_1_3_920 + gi|51511727|ref|NC_000011.8|NC_000011 133317176 GGTAGACTCACAAAACTACCAAAGTCCTCTAC ABABAABCBBBBCBCBCBBBCCA>@@###### 0
          5_1_3_971 + gi|89161190|ref|NC_000012.10|NC_000012 7497006 TTTTCATGCAGCCCGAGACATCAAGCTAGCAG B@86646330/250################## 0 31:T>G
  27. 27. 6. Analysis example (same SNAIL_F9.bwt rows as slide 26) SNAIL_F9.bwt → Parsing → SNAIL_F9.bwt.bed
        SNAIL_F9.bwt.bed
          chr21 34604194 34604219 5_1_0_1409 . +
          chr23 77246408 77246434 5_1_0_811 . +
          chr02 201785208 201785237 5_1_1_1665 . +
          chr15 92942360 92942389 5_1_2_1637 . +
          chr03 101351498 101351527 5_1_2_1359 . +
          chr05 1314600 1314629 5_1_2_730 . -
          chr07 157199758 157199787 5_1_2_1118 . -
          chr11 133317176 133317208 5_1_3_920 . +
          chr12 7497006 7497038 5_1_3_971 . +
          chr01 201404048 201404081 5_1_3_1986 . +
  28. 28. 6. Analysis example (same data as slide 27) SNAIL_F9.bwt → Parsing → SNAIL_F9.bwt.bed → MACS
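The "Parsing" step from the Bowtie output to BED can be reproduced from the rows shown on the slides: the end coordinate is the start offset plus the read length, and the chromosome name is derived from the RefSeq accession number. The sketch below is an assumption about how such a parser could look, not the author's script; `bowtie_to_bed` is a hypothetical name.

```python
def bowtie_to_bed(line):
    """Turn one tab-separated Bowtie alignment line into a BED6 line.

    Columns used: read name, strand, reference name, offset, sequence.
    """
    fields = line.rstrip("\n").split("\t")
    name, strand, ref, offset, seq = fields[:5]
    accession = ref.split("|")[-1]                 # e.g. "NC_000021"
    number = int(accession.split("_")[1])          # e.g. 21
    chrom = f"chr{number:02d}"                     # zero-padded, as on the slides
    start = int(offset)
    end = start + len(seq)
    return "\t".join([chrom, str(start), str(end), name, ".", strand])
```

Note that this naming follows the slides literally, so accession NC_000023 becomes `chr23`, although that accession is the human X chromosome; a lookup table from accession to chromosome name would be a safer choice in practice.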
  29. 29. 6. Analysis example
        MACS pipeline
        Output:
        - Peak locations in BED and XLS format (genome browser)
        - Tag count in wiggle format (genome browser)
        - Bimodal model in R scripts
  30. 30. 6. Analysis example H3K27me3 PolII
  31. 31. 6. Analysis example
        snail_mfold_15_tsize41_newbwt_peaks.bed
          track name="MACS peaks for snail_mfold_15_tsize41_newbwt"
          chr1 559644 559924 MACS_peak_1 79.29
          chr1 2435221 2435542 MACS_peak_2 51.58
          chr1 14624217 14624571 MACS_peak_3 66.12
          chr1 15610639 15611000 MACS_peak_4 56.69
          chr1 16822564 16822753 MACS_peak_5 52.84
          chr1 18411948 18412187 MACS_peak_6 82.46
          chr1 22857612 22857985 MACS_peak_7 88.74
          chr1 27541904 27542134 MACS_peak_8 69.47
        snail_mfold_15_MACS.wig
          track type=wiggle_0 name="MACS_counts_after_shifting" description="Shifted Merged MACS tag counts for every 10 bp"
          variableStep chrom=chr10 span=10
          85171 1
          85181 1
          85191 1
          85201 1
          85211 1
          85221 1
          85231 2
          85371 2
  32. 32. 6. Analysis example (same files as slide 31) MACS output → CEAS
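Before handing the peak file to a downstream tool it is often useful to load it and look at peak widths and scores. A minimal sketch, assuming a whitespace-separated five-column peak file with an optional `track` header line like the one on the slides; `read_peaks` and `strongest` are hypothetical helpers, and the fifth column is simply treated as a score without assuming its exact definition.

```python
def read_peaks(lines):
    """Parse peak lines into (chrom, start, end, name, score) tuples."""
    peaks = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and browser/track header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, name, score = line.split()[:5]
        peaks.append((chrom, int(start), int(end), name, float(score)))
    return peaks


def strongest(peaks, n=3):
    """Return the n peaks with the highest score, best first."""
    return sorted(peaks, key=lambda p: p[4], reverse=True)[:n]
```

With the first three peaks from the example file, the widths (`end - start`) come out as 280, 321 and 354 bp.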
  33. 33. 6. Analysis example
        Input:
        - BED format peak locations
        - Optional signal profile in wiggle format
        - BED format extra regions of interest
  34. 34. CEAS output
  35. 35. CEAS output
  36. 36. CEAS output
  37. 37. CEAS output
  38. 38. CEAS output
  39. 39. CEAS output
  40. 40. 7. Future challenges
        - Re-analyze data with new algorithms – sequences remain the same
        - ChIP-seq combined with Chromatin Conformation Capture (3C) – long-range physical interactions
        - Technical improvements: RNA-seq will benefit from longer reads
        - Integrated computational analyses – integration of TF, histone marks, methylation, polymerase loading to predict regulatory output
  41. 41. 8. Where to look for help... Seqanswers.com
  42. 42. 8. Where to look for help... Seqanswers.com Google groups, mailing lists of each project MACS CEAS FindPeaks
  43. 43. 8. Where to look for help... Seqanswers.com Google groups, mailing lists of each project MACS CEAS FindPeaks Lab mates!