genomation
a toolkit to summarize, annotate and visualize genomic intervals

Altuna Akalın (presenter)
February 24, 2014
Package developed by Altuna Akalın and Vedran Franke

Outline: Usage and ubiquity of genomic interval summaries; Using genomation; More information

Quick introduction

genomation is an R package that expedites genomic
interval summary and annotation. It has the following features:
1. Annotation of genomic intervals, e.g. see what % of your
   intervals overlap with exons/introns/promoters.
2. Summary of genomic scores or read coverage over pre-defined
   regions, e.g. extract the conservation profile over ChIP-seq
   binding sites (equi-width regions) or CpG islands
   (non-equi-width regions).
3. Visualization of genomic interval summaries as meta-region plots
   or heatmaps.
4. Work with multiple file formats, e.g. BAM, BED, bigWig, GFF and
   generic tabular text files containing chromosome location
   information.
5. Do all of this in R :)

Genomic interval summaries are widely used

Summaries of genomic intervals are a useful way to
communicate high-dimensional data.
Traditionally, regions of interest are picked and the distribution of
genomic intervals is summarized over those regions.

Genomic interval summaries are widely used:
Examples from literature

Figure: Erkek, S., et al. (2013). Molecular determinants of nucleosome
retention at CpG-rich sequences in mouse spermatozoa. Nature Structural
& Molecular Biology.

Genomic interval summaries are widely used:
Examples from literature

Figure: Stadler, M., Murr, R., Burger, L., et al. (2011). DNA-binding
factors shape the mouse methylome at distal regulatory regions. Nature.

Utility and futility of average profiles

Does this mean all of the windows (viewpoints) have a similar
enrichment profile?

Figure: average profile around anchor (x-axis: bases, −100 to 100; y-axis: average score)

Utility and futility of average profiles

Only 1/3 of windows have such enrichment. Be careful when you are
interpreting the average profiles.
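
The warning above can be reproduced with a small simulation. This is only a sketch in base R with made-up data: the matrix `scores`, the bump shape and the threshold of 1 are assumptions chosen for illustration, not values from the slides.

```r
# Illustrative simulation (made-up data, base R only):
# 300 windows, each covering positions -100..100 around an anchor
set.seed(1)
n.windows <- 300
positions <- -100:100

# background signal for every window
scores <- matrix(rnorm(n.windows * length(positions), mean = 3.5, sd = 0.5),
                 nrow = n.windows)

# add a bump around the anchor to the first third of the windows only
enriched <- seq_len(n.windows / 3)
bump <- 4.5 * exp(-(positions / 30)^2)
scores[enriched, ] <- sweep(scores[enriched, ], 2, bump, "+")

# the average profile over all windows shows a peak at the anchor
avg.profile <- colMeans(scores)

# a per-window summary shows that most windows are flat
centre <- which(abs(positions) <= 10)
flank  <- which(abs(positions) >= 80)
window.gain <- rowMeans(scores[, centre]) - rowMeans(scores[, flank])
frac.enriched <- mean(window.gain > 1)   # fraction of windows with a peak
```

Plotting `avg.profile` against `positions` gives a clear peak even though `frac.enriched` is only about one third, which is why the individual windows should be inspected as well.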

Figure: scores for the individual windows around the anchor (x-axis: −100 to 100 bases)

Genomic interval summaries are widely used:
Examples from literature

Figure: Lister, R., et al. (2009). Human DNA methylomes at base
resolution show widespread epigenomic differences. Nature.

Genomic interval summaries are widely used:
Examples from literature

Figure: Feng, S., et al. (2010). Conservation and divergence of
methylation patterning in plants and animals. PNAS.

Issues to keep in mind when developing summary
methods

- Genomic data comes in many formats; we need a method that is
  able to work with multiple flat file formats.
- We need a method that is not specialized for one type of data
  set such as read counts; it should also work easily with other
  scoring schemes (e.g. conservation scores).
- Regions of interest are not always equi-width; you should be able
  to normalize for length differences by binning.
- Multiple visualization options and fast heatmap generation
  should be available.
- Clustering of regions based on multiple summaries (e.g.
  binding of different TFs on the same set of regions) on the
  heatmap.
- Ease of use: it should not take hours of coding to generate and
  visualize summaries.

Overview of genomation features

Figure: overview of the genomation workflow.
Inputs: genomic intervals (BAM, BigWig, BED, GFF, tabular text, GRanges)
and annotation (BED, GFF, tabular text, GRanges).
Summarize: a ScoreMatrix/ScoreMatrixList object (regions 1 to m by
base-pairs/bins 1 to n).
Annotate: pie charts for annotation (Promoter, Exon, Intron, Intergenic).
Visualize: meta-region plots, meta-region heatmaps and heatmaps for
genomic interval sets (example panels: TF1 to TF4, base-pairs around
anchor, read per million).

Installation of the package and the example data

We can install the package and the data using the install_github()
function from the devtools package.
#install dependencies
install.packages( c("data.table","plyr","reshape2","ggplot2",
"gridBase","devtools"))
source("http://bioconductor.org/biocLite.R")
biocLite(c("GenomicRanges","rtracklayer","impute","Rsamtools"))
# install the packages
library(devtools)
install_github("genomation", username = "al2na")
# install the data package
# needed for examples
install_github("genomationData", username = "al2na")
Data import

Various file formats can be used in genomation. You can read in
annotation or your genomic intervals of interest.

library(genomation)
tab.file1 <- system.file("extdata/tab1.bed", package = "genomation")
readGeneric(tab.file1)

## GRanges with 6 ranges and 0 metadata columns:
##       seqnames             ranges strand
##          <Rle>          <IRanges>  <Rle>
##   [1]    chr21 [9437272, 9439473]      *
##   [2]    chr21 [9483485, 9484663]      *
##   [3]    chr21 [9647866, 9648116]      *
##   [4]    chr21 [9708935, 9709231]      *
##   [5]    chr21 [9825442, 9826296]      *
##   [6]    chr21 [9909011, 9909218]      *
##   ---
##   seqlengths:
##    chr21
##       NA

Extraction of data over pre-defined genomic regions

ScoreMatrix() and ScoreMatrixBin() are functions used to extract
data over predefined windows.
ScoreMatrix is used when all of the windows have the same
width (e.g. region around TSS)

ScoreMatrixBin is designed for use with windows of unequal
width (e.g. enrichment of methylation over exons).
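
The idea behind binning windows of unequal width can be sketched in base R. The helper `bin_scores()` below is hypothetical and written only for illustration; it is not the code that ScoreMatrixBin() uses internally, and it assumes each region has at least as many bases as bins.

```r
# Hypothetical helper: summarize a per-base score vector into a fixed
# number of consecutive bins so regions of different width become comparable
bin_scores <- function(scores, bin.num = 10, summary.fun = mean) {
  # assign each position to one of bin.num near-equal consecutive bins
  bin.id <- cut(seq_along(scores), breaks = bin.num, labels = FALSE)
  as.numeric(tapply(scores, bin.id, summary.fun))
}

short.region <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)   # 10 bases
long.region  <- rep(1:5, each = 20)               # 100 bases

# both regions are reduced to 5 bins, giving a 2 x 5 matrix
binned <- rbind(bin_scores(short.region, bin.num = 5),
                bin_scores(long.region,  bin.num = 5))
```

After binning, each row of `binned` has the same number of columns regardless of the original region width, so the rows can be averaged or drawn in one heatmap.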

data(cage)
data(promoters)
sm <- ScoreMatrix(target = cage, windows = promoters)
sm
## scoreMatrix with dims: 1055 2001

Visualizing ScoreMatrix: summary of genomic
intervals over pre-defined regions

plotMeta(), heatMeta(), heatMatrix() and multiHeatMatrix()
are the visualization functions.
# set the outer margins to zero and keep the previous setting
oldpar <- par(oma = c(0, 0, 0, 0))
heatMatrix(sm, xcoords = c(-1000, 1000))
plotMeta(sm, xcoords = c(-1000, 1000), line.col = "blue")
# restore the previous outer margins
par(oldpar)

Figure: heatMatrix() and plotMeta() output for sm (x-axis: bases, −1000 to 1000; y-axis of the meta-region plot: average score)

Working with BAM files

BAM files can also be used with the ScoreMatrix() and ScoreMatrixBin()
functions.
bam.file = system.file('tests/test.bam', package='genomation')
windows = GRanges(rep(c(1,2),each=2),
IRanges(rep(c(1,2), times=2), width=5))
scores3 = ScoreMatrix(target=bam.file,windows=windows, type='bam')
Working with bigWig files

ScoreMatrix() and ScoreMatrixBin() can also handle
bigWig files. Here we use ENCODE DHS scores, downloaded from
http://goo.gl/fEVu0g

my.bed12.file=system.file("extdata/chr21.refseq.hg19.bed",
package = "genomation")
feats=readTranscriptFeatures(my.bed12.file,up.flank=500,down.flank=500)
sm=ScoreMatrix(target="wgEncodeUwDnaseA549RawRep1.bw",
windows=feats$promoters,type='bigWig',strand.aware=TRUE)
plotMeta(sm,xcoords=c(-500,500),main="DHS around TSS",line.col="blue")

Figure: DHS around TSS (meta-region plot; x-axis: −500 to 500; y-axis: average score)

Multiple profiles

Multiple heatmap profiles can be plotted using multiHeatMatrix(),
which takes a ScoreMatrixList object. Here we used CTCF, P300,
Suz12, Rad21 and Znf143 BAM files from the genomationData package.

ctcf.peaks=readRDS("ctcf.peaks.rds")
dataPath = system.file("extdata", package = "genomationData")
bam.files = list.files(dataPath, full.names = TRUE, pattern = "bam$")[c(1:4,6)]
sml = ScoreMatrixList(bam.files, ctcf.peaks, bin.num = 50, type = "bam")
names(sml)=c("CTCF","P300","Suz12","Rad21","Znf143")
multiHeatMatrix(sml, xcoords = c(-500, 500), cex.axis=0.35, common.scale = TRUE,
col = c("lightgray", "blue"), winsorize=c(0,95))

Figure: heatmaps for CTCF, P300, Suz12, Rad21 and Znf143 (x-axis: −500 to 500; common colour scale)

Multiple profiles

multiHeatMatrix() can also apply k-means clustering. Extreme
values are trimmed using the “winsorize” argument.
multiHeatMatrix(sml, xcoords = c(-500, 500),kmeans=TRUE,k=3,common.scale = T,
cex.axis=0.4,col = c("lightgray", "blue"),winsorize=c(0,95))
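
Trimming extreme values by percentile can be pictured with a few lines of base R. The helper `winsorize_values()` is hypothetical and only illustrates the general technique of clamping values to a lower and an upper percentile; it is not taken from the genomation source.

```r
# Hypothetical helper: clamp values to the given lower/upper percentiles
winsorize_values <- function(x, percentiles = c(0, 95)) {
  limits <- quantile(x, probs = percentiles / 100, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, limits[1]), limits[2])
}

# made-up scores: most values are small, a few are extreme
scores <- c(1:95, 500, 600, 700, 800, 900)
trimmed <- winsorize_values(scores, c(0, 95))
range(scores)    # extremes dominate the colour scale
range(trimmed)   # extremes are pulled down to the 95th percentile
```

With the extremes clamped, the colour scale of a heatmap is spread over the bulk of the values instead of being stretched by a handful of outliers.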

Figure: heatmaps for CTCF, P300, Suz12, Rad21 and Znf143 with rows grouped into k-means clusters 1, 2 and 3 (x-axis: −500 to 500)

Multiple profiles

Multiple average profiles can be visualized with heatMeta(). Here,
we also apply a scaling function to all the matrices.
# take log2 of all matrices
sml2=scaleScoreMatrixList(sml,scalefun=function(x) log2(x+1))
heatMeta(sml2,legend.name="average profiles",xcoords=c(-500, 500),
xlab="bp around peaks")
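
What scaling every matrix and then averaging amounts to can be sketched without genomation. The list `mats` below is a made-up stand-in for a ScoreMatrixList (plain matrices with invented names TF1 to TF3), so this is an illustration of the idea rather than the package's implementation.

```r
set.seed(2)
# made-up stand-ins for score matrices: 3 samples, 20 regions x 11 positions
mats <- lapply(c(TF1 = 5, TF2 = 20, TF3 = 80), function(lambda)
  matrix(rpois(20 * 11, lambda), nrow = 20))

# apply the same scaling function to every matrix
scalefun <- function(x) log2(x + 1)
mats.scaled <- lapply(mats, scalefun)

# one average profile per matrix: rows = samples, columns = positions
avg.profiles <- t(sapply(mats.scaled, colMeans))
```

Each row of `avg.profiles` is the kind of per-sample average profile that a heatmap of meta-profiles displays, one row per sample.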

Figure: heatmap of average profiles for CTCF, P300, Suz12, Rad21 and Znf143 (x-axis: bp around peaks; legend: average profiles)

Multiple profiles

Multiple average profiles can also be visualized with plotMeta()
plotMeta(sml2,profile.names=names(sml2),
xcoords=c(-500, 500),
main="mult. profiles")

Figure: mult. profiles (x-axis: bases; y-axis: average score; lines: CTCF, P300, Suz12, Rad21, Znf143)

Future work...

Explore overlap statistics between two genomic data sets: do
TF1 binding sites overlap with TF2 sites more than
expected?
This was previously explored in the GenometriCorr package. This
functionality can be included in the form of a dependency.
Performance improvements for certain functions; faster is always
better...

Further information

The genomation package is available at
http://al2na.github.io/genomation. You can find the link
to the vignette on the webpage as well.
The code that generated this presentation is available at
http://github.com/al2na/genomation_presentation

Questions and bug reports
You can view or open issues on GitHub:
https://github.com/al2na/genomation/issues?state=open
You can ask questions by sending an e-mail to
genomation@googlegroups.com or by using the web interface to
Google Groups.

Developed by Altuna Akalın and Vedran Franke

Session Info

sessionInfo()
## R version 3.0.2 (2013-09-25)
## Platform: x86_64-apple-darwin10.8.0 (64-bit)
##
## locale:
## [1] C
##
## attached base packages:
## [1] methods   grid      stats     graphics  grDevices utils     datas
## [8] base
##
## other attached packages:
## [1] genomation_0.99.0.2 knitr_1.5
##
## loaded via a namespace (and not attached):
##  [1] BSgenome_1.30.0      BiocGenerics_0.8.0   Biostrings_2.30.0
##  [4] GenomicRanges_1.14.3 IRanges_1.20.5       MASS_7.3-29
##  [7] RColorBrewer_1.0-5   RCurl_1.95-4.1       Rsamtools_1.14.1
## [10] XML_3.95-0.2         XVector_0.2.0        bitops_1.0-6
## [13] colorspace_1.2-4     data.table_1.8.10    dichromat_2.0-0
## [16] digest_0.6.3         evaluate_0.5.1       formatR_0.10
## [19] ggplot2_0.9.3.1      gridBase_0.4-6       gtable_0.1.2

SAP S/4 HANA sourcing and procurement to Public cloud
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 

genomation: summary of genomic intervals

  • 1. genomation: a toolkit to summarize, annotate and visualize genomic intervals
      Altuna Akalın (presenter), February 24, 2014. Package developed by Altuna Akalın and Vedran Franke.
      Sections of the genomation package talk: Usage and ubiquity of genomic interval summaries; Using genomation; More information.
  • 6. Quick introduction
      genomation is an R package that expedites genomic interval summary and annotation. It has the following features:
      1. Annotation of genomic intervals: e.g. see what % of your intervals overlap with exons/introns/promoters.
      2. Summary of genomic scores or read coverage over pre-defined regions, e.g. extract the conservation profile over ChIP-seq binding sites (equi-width regions) or CpG islands (non-equi-width regions).
      3. Visualization of genomic interval summaries as meta-region plots or heatmaps.
      4. Support for multiple file formats, e.g. BAM, BED, bigWig, GFF and generic tabular text files containing chromosome location information.
      5. All of this is done in R :)
  • 7. Genomic interval summaries are widely used
      Summaries of genomic intervals are a useful way to communicate high-dimensional data.
      Traditionally, regions of interest are picked and the distribution of genomic intervals is summarized over those regions.
  • 8. Genomic interval summaries are widely used: Examples from literature
      Figure: Erkek, S., et al. (2013). Molecular determinants of nucleosome retention at CpG-rich sequences in mouse spermatozoa. Nature Structural & Molecular Biology
  • 9. Genomic interval summaries are widely used: Examples from literature
      Figure: Stadler, M., Murr, R., Burger, L., et al. (2011). DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature
  • 10. Utility and futility of average profiles
      [Figure: "average profile around anchor"; x-axis: bases (−100 to 100), y-axis: average score (3.5 to 5.0).]
      Does this mean all of the windows (viewpoints) have a similar enrichment profile?
  • 11. Utility and futility of average profiles
      Only 1/3 of the windows have such enrichment. Be careful when interpreting average profiles.
      [Figure: x-axis from −100 to 100; scale from 0 to 21.]
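
The warning about average profiles can be reproduced with a small simulation. The sketch below uses base R only and made-up numbers (300 windows, a background of about 3.5, a peak added to one third of the windows). It is not the data behind the slides, only an illustration of how an average can suggest enrichment that most windows do not have.

```r
set.seed(1)                      # reproducible noise
n.windows <- 300                 # hypothetical number of windows
pos <- -100:100                  # positions relative to the anchor

# background signal for every window
scores <- matrix(rnorm(n.windows * length(pos), mean = 3.5, sd = 0.5),
                 nrow = n.windows)

# add a peak centred on the anchor to the first third of the windows only
peak <- 4 * dnorm(pos, mean = 0, sd = 20) / dnorm(0, mean = 0, sd = 20)
enriched <- seq_len(n.windows / 3)
scores[enriched, ] <- sweep(scores[enriched, ], 2, peak, "+")

# the average over all windows shows a peak although 2/3 of them have none
avg.all      <- colMeans(scores)
avg.enriched <- colMeans(scores[enriched, ])
avg.rest     <- colMeans(scores[-enriched, ])

center <- which(pos == 0)
round(c(all = avg.all[center], enriched = avg.enriched[center],
        rest = avg.rest[center]), 2)
```

Plotting `avg.all` against `pos` gives a single smooth peak. Only the per-window view (for example `image(t(scores))`) shows that the signal comes from a subset of rows.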
  • 12. Genomic interval summaries are widely used: Examples from literature
      Figure: Lister, R., et al. (2009). Human DNA methylomes at base resolution show widespread epigenomic differences. Nature
  • 13. Genomic interval summaries are widely used: Examples from literature
      Figure: Feng, S. et al. (2010). Conservation and divergence of methylation patterning in plants and animals. PNAS
  • 14. Issues to keep in mind when developing summary methods
      - Genomic data comes in many formats; we need a method that works with multiple flat-file formats.
      - The method should not be specialized to one type of data set, such as read counts; it should work just as easily with other scoring schemes (e.g. conservation scores).
      - Regions of interest are not always equi-width; it should be possible to normalize for length differences by binning.
      - Multiple visualization options and fast heatmap generation should be available.
      - Regions should be clustered on the heatmap based on multiple summaries (e.g. binding of different TFs on the same set of regions).
      - Ease of use: it should not take hours of coding to generate and visualize summaries.
  • 15. Overview of genomation features
      [Figure: workflow diagram. Inputs are genomic intervals (BAM, BigWig, BED, GFF, tabular text, GRanges) and annotation (BED, GFF, tabular text, GRanges). Summarize: intervals are summarized over regions 1..m and base-pairs/bins 1..n into a ScoreMatrix/ScoreMatrixList object. Visualize: meta-region plots, meta-region heatmaps and heatmaps for genomic interval sets (example tracks TF1 to TF4, base-pairs around anchor). Annotate: pie charts for annotation (intergenic, intron, exon, promoter).]
  • 16. Installation of the package and the example data
      We can install the package and the data using the install_github() function from the devtools package.

        # install dependencies
        install.packages(c("data.table", "plyr", "reshape2", "ggplot2",
                           "gridBase", "devtools"))
        source("http://bioconductor.org/biocLite.R")
        biocLite(c("GenomicRanges", "rtracklayer", "impute", "Rsamtools"))

        # install the packages
        library(devtools)
        install_github("genomation", username = "al2na")

        # install the data package
        # needed for examples
        install_github("genomationData", username = "al2na")
  • 17. Data import
      Various file formats can be used in genomation. You can read in annotation or your genomic intervals of interest.

        library(genomation)
        tab.file1 <- system.file("extdata/tab1.bed", package = "genomation")
        readGeneric(tab.file1)

        ## GRanges with 6 ranges and 0 metadata columns:
        ##       seqnames             ranges strand
        ##          <Rle>          <IRanges>  <Rle>
        ##   [1]    chr21 [9437272, 9439473]      *
        ##   [2]    chr21 [9483485, 9484663]      *
        ##   [3]    chr21 [9647866, 9648116]      *
        ##   [4]    chr21 [9708935, 9709231]      *
        ##   [5]    chr21 [9825442, 9826296]      *
        ##   [6]    chr21 [9909011, 9909218]      *
        ##   ---
        ##   seqlengths:
        ##    chr21
        ##       NA
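
To make "generic tabular text files containing chromosome location information" concrete, here is a minimal base R sketch of reading such a file by naming the columns that hold chromosome, start and end. The helper `read_intervals()`, its arguments and the file contents are all hypothetical. It is not how genomation reads files; it only illustrates the idea of locating intervals by column position.

```r
# write a small, made-up tab-separated file with a header line
tab.file <- tempfile(fileext = ".txt")
writeLines(c("chrom\tbegin\tend\tscore",
             "chr21\t100\t200\t5.1",
             "chr21\t300\t450\t2.7"), tab.file)

# hypothetical helper: pick the location columns by position
read_intervals <- function(file, chr = 1, start = 2, end = 3, header = FALSE) {
  tab <- read.table(file, header = header, sep = "\t",
                    stringsAsFactors = FALSE)
  data.frame(chr   = tab[[chr]],
             start = tab[[start]],
             end   = tab[[end]],
             width = tab[[end]] - tab[[start]] + 1L)  # closed intervals
}

ivs <- read_intervals(tab.file, header = TRUE)
ivs
```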
  • 18. Extraction of data over pre-defined genomic regions
      ScoreMatrix() and ScoreMatrixBin() are the functions used to extract data over predefined windows.
      ScoreMatrix() is used when all of the windows have the same width (e.g. regions around TSSs).
      ScoreMatrixBin() is designed for use with windows of unequal width (e.g. enrichment of methylation over exons).

        data(cage)
        data(promoters)
        sm <- ScoreMatrix(target = cage, windows = promoters)
        sm
        ## scoreMatrix with dims: 1055 2001
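
The idea behind binning windows of unequal width can be sketched in base R. The helper `bin_scores()` below is hypothetical and is not the ScoreMatrixBin() implementation. It splits per-base scores into a fixed number of consecutive bins and summarizes each bin, so regions of different lengths end up as rows of the same width.

```r
# hypothetical helper: summarize per-base scores into bin.num bins
bin_scores <- function(scores, bin.num = 10, bin.op = mean) {
  stopifnot(length(scores) >= bin.num)
  # assign each base to one of bin.num consecutive bins
  bins <- ceiling(seq_along(scores) * bin.num / length(scores))
  as.numeric(tapply(scores, bins, bin.op))
}

short.region <- 1:20     # made-up scores for a 20-bp region
long.region  <- 1:200    # made-up scores for a 200-bp region

# both regions become rows with the same number of columns
binned <- rbind(short = bin_scores(short.region),
                long  = bin_scores(long.region))
dim(binned)
```

Once every region has the same number of bins, the rows can be stacked into one matrix and plotted or averaged like equal-width windows.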
  • 19. Visualizing ScoreMatrix: summary of genomic intervals over pre-defined regions
      plotMeta(), heatMeta(), heatMatrix() and multiHeatMatrix() are the visualization functions.

        oldoma <- par()$oma   # save the outer margins so they can be restored
        par(oma = c(0, 0, 0, 0))
        heatMatrix(sm, xcoords = c(-1000, 1000))
        plotMeta(sm, xcoords = c(-1000, 1000), line.col = "blue")
        par(oma = oldoma)

      [Figure: heatmap and meta-region plot of sm; x-axis: bases (−1000 to 1000); y-axis of the meta-region plot: average score.]
  • 20. Working with BAM files
      BAM files can also be used in the ScoreMatrix() and ScoreMatrixBin() functions.

        library(GenomicRanges)  # provides GRanges() and IRanges()
        bam.file = system.file('tests/test.bam', package = 'genomation')
        windows = GRanges(rep(c(1, 2), each = 2),
                          IRanges(rep(c(1, 2), times = 2), width = 5))
        scores3 = ScoreMatrix(target = bam.file, windows = windows, type = 'bam')
  • 21. Working with bigWig files
      ScoreMatrix() and ScoreMatrixBin() can also handle bigWig files. Here we use ENCODE DHS scores, downloaded from http://goo.gl/fEVu0g

        my.bed12.file = system.file("extdata/chr21.refseq.hg19.bed", package = "genomation")
        feats = readTranscriptFeatures(my.bed12.file, up.flank = 500, down.flank = 500)
        sm = ScoreMatrix(target = "wgEncodeUwDnaseA549RawRep1.bw",
                         windows = feats$promoters, type = 'bigWig', strand.aware = TRUE)
        plotMeta(sm, xcoords = c(-500, 500), main = "DHS around TSS", line.col = "blue")

      [Figure: "DHS around TSS"; y-axis: average score.]
  • 22. Multiple profiles
      Multiple heatmap profiles can be plotted using multiHeatMatrix(), which takes a ScoreMatrixList object. Here we use the CTCF, P300, Suz12, Rad21 and Znf143 BAM files from the genomationData package.

        ctcf.peaks = readRDS("ctcf.peaks.rds")
        dataPath = system.file("extdata", package = "genomationData")
        bam.files = list.files(dataPath, full.names = TRUE, pattern = "bam$")[c(1:4, 6)]
        sml = ScoreMatrixList(bam.files, ctcf.peaks, bin.num = 50, type = "bam")
        names(sml) = c("CTCF", "P300", "Suz12", "Rad21", "Znf143")
        multiHeatMatrix(sml, xcoords = c(-500, 500), cex.axis = 0.35, common.scale = TRUE,
                        col = c("lightgray", "blue"), winsorize = c(0, 95))

      [Figure: heatmaps for CTCF, P300, Suz12, Rad21 and Znf143; x-axis: −500 to 500; scale from 0 to 8 on each panel.]
  • 23. Multiple profiles
      multiHeatMatrix() can also apply k-means clustering. Extreme values are trimmed with the "winsorize" argument.

        multiHeatMatrix(sml, xcoords = c(-500, 500), kmeans = TRUE, k = 3, common.scale = TRUE,
                        cex.axis = 0.4, col = c("lightgray", "blue"), winsorize = c(0, 95))

      [Figure: heatmaps for CTCF, P300, Suz12, Rad21 and Znf143 with rows grouped into clusters 1 to 3; x-axis: −500 to 500; scale from 0 to 8 on each panel.]
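
What trimming at percentiles and row clustering do to a score matrix can be shown with base R alone. This is a sketch under stated assumptions: the `winsorize()` helper and the simulated matrix are made up for illustration, and multiHeatMatrix() may differ in detail (for example in how a common scale across matrices is handled).

```r
# hypothetical helper: clamp values outside the given percentiles
winsorize <- function(mat, probs = c(0, 95)) {
  lim <- quantile(mat, probs = probs / 100, na.rm = TRUE)
  mat[mat < lim[1]] <- lim[1]
  mat[mat > lim[2]] <- lim[2]
  mat
}

set.seed(42)
# made-up score matrix: 60 regions x 20 bins, three groups of rows
mat <- rbind(matrix(rpois(20 * 20, lambda = 1),  nrow = 20),
             matrix(rpois(20 * 20, lambda = 5),  nrow = 20),
             matrix(rpois(20 * 20, lambda = 10), nrow = 20))
mat[1, 1] <- 500                 # one extreme outlier

w <- winsorize(mat)              # the outlier is clamped to the 95th percentile

# cluster rows with k-means and order rows by cluster, as a heatmap would
cl <- kmeans(w, centers = 3, nstart = 10)
ordered <- w[order(cl$cluster), ]
table(cl$cluster)
```

Without the clamping step, a single extreme value would stretch the color scale and flatten the rest of the heatmap.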
  • 24. Multiple profiles
      Multiple average profiles can be visualized with heatMeta(). Here, we also apply a scaling function to all the matrices.

        # take log2 of all matrices
        sml2 = scaleScoreMatrixList(sml, scalefun = function(x) log2(x + 1))
        heatMeta(sml2, legend.name = "average profiles", xcoords = c(-500, 500),
                 xlab = "bp around peaks")

      [Figure: heatmap of average profiles, one row each for CTCF, P300, Suz12, Rad21 and Znf143; x-axis: bp around peaks (−400 to 400); legend: average profiles.]
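
The scaling step and the "one average profile per matrix" view can be sketched with a plain named list of matrices standing in for a ScoreMatrixList. The data and the `scale_list()` helper are hypothetical; the sketch shows only the arithmetic, not the genomation objects.

```r
set.seed(7)
# made-up stand-in for a ScoreMatrixList: a named list of score matrices
sml <- list(TF1 = matrix(rpois(50 * 10, lambda = 3),  nrow = 50),
            TF2 = matrix(rpois(50 * 10, lambda = 30), nrow = 50))

# hypothetical helper: apply the same scaling function to every matrix
scale_list <- function(mats, scalefun) lapply(mats, scalefun)
sml2 <- scale_list(sml, function(x) log2(x + 1))

# one average profile per matrix, stacked as rows of a small matrix
profiles <- t(sapply(sml2, colMeans))
dim(profiles)
```

Each row of `profiles` corresponds to one line in a meta-region plot or one row in a heatmap of average profiles. The log2 scaling keeps a high-coverage sample from dominating the shared scale.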
  • 25. Multiple profiles
      Multiple average profiles can also be visualized with plotMeta().

        plotMeta(sml2, profile.names = names(sml2), xcoords = c(-500, 500),
                 main = "mult. profiles")

      [Figure: "mult. profiles"; one line each for CTCF, P300, Suz12, Rad21 and Znf143; x-axis: bases (−400 to 400), y-axis: average score.]
  • 26. Future work...
      - Explore overlap statistics between two genomic data sets: do TF1 binding sites overlap with TF2 sites more than expected? This was previously explored in the GenometriCorr package, so this functionality could be included in the form of a dependency.
      - Performance improvements for certain functions; faster is always better...
  • 27. Further information
      The genomation package is available at http://al2na.github.io/genomation. You can find the link to the vignette on the webpage as well.
      Code that generated this presentation is available at http://github.com/al2na/genomation_presentation
      Questions and bug reports:
      - You can view/open issues on GitHub: https://github.com/al2na/genomation/issues?state=open
      - You can ask questions by sending an e-mail to genomation@googlegroups.com or by using the web interface to Google Groups.
      Developed by Altuna Akalın and Vedran Franke
  • 28. Session Info genomation package Altuna Akalın Usage and ubiquity of genomic interval summaries Using genomation More information sessionInfo() ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## R version 3.0.2 (2013-09-25) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] C attached base packages: [1] methods grid stats [8] base graphics grDevices utils other attached packages: [1] genomation_0.99.0.2 knitr_1.5 loaded via a namespace (and not attached): [1] BSgenome_1.30.0 BiocGenerics_0.8.0 [4] GenomicRanges_1.14.3 IRanges_1.20.5 [7] RColorBrewer_1.0-5 RCurl_1.95-4.1 [10] XML_3.95-0.2 XVector_0.2.0 [13] colorspace_1.2-4 data.table_1.8.10 [16] digest_0.6.3 evaluate_0.5.1 [19] ggplot2_0.9.3.1 gridBase_0.4-6 Biostrings_2.30.0 MASS_7.3-29 Rsamtools_1.14.1 bitops_1.0-6 dichromat_2.0-0 formatR_0.10 gtable_0.1.2 datas