Summarize, annotate and visualize genomic intervals with R/BioC package genomation

Poster/cheatsheet for R/BioC package genomation [Download for a better resolution]


genomation is a package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons and introns. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites or methylation scores. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.

Published in: Science

Genomic intervals
Genomic intervals are the basis of genome annotation. Intervals can contain only location information (e.g. TFBS locations) or can carry a variety of scores with different scales, such as percent methylation, ChIP-seq enrichment or read coverage.
[Figure: genomic intervals with different kinds of information]

Windows/regions of interest over the genome
Summarizing genomic intervals over different sub-genomic contexts or windows of interest is the necessary first step in making inferences about the biological importance of the data. The windows can be of equal length (e.g. pre-defined regions around TSS) or not (e.g. exons, transcripts or CpG islands).

Extract genomic intervals for windows of interest
Genomic intervals can be extracted for windows of interest and stored in matrix format. If the windows are not of equal length, binning strategies can be used to create an equal number of bins over those windows, so that a matrix can still be used to store the information.
[Figure: extract subset of genomic intervals for windows of interest]

Visualize the summary of genomic intervals
The matrices described above can be visualized using heatmaps or meta-region plots. Meta-region plots show the average value of the signal from genomic intervals over the windows of interest; this can be a line plot or a heatmap where colors indicate average values. Another type of heatmap can be used to show the values of matrices that contain the signal value for genomic intervals over the windows. In these heatmaps, every row represents a window and every column is a base position or a bin.
[Figure: meta-region line plots, meta-region heatmap, heatmaps for genomic intervals]

Annotation for genomic intervals
Genomic intervals such as ChIP-seq peaks or differentially methylated regions need to be annotated further with known genomic annotations, such as the promoter, intron and exon structures of genes. This gives additional information for functional characterization of the genomic intervals. The result is usually represented as the percentage of genomic features overlapping with the annotation.
[Figure: annotation pie charts, annotation heatmaps]

Read genomic intervals into R
Genomic intervals of any kind can be read into R using the functions below, all of which return GRanges or GRangesList objects:
readGeneric(file, …) can read generic text files with genomic interval information.
readBed(file, ...) can read BED files.
readTranscriptFeatures(file, …) can read BED12 files with exon/intron structure.
gffToGRanges can read a GFF file.
readBroadPeak, readNarrowPeak and readFeatureFlank are other convenience functions to read BED-like files.

Extract genomic intervals for windows of interest in R
Genomic intervals can be extracted for windows of interest using the ScoreMatrix(target, windows) and ScoreMatrixBin(target, windows) functions. These functions can handle BAM files, BigWig files and GRanges objects as inputs. patternMatrix() returns the relative locations or scores of k-mers or motifs, which is useful for analyzing ChIP-seq data. All of these return ScoreMatrix or ScoreMatrixList objects. These objects can be manipulated further using the orderBy, binMatrix, scaleScoreMatrix, scaleScoreMatrixList, intersectScoreMatrixList and [] functions.

Visualize the summary of genomic intervals in R
heatMatrix(scoreMatrixObj) and multiHeatMatrix(scoreMatrixListObj) make heatmaps for windows of interest. plotMeta() makes meta-region plots and heatMeta() makes meta-region heatmaps. These functions return values invisibly; see the respective help pages.

Visualize summaries and annotation for genomic intervals
[Figure panels: pie chart for annotation (Intergenic, Intron, Exon, Promoter); meta-region plots and meta-region heatmaps for TF1 to TF4 (x-axis: base-pairs around anchor, y-axis: reads per million); heatmaps for genomic interval sets]

Annotation for genomic intervals in R
Annotation summaries for target genomic intervals can be obtained with annotateWithFeatures(). The resulting object can be visualized with plotTargetAnnotation() for a pie chart and heatTargetAnnotation() for a heatmap of annotation overlap percentages.

Misc. functions
getRandomEnrichment() calculates the significance of association between two sets of genomic intervals.

Install genomation:
source("https://bioconductor.org/biocLite.R"); biocLite("genomation")
OR
devtools::install_github("BIMSBbioinfo/genomation", build_vignettes = FALSE)

URLs:
http://bioinformatics.mdc-berlin.de/genomation
http://www.bioconductor.org/release/genomation

Contributors: Altuna Akalin [aut, cre], Vedran Franke [aut, cre], Katarzyna Wreczycka [aut], Alexander Gosdschan [ctb], Liz Ing-Simmons [ctb]

Citation: Akalin A, Franke V, Vlahovicek K, Mason CE, Schubeler D. (2015). Bioinformatics. doi: 10.1093/bioinformatics/btu775

Valid for package version >= 1.5.6
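
As a rough illustration of the binning and meta-region ideas described under "Extract genomic intervals for windows of interest" and "Visualize the summary of genomic intervals", the base R sketch below bins windows of unequal length into the same number of bins, stores the result as a matrix with one row per window, and averages the columns to obtain a meta-region profile. It is not genomation's implementation: the toy signal, the window coordinates and the helper bin_window() are hypothetical and exist only for this example.

```r
# Toy per-base signal along one chromosome (hypothetical data, 40 positions).
signal <- c(rep(0, 10), rep(2, 10), rep(4, 10), rep(1, 10))

# Windows of unequal length (1-based, inclusive coordinates; hypothetical).
windows <- data.frame(start = c(1, 11, 21), end = c(20, 40, 30))

# Hypothetical helper: summarize one window into `bin_num` bins by taking
# the mean signal of the positions that fall into each bin.
bin_window <- function(signal, start, end, bin_num) {
  values <- signal[start:end]
  # Assumption of this sketch: a window has at least as many bases as bins.
  stopifnot(length(values) >= bin_num)
  # Map position i of the window to a bin index between 1 and bin_num.
  bin_id <- ceiling(seq_along(values) * bin_num / length(values))
  as.numeric(tapply(values, bin_id, mean))
}

# One row per window, one column per bin, even though window lengths differ.
score_matrix <- t(mapply(
  function(s, e) bin_window(signal, s, e, bin_num = 5),
  windows$start, windows$end
))

# Meta-region profile: average of each bin across all windows.
meta_profile <- colMeans(score_matrix)

print(score_matrix)
print(meta_profile)
```

On real data, the functions listed in this cheatsheet (ScoreMatrixBin() for the binned matrix, plotMeta() and heatMatrix() for the plots) cover these steps; their help pages describe the actual arguments.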
