Your SlideShare is downloading. ×
0
A Grammar of Graphics for Genomics                                 The ggbio Package                                  Mich...
Outline1   Motivation2   High-level Plots3   Grammar Components    Michael Lawrence (Genentech)   A Grammar of Graphics fo...
Outline1   Motivation2   High-level Plots3   Grammar Components    Michael Lawrence (Genentech)   A Grammar of Graphics fo...
Data on the Genome  • Comes in two flavors:      • Annotations (genes, TF binding sites, ...)      • Experimental measureme...
Data on the Genome  • Comes in two flavors:      • Annotations (genes, TF binding sites, ...)      • Experimental measureme...
Data on the Genome  • Comes in two flavors:      • Annotations (genes, TF binding sites, ...)      • Experimental measureme...
Data on the Genome  • Comes in two flavors:      • Annotations (genes, TF binding sites, ...)      • Experimental measureme...
Data on the Genome  • Comes in two flavors:      • Annotations (genes, TF binding sites, ...)      • Experimental measureme...
ChallengesBig data, wide spaces   • Need summaries that are efficiently computed, communicate more      with less and expose...
ChallengesBig data, wide spaces                    120.928 Mb   120.93 Mb    120.932 Mb    120.934 Mb     120.936 Mb   120...
ChallengesBig data, wide spaces              60              50              40              30              20           ...
ChallengesBig data, wide spaces              300000              250000              200000              150000           ...
Existing ToolsUCSC                      IGB               IGV                      Circos                 GViz Michael Law...
Existing ToolsUCSC                      IGB               IGV                      Circos                 GViz Michael Law...
Existing ToolsUCSC                      IGB               IGV                      Circos                 GViz Michael Law...
Existing ToolsUCSC                      IGB               IGV                      Circos                 GViz Michael Law...
Existing ToolsUCSC                      IGB               IGV                      Circos                 GViz Michael Law...
Existing ToolsUCSC                      IGB               IGV                      Circos                 GViz Michael Law...
Existing ToolsUCSC                      IGB               IGV                      Circos                 GVizLimitations ...
Grammars of Graphics    • A grammar of graphics is a       language for expressing plots    • Graphics are constructed    ...
Grammars of Graphics    • A grammar of graphics is a       language for expressing plots    • Graphics are constructed    ...
Grammars of Graphics    • A grammar of graphics is a       language for expressing plots    • Graphics are constructed    ...
Grammars of Graphics    • A grammar of graphics is a       language for expressing plots    • Graphics are constructed    ...
The ggbio Package  • An R/Bioconductor package that extends the Wilkinson/Wickham     grammar for applications in genomics...
Outline1   Motivation2   High-level Plots3   Grammar Components    Michael Lawrence (Genentech)   A Grammar of Graphics fo...
Basic PlotsGene Structures                 Read Alignments                    Sequence             Multiple Michael Lawren...
Basic PlotsGene Structures                    Read Alignments                         Sequence                     Multipl...
Basic PlotsGene Structures                     Read Alignments                           Sequence                     Mult...
Basic PlotsGene Structures                 Read Alignments                      Sequence                      Multiple    ...
Basic PlotsGene Structures                    Read Alignments                       Sequence                Multiple      ...
Basic PlotsGene Structures                    Read Alignments                         Sequence                    Multiple...
Overview PlotsGrand Linear                               Karyogram                           Circular Michael Lawrence (Ge...
Overview PlotsGrand Linear                               Karyogram                           Circular Michael Lawrence (Ge...
Overview PlotsGrand Linear                                     Karyogram                                                  ...
Overview PlotsGrand Linear                                                                                                ...
Specialized PlotsMismatch summary + VCF                                               Edge-linked Intervals Michael Lawren...
Specialized PlotsMismatch summary + VCF                                                                      Edge-linked I...
Specialized PlotsMismatch summary + VCF                                                           Edge-linked Intervals   ...
Outline1   Motivation2   High-level Plots3   Grammar Components    Michael Lawrence (Genentech)   A Grammar of Graphics fo...
The Wilkinson/Wickham Grammar of Graphics        Geom       The shape used for drawing the data          Stat     Transfor...
The Wilkinson/Wickham Grammar of Graphics        Geom       The shape used for drawing the data          Stat     Transfor...
The Wilkinson/Wickham Grammar of Graphics        Geom       The shape used for drawing the data          Stat     Transfor...
The Wilkinson/Wickham Grammar of Graphics        Geom       The shape used for drawing the data          Stat     Transfor...
The Wilkinson/Wickham Grammar of Graphics        Geom       The shape used for drawing the data          Stat     Transfor...
The Wilkinson/Wickham Grammar of Graphics        Geom       The shape used for drawing the data          Stat     Transfor...
A Grammar of Graphics for GenomicsExtensions are marked in red                                     geometric object:      ...
Components of the Genomic Grammar  Geom:         alignment chevron arch arrow arrowrect   Stat:        gene reduce steppin...
Components of the Genomic Grammar  Geom:          alignment chevron arch arrow arrowrect   Stat:         gene reduce stepp...
Components of the Genomic Grammar  Geom:          alignment chevron arch arrow arrowrect   Stat:         gene reduce stepp...
Components of the Genomic Grammar  Geom:          alignment chevron arch arrow arrowrect   Stat:         gene reduce stepp...
Components of the Genomic Grammar  Geom:          alignment chevron arch arrow arrowrect   Stat:         gene reduce stepp...
Components of the Genomic Grammar  Geom:          alignment chevron arch arrow arrowrect   Stat:         gene reduce stepp...
Components of the Genomic Grammar  Geom:             alignment chevron arch arrow arrowrect   Stat:            gene reduce...
Components of the Genomic Grammar  Geom:           alignment chevron arch arrow arrowrect   Stat:          gene reduce ste...
Components of the Genomic Grammar  Geom:                 alignment chevron arch arrow arrowrect   Stat:                gen...
Components of the Genomic Grammar  Geom:                alignment chevron arch arrow arrowrect   Stat:               gene ...
Components of the Genomic Grammar  Geom:                alignment chevron arch arrow arrowrect   Stat:               gene ...
Components of the Genomic Grammar  Geom:         alignment chevron arch arrow arrowrect   Stat:        gene reduce steppin...
Components of the Genomic Grammar  Geom:         alignment chevron arch arrow arrowrect   Stat:        gene reduce steppin...
Components of the Genomic Grammar  Geom:                  alignment chevron arch arrow arrowrect   Stat:                 g...
Components of the Genomic Grammar  Geom:                  alignment chevron arch arrow arrowrect   Stat:                 g...
Components of the Genomic Grammar  Geom:                    alignment chevron arch arrow arrowrect   Stat:                ...
Components of the Genomic Grammar  Geom:               alignment chevron arch arrow arrowrect   Stat:              gene re...
Components of the Genomic Grammar  Geom:                      alignment chevron arch arrow arrowrect   Stat:              ...
Summary  • The ggbio package is a toolkit for plotting genomic data and     annotations  • Available as part of the Biocon...
AcknowledgementsTengfei YinDi CookRobert Gentleman Michael Lawrence (Genentech)   A Grammar of Graphics for Genomics   Aug...
Upcoming SlideShare
Loading in...5
×

Michael 2011 japan

2,033

Published on

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
2,033
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
3
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Transcript of "Michael 2011 japan"

  1. 1. A Grammar of Graphics for Genomics The ggbio Package Michael Lawrence Genentech August 29, 2012Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 1 / 18
  2. 2. Outline1 Motivation2 High-level Plots3 Grammar Components Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 2 / 18
  3. 3. Outline1 Motivation2 High-level Plots3 Grammar Components Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 3 / 18
  4. 4. Data on the Genome • Comes in two flavors: • Annotations (genes, TF binding sites, ...) • Experimental measurements (sequence reads) • Both types are tied to genomic coordinates, providing a common axis that permits cross-dataset comparison and inference • Typically stored as a table, with the range as a fundamental variable type, plus metadata Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 4 / 18
  5. 5. Data on the Genome • Comes in two flavors: • Annotations (genes, TF binding sites, ...) • Experimental measurements (sequence reads) • Both types are tied to genomic coordinates, providing a common axis that permits cross-dataset comparison and inference • Typically stored as a table, with the range as a fundamental variable type, plus metadata 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 4 / 18
  6. 6. Data on the Genome • Comes in two flavors: • Annotations (genes, TF binding sites, ...) • Experimental measurements (sequence reads) • Both types are tied to genomic coordinates, providing a common axis that permits cross-dataset comparison and inference • Typically stored as a table, with the range as a fundamental variable type, plus metadata 60 50 40 30 20 10 0 120928000 120930000 120932000 120934000 120936000 120938000 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 4 / 18
  7. 7. Data on the Genome • Comes in two flavors: • Annotations (genes, TF binding sites, ...) • Experimental measurements (sequence reads) • Both types are tied to genomic coordinates, providing a common axis that permits cross-dataset comparison and inference • Typically stored as a table, with the range as a fundamental variable type, plus metadata 60 50 40 30 20 10 0 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 4 / 18
  8. 8. Data on the Genome • Comes in two flavors: • Annotations (genes, TF binding sites, ...) • Experimental measurements (sequence reads) • Both types are tied to genomic coordinates, providing a common axis that permits cross-dataset comparison and inference • Typically stored as a table, with the range as a fundamental variable type, plus metadata seqnames start end strand exon id tx id 10 120927215 120928045 - 129230 14886,14887 10 120928689 120928854 - 129229 14886,14887 10 120931894 120931997 - 129228 14886,14887 10 120933249 120933384 - 129227 14886,14887 10 120933963 120934069 - 129226 14886 10 120933963 120934104 - 119757 14887 10 120936533 120936665 - 119756 14887 10 120936552 120936665 - 129225 14886 10 120938267 120938345 - 129224 14886,14887 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 4 / 18
  9. 9. ChallengesBig data, wide spaces • Need summaries that are efficiently computed, communicate more with less and expose the most interesting aspects of the data • Need different ways of viewing the data, depending on the density and scale, from whole genome to single basepair Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 5 / 18
  10. 10. ChallengesBig data, wide spaces 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb • Need summaries that are efficiently computed, communicate more with less and expose the most interesting aspects of the data • Need different ways of viewing the data, depending on the density and scale, from whole genome to single basepair Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 5 / 18
  11. 11. ChallengesBig data, wide spaces 60 50 40 30 20 10 0 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb • Need summaries that are efficiently computed, communicate more with less and expose the most interesting aspects of the data • Need different ways of viewing the data, depending on the density and scale, from whole genome to single basepair Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 5 / 18
  12. 12. ChallengesBig data, wide spaces 300000 250000 200000 150000 100000 50000 0 0 Mb 50 Mb 100 Mb • Need summaries that are efficiently computed, communicate more with less and expose the most interesting aspects of the data • Need different ways of viewing the data, depending on the density and scale, from whole genome to single basepair Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 5 / 18
  13. 13. Existing ToolsUCSC IGB IGV Circos GViz Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 6 / 18
  14. 14. Existing ToolsUCSC IGB IGV Circos GViz Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 6 / 18
  15. 15. Existing ToolsUCSC IGB IGV Circos GViz Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 6 / 18
  16. 16. Existing ToolsUCSC IGB IGV Circos GViz Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 6 / 18
  17. 17. Existing ToolsUCSC IGB IGV Circos GViz Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 6 / 18
  18. 18. Existing ToolsUCSC IGB IGV Circos GViz Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 6 / 18
  19. 19. Existing ToolsUCSC IGB IGV Circos GVizLimitations • Limited to one type of view (linear or circular) • Not tightly integrated with an analysis environment through standard, abstract data structures (except GViz) • No low-level toolkit for prototyping new types of graphics Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 6 / 18
  20. 20. Grammars of Graphics • A grammar of graphics is a language for expressing plots • Graphics are constructed through the combination of various types of primitives; like legos for graphics • The most prominent grammar was introduced by Wilkinson’s book The Grammar of Graphics • Wilkinson’s grammar was extended by Wickham and the ggplot2 package Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 7 / 18
  21. 21. Grammars of Graphics • A grammar of graphics is a language for expressing plots • Graphics are constructed through the combination of various types of primitives; like legos for graphics • The most prominent grammar was introduced by Wilkinson’s book The Grammar of Graphics • Wilkinson’s grammar was extended by Wickham and the ggplot2 package Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 7 / 18
  22. 22. Grammars of Graphics • A grammar of graphics is a language for expressing plots • Graphics are constructed through the combination of various types of primitives; like legos for graphics • The most prominent grammar was introduced by Wilkinson’s book The Grammar of Graphics • Wilkinson’s grammar was extended by Wickham and the ggplot2 package Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 7 / 18
  23. 23. Grammars of Graphics • A grammar of graphics is a language for expressing plots • Graphics are constructed through the combination of various types of primitives; like legos for graphics • The most prominent grammar was introduced by Wilkinson’s book The Grammar of Graphics • Wilkinson’s grammar was extended by Wickham and the ggplot2 package Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 7 / 18
  24. 24. The ggbio Package • An R/Bioconductor package that extends the Wilkinson/Wickham grammar for applications in genomics • Integrated with Bioconductor • Operates on standard, abstract genomic data structures • Leverages efficient range-based algorithms • Programming interface has two levels of abstraction: autoplot Maps Bioconductor data structures to plots grammar Mix and match to create custom plots Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 8 / 18
  25. 25. Outline1 Motivation2 High-level Plots3 Grammar Components Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 9 / 18
  26. 26. Basic PlotsGene Structures Read Alignments Sequence Multiple Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 10 / 18
  27. 27. Basic PlotsGene Structures Read Alignments Sequence Multiple 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 10 / 18
  28. 28. Basic PlotsGene Structures Read Alignments Sequence Multiple 60 50 40 30 20 10 0 120928000 120930000 120932000 120934000 120936000 120938000 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 10 / 18
  29. 29. Basic PlotsGene Structures Read Alignments Sequence Multiple A C G T 120928700 120928750 120928800 120928850 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 10 / 18
  30. 30. Basic PlotsGene Structures Read Alignments Sequence Multiple CGTAGGAGAATCCGGTGTCCAGTTCGCTGGGCAGACTTCTCCATGTGTTT 120928690 120928700 120928710 120928720 120928730 120928740 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 10 / 18
  31. 31. Basic PlotsGene Structures Read Alignments Sequence Multiple 60 50 40 30 20 10 0 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 10 / 18
  32. 32. Overview PlotsGrand Linear Karyogram Circular Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 11 / 18
  33. 33. Overview PlotsGrand Linear Karyogram Circular Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 11 / 18
  34. 34. Overview PlotsGrand Linear Karyogram Circular 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X seqReg Exon Intron Other 5.0e+07 1.0e+08 1.5e+08 2.0e+08 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 11 / 18
  35. 35. Overview PlotsGrand Linear Karyogram Circular 21 22 1 20 50M 50M 0M 0M 100M 0M 150M 50M 19 200M 0M M 50 0M 0M 18 2 M 50 M 50 0M 10 0M 0M 15 q 17 0M 20 M 50 q q 0M q 0M q 16 M 50M 50 3 q 0M 100M q 100M q 150M rearrangements 15 50M q q interchromosomal q q q q 0M intrachromosomal 0M 50M 100M tumreads 4 q 4 14 q 50M 100M 0M 150M q 6 q q 8 q 100M 0M q 10 13 q q 12 50M 50M 0M q 100M 5 q q 15 0M 10 0M 50 q 12 M 0M 0M 50 M q 10 0M 10 6 M 0 15 50 0M M 11 0M 0M 50 10 M 100M 0M 50M 150M 7 0M 0M 10 50M 100M 100M 50M 0M 9 8 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 11 / 18
  36. 36. Specialized PlotsMismatch summary + VCF Edge-linked Intervals Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 12 / 18
  37. 37. Specialized PlotsMismatch summary + VCF Edge-linked Intervals 15 10 read mismatch A Counts C G T 5 0 snp T referenceA T T A A G A A A G T A C C G T G T G A C A T C A C A G G C T G G G A G C T T G A G A G 25235720 25235725 25235730 25235735 25235740 25235745 25235750 25235755 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 12 / 18
  38. 38. Specialized PlotsMismatch summary + VCF Edge-linked Intervals 1000 800 Expression group 600 GM12878 400 K562 200 0 uc002raw.2 uc010yjh.1 uc002rav.2 uc010yjg.1 uc002rau.2 10930000 10940000 10950000 10960000 10970000 10980000 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 12 / 18
  39. 39. Outline1 Motivation2 High-level Plots3 Grammar Components Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 13 / 18
  40. 40. The Wilkinson/Wickham Grammar of Graphics Geom The shape used for drawing the data Stat Transforms the data before plotting Scale Maps data to geom aesthetics, guides like legends and axes Coord Maps from geom space to device space Facet Small multiples of data subsets (trellis) Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 14 / 18
  41. 41. The Wilkinson/Wickham Grammar of Graphics Geom The shape used for drawing the data Stat Transforms the data before plotting Scale Maps data to geom aesthetics, guides like legends and axes Coord Maps from geom space to device space Facet Small multiples of data subsets (trellis) Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 14 / 18
  42. 42. The Wilkinson/Wickham Grammar of Graphics Geom The shape used for drawing the data Stat Transforms the data before plotting Scale Maps data to geom aesthetics, guides like legends and axes Coord Maps from geom space to device space Facet Small multiples of data subsets (trellis) Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 14 / 18
  43. 43. The Wilkinson/Wickham Grammar of Graphics Geom The shape used for drawing the data Stat Transforms the data before plotting Scale Maps data to geom aesthetics, guides like legends and axes Coord Maps from geom space to device space Facet Small multiples of data subsets (trellis) Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 14 / 18
  44. 44. The Wilkinson/Wickham Grammar of Graphics Geom The shape used for drawing the data Stat Transforms the data before plotting Scale Maps data to geom aesthetics, guides like legends and axes Coord Maps from geom space to device space Facet Small multiples of data subsets (trellis) Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 14 / 18
  45. 45. The Wilkinson/Wickham Grammar of Graphics Geom The shape used for drawing the data Stat Transforms the data before plotting Scale Maps data to geom aesthetics, guides like legends and axes Coord Maps from geom space to device space Facet Small multiples of data subsets (trellis) Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 14 / 18
  46. 46. A Grammar of Graphics for GenomicsExtensions are marked in red geometric object: Y scale: discrete chevron geometric object: rect from stepping color scale: discrete 2 from strand strandstatistical transformation: +stepping − geometric object: 1 alignment 48.245 Mb 48.260 Mb 48.265 Mb 48.270 Mb X scale: sequence 48.250 Mb 48.255 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 15 / 18
  47. 47. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  48. 48. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet NM_014098(GeneID:10935) NM_006793(GeneID:10935) 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  49. 49. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet NM_014098(GeneID:10935) NM_006793(GeneID:10935) 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  50. 50. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet NM_014098(GeneID:10935) NM_006793(GeneID:10935) 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  51. 51. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet NM_014098(GeneID:10935) NM_006793(GeneID:10935) 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  52. 52. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet NM_014098(GeneID:10935) NM_006793(GeneID:10935) 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  53. 53. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet original reduced 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  54. 54. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet stepping 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  55. 55. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet 60 50 40 Coverage 30 20 10 0 120928000 120930000 120932000 120934000 120936000 120938000 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  56. 56. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet 60 50 40 A Counts 30 C G T 20 10 0 120928000 120930000 120932000 120934000 120936000 120938000 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  57. 57. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet 60 50 40 A Counts C 30 G 20 T 10 0 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  58. 58. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet original truncated Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  59. 59. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet 2000 1500 normal 1000 500 0 score 2000 500 1000 1500 1500 novel tumor 1000 FALSE TRUE 500 0 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  60. 60. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet 10 11 6e+05 Coverage 4e+05 2e+05 0e+00 0e+00 5e+07 1e+08 0e+00 5e+07 1e+08 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  61. 61. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet 6e+05 Coverage 4e+05 2e+05 0e+00 10 11 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  62. 62. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet 200 150 value Features 5000 100 0 −5000 50 1 2 3 4 5 6 Samples Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  63. 63. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet chr10 chr10 chr10 Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  64. 64. Components of the Genomic Grammar Geom: alignment chevron arch arrow arrowrect Stat: gene reduce stepping coverage mismatch table Scale: sequence genome fold-change giemsa Coord: truncate-gaps Layout: tracks range-facet chr10 chr10 chr10 60 50 40 A Counts C 30 G 20 T 10 0 120.928 Mb 120.93 Mb 120.932 Mb 120.934 Mb 120.936 Mb 120.938 Mb Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 16 / 18
  65. 65. Summary • The ggbio package is a toolkit for plotting genomic data and annotations • Available as part of the Bioconductor project • Easy to use and flexible enough to handle the diverse use cases encountered in genomics • Useful plots are automatically generated from Bioconductor data structures using reasonable defaults • New types of plots can be constructed from grammar primitives specially designed for genomics Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 17 / 18
  66. 66. AcknowledgementsTengfei YinDi CookRobert Gentleman Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 18 / 18
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×