
Visualization of 3D Genome Data

VIZBI 2018 talk on visualization of 3D Genome Data.

Published in: Science


  1. 1. Nils Gehlenborg, PhD http://gehlenborglab.org Visualization of 3D Genome Data HARVARD MEDICAL SCHOOL DEPARTMENT OF BIOMEDICAL INFORMATICS @ngehlenborg
  2. 2. DNA in the Nucleus
  3. 3. https://upload.wikimedia.org/wikipedia/commons/7/7a/Basketball.png
  4. 4. https://upload.wikimedia.org/wikipedia/commons/7/7a/Basketball.png http://simplemaps.com/resources/svg-us
  5. 5. Dekker et al., Nature, 2017
  6. 6. Why is 3D Genome Data interesting?
  7. 7. Role of 3D DNA Structure & Dynamics Cell Division Gene Regulation Structural Variation Dekker et al., Nature, 2017
  8. 8. Cell Division. Gene Regulation: enhancers (spatial proximity to control gene expression); clustering of chromatin near lamina (gene silencing); GWAS (many variants found in non-coding regions). Structural Variation. Dekker et al., Nature, 2017
  9. 9. Cell Division Gene Regulation Structural Variation Spatial arrangement influences structural variation Dekker et al., Nature, 2017
  10. 10. Role of 3D DNA Structure & Dynamics Cell Division Gene Regulation Structural Variation Dekker et al., Nature, 2017
  11. 11. How do we measure chromosomal conformation?
  12. 12. De Wit and De Laat, Genes & Development, 2012 Chromosome Conformation Capture
  13. 13. De Wit and De Laat, Genes & Development, 2012
  14. 14. De Wit and De Laat, Genes & Development, 2012
  15. 15. Rao et al., Cell, 2014 Hi-C Protocol
  16. 16. Genome-wide Contact Matrix
  17. 17. How big is a Hi-C Interaction Matrix?
  18. 18. How big is a Hi-C Interaction Matrix? Typical resolution today: reads mapped into 1,000 bp bins → ~3,000,000 x 3,000,000 matrix for a whole genome. Printed at 300 DPI: ~250 x 250 meters (~830 x 830 feet)
  19. 19. 3,000,000 x 3,000,000 pixels Printed at 300 DPI ~250 x 250 meters ~830 x 830 feet By Sam valadi - https://www.flickr.com/photos/132084522@N05/17178926219/in/photostream/
  20. 20. 3,000,000 x 3,000,000 pixels Printed at 300 DPI ~250 x 250 meters ~830 x 830 feet We need a bigger screen!
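
The size estimate on slide 18 can be checked with a few lines of arithmetic. This is a minimal sketch: the genome length of roughly 3 billion base pairs is an assumption (the slides state only the bin size and the resulting matrix size), and the function names are made up for this illustration.

```python
# Back-of-the-envelope check of the matrix size quoted on slide 18.
GENOME_BP = 3_000_000_000  # ASSUMPTION: ~3 Gbp genome (not stated on the slide)
BIN_BP = 1_000             # bin size from the slide
DPI = 300                  # print resolution from the slide
METERS_PER_INCH = 0.0254
METERS_PER_FOOT = 0.3048


def matrix_side_bins(genome_bp: int = GENOME_BP, bin_bp: int = BIN_BP) -> int:
    """Number of bins along one side of the contact matrix (ceiling division)."""
    return -(-genome_bp // bin_bp)


def printed_side_meters(pixels: int, dpi: int = DPI) -> float:
    """Side length in meters when one matrix cell is printed as one dot."""
    return pixels / dpi * METERS_PER_INCH


side = matrix_side_bins()
meters = printed_side_meters(side)
feet = meters / METERS_PER_FOOT
print(f"{side:,} x {side:,} bins")
print(f"about {meters:.0f} m ({feet:.0f} ft) per side at {DPI} DPI")
```

Under these assumptions the result lands close to the rounded figures on the slide (~250 meters, ~830 feet per side).
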
  21. 21. http://higlass.io/app/?config=dyE970c4TH21onnRvT1PmQ
  22. 22. What are some visualization challenges involved?
  23. 23. 1. View interactions at different scales: from genome to individual bins
 2. Compare interactions across many conditions: two or more conditions
 3. View and compare features: within and across maps
 4. Navigating an enormous data space: with few well known landmarks
 5. Do all of this in a web browser: interaction and low latency
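
One common way to approach challenge 1 on slide 23 (viewing interactions from the whole genome down to individual bins) is to precompute coarser copies of the matrix. The sketch below is an illustration only: the slides do not describe how any particular tool does this, and `coarsen`, `pyramid`, and the toy matrix are hypothetical names and data.

```python
# Illustrative multi-resolution "pyramid" of a small contact matrix.
# ASSUMPTION: this shows the general idea of aggregation by zoom level,
# not the implementation of any tool named in the talk.

def coarsen(matrix, factor=2):
    """Sum factor x factor blocks of a square matrix into one coarser cell."""
    n = len(matrix)
    m = -(-n // factor)  # ceiling division keeps a partial last block
    out = [[0] * m for _ in range(m)]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            out[i // factor][j // factor] += value
    return out


def pyramid(matrix, factor=2):
    """Return all levels from full resolution down to a single cell."""
    if factor < 2:
        raise ValueError("factor must be at least 2")
    levels = [matrix]
    while len(levels[-1]) > 1:
        levels.append(coarsen(levels[-1], factor))
    return levels


# Toy symmetric contact matrix with 4 bins (made-up counts).
contacts = [
    [8, 4, 1, 0],
    [4, 8, 3, 1],
    [1, 3, 8, 5],
    [0, 1, 5, 8],
]
levels = pyramid(contacts)
for level in levels:
    print(len(level), "x", len(level), level)
```

A viewer can then pick the level whose cell size best matches the current zoom, so that only a bounded number of cells is ever drawn.
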
  24. 24. How can we visualize the data?
  25. 25. Network Visualization 1. (3M x 3M)/2 interactions 2. Weight for each interaction 3. Constraint: nodes in sequence order
  26. 26. Matrix, Node-Link Diagram. Network Visualization 1. (3M x 3M)/2 interactions 2. Weight for each interaction 3. Constraint: nodes in sequence order
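
The network view on slides 25 and 26 treats each bin as a node and each contact count as a weighted edge. A minimal sketch of that mapping follows. The function names and the toy matrix are hypothetical, and counting the diagonal as part of the upper triangle is an assumption (the slide rounds the total to (3M x 3M)/2).

```python
# Contact matrix as a weighted network with nodes kept in sequence order.
# ASSUMPTION: the matrix is symmetric, so only the upper triangle
# (including the diagonal) needs to be stored.

def pair_count(n_bins: int) -> int:
    """Number of unordered bin pairs, diagonal included."""
    return n_bins * (n_bins + 1) // 2


def upper_triangle_edges(matrix):
    """List (i, j, weight) for i <= j, skipping pairs with zero contacts."""
    edges = []
    n = len(matrix)
    for i in range(n):
        for j in range(i, n):
            weight = matrix[i][j]
            if weight:
                edges.append((i, j, weight))
    return edges


# Toy symmetric contact matrix with 4 bins (made-up counts).
contacts = [
    [8, 4, 1, 0],
    [4, 8, 3, 1],
    [1, 3, 8, 5],
    [0, 1, 5, 8],
]
edges = upper_triangle_edges(contacts)
print(edges)
print(f"{pair_count(3_000_000):,} possible pairs for 3,000,000 bins")
```

For 3,000,000 bins this gives about 4.5 x 10^12 possible pairs, which matches the order of magnitude on the slide.
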
  27. 27. Genome Interaction Data Visualization Scale 1. Global Interactions (whole chromosome or genome) 2. Local Interactions (immediate feature neighborhood) 3. Individual Features Encoding 1. Heatmap 2. Node-Link Diagram (here: Arc Diagram) 3. 3D
  28. 28. Genome Interaction Data Visualization Scale 1. Global Interactions (whole chromosome or genome) 2. Local Interactions (immediate feature neighborhood) 3. Individual Features Encoding 1. Heatmap 2. Node-Link Diagram (here: Arc Diagram) 3. 3D Illustration of concepts and models!
  29. 29. Genome Interaction Data Visualization
  30. 30. Reviewed in Yardımcı & Noble, Genome Biology, 2017, http://aidenlab.org/juicebox/ Global Interactions Juicebox HEATMAP Caveat: only qualitative interpretation of color map possible
  31. 31. Wong 2010, Nature Methods & https://en.wikipedia.org/w/index.php?curid=45522095 Mini Excursion: Color Color is a relative medium!
  32. 32. Reviewed in Yardımcı & Noble, Genome Biology, 2017, http://aidenlab.org/juicebox/ Global Interactions Juicebox HEATMAP Caveat: only qualitative interpretation of color map possible
  33. 33. DOI 10.1101/121889, http://higlass.io, Kerpedjiev, Abdennur, Lekschas …, Mirny, Park, Gehlenborg Global Interactions HiGlass HEATMAP Caveat: only qualitative interpretation of color map possible
  34. 34. Reviewed in Yardımcı & Noble, Genome Biology, 2017, http://epigenomegateway.wustl.edu/ Global Interactions Washington University Epigenome Browser ARC DIAGRAM Caveats: line crossings, limited dynamic range, zooming complex
  35. 35. http://rondo.ws, O’Donoghue Lab Global Interactions Rondo ARC DIAGRAM Caveats: line crossings, limited dynamic range
  36. 36. http://rondo.ws, O’Donoghue Lab Global Interactions Rondo ARC DIAGRAM Caveats: line crossings, colors hard to map
  37. 37. http://promoter.bx.psu.edu/hi-c/, Reviewed in Yardımcı & Noble, Genome Biology, 2017 Local Interactions 3D Genome Browser HEATSTRIP Caveat: height of triangle grows with distance of interaction
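
The heatstrip caveat on slide 37 follows from the geometry of the layout. The sketch below assumes that a heatstrip is the upper triangle of the matrix rotated by 45 degrees so that the diagonal lies along the genome axis. The function name and the exact scaling are assumptions made for illustration, not taken from the 3D Genome Browser.

```python
# Where a contact between bins i and j lands in a rotated "heatstrip" layout.
# ASSUMPTION: 45-degree rotation with the matrix diagonal on the genome axis;
# the scaling below is one plausible convention, not a tool's actual code.

def heatstrip_position(i: int, j: int, bin_bp: int = 1_000):
    """Return (x, height) in base pairs for the contact between bins i and j."""
    if j < i:
        i, j = j, i
    x_bp = (i + j + 1) * bin_bp / 2   # midpoint between the two bin centers
    height_bp = (j - i) * bin_bp / 2  # half the genomic distance of the pair
    return x_bp, height_bp


near = heatstrip_position(10, 10)  # a bin's contact with itself sits on the axis
far = heatstrip_position(10, 30)   # a contact 20 bins apart sits higher up
print(near, far)
```

Under this layout a strip of fixed height H can only show contacts up to a genomic distance of 2H, which is why longer-range interactions require a taller triangle.
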
  38. 38. Reviewed in Yardımcı & Noble, Genome Biology, 2017, http://epigenomegateway.wustl.edu/ Local Interactions Washington University Epigenome Browser ARC DIAGRAM Caveats: zooming is problematic, no context
  39. 39. Interaction & Navigation http://higlass.io
  40. 40. http://higlass.io/app/?config=TKXaqsSIRvGEcw2dAUQvxg 2D Maps Build a Hi-C Interaction Map Viewer
  41. 41. http://higlass.io/app/?config=TKXaqsSIRvGEcw2dAUQvxg 2D Maps
  42. 42. 1D Tracks Build a Genome Browser
  43. 43. 1D Tracks
  44. 44. Prioritization Orient Users in the Visualization
  45. 45. Prioritization
  46. 46. Prioritization
  47. 47. Prioritization
  48. 48. Linked Views Support Overview and Detail
  49. 49. Linked Views
  50. 50. Linked Views++ Support Exploration and Analysis
  51. 51. Linked Views++
  52. 52. Example 1a Schwarzer et al. Nature, 2017
  53. 53. Example 1a
  54. 54. Example 1b Schwarzer et al. Nature, 2017
  55. 55. Example 1b
  56. 56. Example 2 Forcato et al. Nature Methods, 2017
  57. 57. Example 2
  58. 58. Many pattern instances but sparse distribution! How can we explore and compare many local patterns in this very large matrix?
  59. 59. HiPiler http://hipiler.higlass.io
  60. 60. Challenges • Detected by algorithms • Occur frequently • "Noisy" results Goals • Quality assessment • Pattern stratification • Pattern correlation Points Blocks
  61. 61. • What do specific patterns or the average pattern look like? • How variable and noisy are detected patterns? • Are there subgroups among the patterns? • How are patterns related to other data attributes? • What does a pattern's neighborhood look like?
  62. 62. TECHNIQUES? • Pan & Zoom (Kerpedjiev et al.: HiGlass) • Lenses / Multifocus (Rao and Card: Table Lens; Elmqvist et al.: Melange) • Abstraction / Aggregation (Dunne et al.: Motif Simplification; Elmqvist et al.: ZAME) • Small Multiples (Bach et al.: Multipiles)
  63. 63. Cut the Matrix into Pieces!
  64. 64. Cut the Matrix into Pieces!
  65. 65. Cut the Matrix into Pieces!
  66. 66. Cut the Matrix into Pieces!
  67. 67. Cut the Matrix into Pieces!
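
The idea on slides 63 to 67, cutting the matrix into pieces, can be sketched as extracting a small fixed-size window around each detected pattern. This is an illustration under stated assumptions: `cut_snippet`, the window size, the zero padding at the matrix edge, and the toy data are all hypothetical and not taken from HiPiler.

```python
# Cut small square "snippets" out of a large matrix around feature locations.
# ASSUMPTION: features are given as (row, col) bin coordinates; cells that
# fall outside the matrix are padded with a fill value.

def cut_snippet(matrix, row, col, half=1, fill=0):
    """Return the (2*half+1) x (2*half+1) window centered on (row, col)."""
    n = len(matrix)
    snippet = []
    for r in range(row - half, row + half + 1):
        line = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < n and 0 <= c < len(matrix[r])
            line.append(matrix[r][c] if inside else fill)
        snippet.append(line)
    return snippet


# Toy symmetric contact matrix with 4 bins (made-up counts).
contacts = [
    [8, 4, 1, 0],
    [4, 8, 3, 1],
    [1, 3, 8, 5],
    [0, 1, 5, 8],
]
features = [(1, 2), (0, 3)]  # hypothetical pattern locations
snippets = [cut_snippet(contacts, r, c) for r, c in features]
for snippet in snippets:
    print(snippet)
```

Because every snippet has the same shape, hundreds or thousands of them can be laid out, sorted, and compared side by side regardless of where they sit in the full matrix.
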
  68. 68. HiPiler
  69. 69. HiPiler
  70. 70. HiPiler
  71. 71. HiPiler
  72. 72. HiPiler
  73. 73. HiPiler
  74. 74. 1. FILTERING Assess quality & separate signal from noise
  75. 75. 1. FILTERING
  76. 76. 1. FILTERING
  77. 77. 1. FILTERING
  78. 78. 1. FILTERING
  79. 79. 1. FILTERING
  80. 80. 2. AGGREGATE Stratify patterns and assess pattern variability
  81. 81. 2. AGGREGATE
  82. 82. 2. AGGREGATE
  83. 83. 2. AGGREGATE
  84. 84. 2. AGGREGATE
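
The aggregation step on slides 80 to 84 (stratify patterns and assess pattern variability) can be illustrated by stacking equally sized snippets into a pile and summarizing each cell. This is a minimal sketch: `pile_summary` and the two toy snippets are hypothetical, and the choice of mean and population standard deviation is an assumption rather than a description of HiPiler.

```python
# Summarize a pile of equally sized snippets cell by cell.
# ASSUMPTION: mean shows the average pattern, standard deviation shows
# how variable the pattern is at each cell.
from statistics import mean, pstdev


def pile_summary(snippets):
    """Return (cell-wise mean, cell-wise population standard deviation)."""
    rows, cols = len(snippets[0]), len(snippets[0][0])
    average = [
        [mean([s[r][c] for s in snippets]) for c in range(cols)]
        for r in range(rows)
    ]
    spread = [
        [pstdev([s[r][c] for s in snippets]) for c in range(cols)]
        for r in range(rows)
    ]
    return average, spread


# Two made-up 2 x 2 snippets forming one pile.
pile = [
    [[1, 2], [3, 4]],
    [[3, 2], [5, 8]],
]
average, spread = pile_summary(pile)
print(average)
print(spread)
```

Cells with a high spread point to noisy or heterogeneous piles, which may be worth splitting into subgroups.
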
  85. 85. 3. CONTEXT Correlate patterns with one another & with other pattern types
  86. 86. 3. CONTEXT
  87. 87. 3. CONTEXT
  88. 88. 3. CONTEXT
  89. 89. Pile Inspection Attribute correlations Multidimensional Clustering Dataset Comparison
  90. 90. HiGlass HiPiler Investigate local and global interactions Small number of features at a time Strong focus on local context Investigate features across the whole map View hundreds or thousands of features at a time Weak support for context
  91. 91. HiGlass HiPiler Investigate local and global interactions Small number of features at a time Strong focus on local context Investigate features across the whole map View hundreds or thousands of features at a time Weak support for context ?
  92. 92. HiGlass HiPiler Investigate local and global interactions Small number of features at a time Strong focus on local context Investigate features across the whole map View hundreds or thousands of features at a time Weak support for context Dynamic Aggregatable Insets
  93. 93. Dynamic Aggregatable Insets Lekschas et al., Work in Progress
  94. 94. Open Challenges
  95. 95. Open Challenges: integration with imaging data, multi-contact data, single-cell data, visualization of temporal dynamics

  96. 96. Acknowledgements Peter Kerpedjiev, PhD Fritz Lekschas, MSc HARVARD MEDICAL SCHOOL HARVARD SCHOOL OF ENGINEERING & APPLIED SCIENCES Funding provided by NIH COMMON FUND (U01 CA200059) NIH NATIONAL HUMAN GENOME RESEARCH INSTITUTE (R00 HG007583)
  97. 97. Acknowledgements Peter Kerpedjiev Fritz Lekschas Nezar Abdennur Benjamin Bach Chuck McCallum Kasper Dinkla Hendrik Strobelt Jacob M Luber Scott B Ouellette Alaleh Ahzir Nikhil Kumar Jeewon Hwang Danielle Nguyen Burak H Alver Job Dekker Hanspeter Pfister Leonid A Mirny Peter J Park
  98. 98. Nils Gehlenborg, PhD http://gehlenborglab.org Visualization of 3D Genome Data HARVARD MEDICAL SCHOOL DEPARTMENT OF BIOMEDICAL INFORMATICS @ngehlenborg
  99. 99. Tools. HIGLASS: Demo Site: http://higlass.io Code: https://github.com/hms-dbmi/higlass Docker: https://hub.docker.com/r/gehlenborglab/higlass/ Preprint: https://doi.org/10.1101/121889. HIPILER: Demo Site: http://hipiler.higlass.io Code: https://github.com/flekschas/hipiler Preprint: https://doi.org/10.1101/123588 Paper: http://doi.org/10.1109/TVCG.2017.2745978 IEEE TVCG (2018)
