Nils Gehlenborg, PhD
http://gehlenborglab.org
Visualization of
3D Genome Data
HARVARD MEDICAL SCHOOL
DEPARTMENT OF BIOMEDICAL INFORMATICS
@ngehlenborg
DNA in the Nucleus
https://upload.wikimedia.org/wikipedia/commons/7/7a/Basketball.png
http://simplemaps.com/resources/svg-us
Dekker et al., Nature, 2017
Why is 3D Genome Data interesting?
Role of 3D DNA Structure & Dynamics
Cell Division
Gene Regulation
Enhancers: spatial proximity to control gene expression
Clustering of chromatin near lamina: gene silencing
GWAS: many variants found in non-coding regions
Structural Variation
Spatial arrangement influences structural variation
Dekker et al., Nature, 2017
How do we measure chromosomal
conformation?
De Wit and De Laat, Genes & Development, 2012
Chromosome Conformation Capture
De Wit and De Laat, Genes & Development, 2012
Rao et al., Cell, 2014
Hi-C Protocol
Genome-wide
Contact Matrix
How big is a Hi-C Interaction Matrix?
Typical resolution today:
Reads mapped into 1,000 bp bins
→ ~3,000,000 x 3,000,000 matrix
for a whole genome
Printed at 300 DPI
~250 x 250 meters
~830 x 830 feet
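The numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes a genome length of about 3 billion bp (an assumption, not stated on the slide) and prints one matrix cell per pixel. The `bin_contacts` helper is a hypothetical illustration of how read pairs fall into bins, not the Hi-C pipeline itself.

```python
from collections import Counter

GENOME_LENGTH_BP = 3_000_000_000  # approximate human genome length (assumption)
BIN_SIZE_BP = 1_000               # bin size stated on the slide
DPI = 300                         # print resolution stated on the slide


def matrix_side(genome_length_bp: int, bin_size_bp: int) -> int:
    """Number of bins along one axis of the contact matrix."""
    return -(-genome_length_bp // bin_size_bp)  # ceiling division


def printed_side_meters(pixels: int, dpi: int) -> float:
    """Edge length in meters when each matrix cell is printed as one pixel."""
    inches = pixels / dpi
    return inches * 0.0254


def bin_contacts(read_pairs, bin_size_bp):
    """Count read pairs per (bin_i, bin_j) cell, keeping only the upper triangle."""
    counts = Counter()
    for pos_a, pos_b in read_pairs:
        i, j = sorted((pos_a // bin_size_bp, pos_b // bin_size_bp))
        counts[(i, j)] += 1
    return counts


side = matrix_side(GENOME_LENGTH_BP, BIN_SIZE_BP)
meters = printed_side_meters(side, DPI)
feet = meters / 0.3048
print(f"{side:,} x {side:,} bins, about {meters:.0f} m ({feet:.0f} ft) per side")

# Toy read pairs given as (position of end A, position of end B) in bp.
toy_pairs = [(150, 2_900), (2_100, 420), (5_500, 5_900)]
print(bin_contacts(toy_pairs, BIN_SIZE_BP))
```

The result is roughly 254 m per side, which the slide rounds to ~250 meters.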
By Sam valadi - https://www.flickr.com/photos/132084522@N05/17178926219/in/photostream/
We need a bigger screen!
http://higlass.io/app/?config=dyE970c4TH21onnRvT1PmQ
What are some visualization
challenges involved?
1. View interactions at different scales

From genome to individual bins
2. Compare interactions across many conditions

Two or more conditions
3. View and compare features

Within and across maps
4. Navigating an enormous data space

With few well-known landmarks
5. Do all of this in a web browser

Interaction and low latency
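One common way to address challenge 1 is a resolution pyramid: the matrix is stored at several zoom levels, and the viewer fetches the level that matches the current zoom. The sketch below is a minimal illustration that assumes 2 x 2 sum pooling between levels. The function names are hypothetical and this is not presented as the method used by any particular tool.

```python
def coarsen(matrix):
    """Halve the resolution by summing each 2 x 2 block of cells."""
    n = len(matrix)
    half = (n + 1) // 2
    coarse = [[0] * half for _ in range(half)]
    for i in range(n):
        for j in range(n):
            coarse[i // 2][j // 2] += matrix[i][j]
    return coarse


def build_pyramid(matrix):
    """Return matrices from full resolution down to a single cell."""
    levels = [matrix]
    while len(levels[-1]) > 1:
        levels.append(coarsen(levels[-1]))
    return levels


# Toy symmetric contact matrix with four bins.
toy = [
    [9, 4, 1, 0],
    [4, 8, 3, 1],
    [1, 3, 7, 5],
    [0, 1, 5, 9],
]
pyramid = build_pyramid(toy)
for level, m in enumerate(pyramid):
    print(f"level {level}: {len(m)} x {len(m)}")
```

Each level holds a quarter of the cells of the previous one, so a browser only ever needs to load a screen-sized part of one level.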
How can we visualize the data?
Network Visualization
1. (3M x 3M)/2 interactions
2. Weight for each interaction
3. Constraint: nodes in sequence order
Matrix
Node-Link Diagram
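To make the network view concrete, the sketch below counts the possible bin pairs and converts a toy contact matrix into a weighted edge list whose nodes stay in sequence order. The helper names are hypothetical. The count includes the diagonal, so it comes out slightly above (3M x 3M)/2.

```python
def count_pairwise_interactions(n_bins: int) -> int:
    """Unordered bin pairs, including self-pairs on the diagonal."""
    return n_bins * (n_bins + 1) // 2


def to_edge_list(matrix):
    """List (i, j, weight) for non-zero upper-triangle cells; i <= j keeps sequence order."""
    n = len(matrix)
    return [
        (i, j, matrix[i][j])
        for i in range(n)
        for j in range(i, n)
        if matrix[i][j] != 0
    ]


total = count_pairwise_interactions(3_000_000)
print(f"{total:.2e} potential interactions")

# Toy symmetric contact matrix with three bins.
toy = [
    [5, 2, 0],
    [2, 6, 1],
    [0, 1, 4],
]
edges = to_edge_list(toy)
print(edges)
```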
Genome Interaction Data Visualization
Scale
1. Global Interactions (whole chromosome or genome)
2. Local Interactions (immediate feature neighborhood)
3. Individual Features
Encoding
1. Heatmap
2. Node-Link Diagram (here: Arc Diagram)
3. 3D: Illustration of concepts and models!
Reviewed in Yardımcı & Noble, Genome Biology, 2017, http://aidenlab.org/juicebox/
Global Interactions
Juicebox
HEATMAP
Caveat

only qualitative interpretation of color map possible
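A heatmap turns each contact count into a color. The sketch below assumes a logarithmic mapping to gray levels, which is a common choice for heavy-tailed counts but is an assumption here, not something taken from the slides. It shows why exact values are hard to read back: each tenfold change in count moves the color by the same step.

```python
import math


def log_normalize(count, max_count, pseudocount=1.0):
    """Map a contact count to [0, 1] on a log scale (0 = lowest, 1 = highest)."""
    return math.log(count + pseudocount) / math.log(max_count + pseudocount)


def to_gray(value):
    """Turn a normalized value into an 8-bit gray level (255 = white = no contacts)."""
    return round(255 * (1.0 - value))


counts = [0, 9, 99, 999]
max_count = max(counts)
grays = [to_gray(log_normalize(c, max_count)) for c in counts]
for c, g in zip(counts, grays):
    print(f"count {c:>3} -> gray level {g}")
```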
Wong 2010, Nature Methods & https://en.wikipedia.org/w/index.php?curid=45522095
Mini Excursion: Color
Color is a relative medium!
DOI 10.1101/121889, http://higlass.io, Kerpedjiev, Abdennur, Lekschas …, Mirny, Park, Gehlenborg
Global Interactions
HiGlass
HEATMAP
Caveat

only qualitative interpretation of color map possible
Reviewed in Yardımcı & Noble, Genome Biology, 2017, http://epigenomegateway.wustl.edu/
Global Interactions
Washington University
Epigenome Browser
ARC DIAGRAM
Caveats

line crossings, limited dynamic range, complex zooming
http://rondo.ws, O’Donoghue Lab
Global Interactions
Rondo
ARC DIAGRAM
Caveats

line crossings, limited dynamic range, colors hard to map
http://promoter.bx.psu.edu/hi-c/, Reviewed in Yardımcı & Noble, Genome Biology, 2017
Local Interactions
3D Genome Browser
HEATSTRIP
Caveat

height of triangle grows with
distance of interaction
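The caveat follows from the geometry of a heatmap rotated by 45 degrees. In the sketch below (a hypothetical helper, assuming the horizontal axis is scaled to genomic coordinates), a contact between bins i and j is drawn at the midpoint of the two bins, at a height of half their distance.

```python
def heatstrip_position(i: int, j: int, bin_size_bp: int = 1):
    """Return (x, y) of the contact between bins i and j in a rotated triangle view.

    x is the genomic midpoint of the two bins, y is half their genomic distance.
    """
    lo, hi = sorted((i, j))
    x = (lo + hi) / 2 * bin_size_bp
    y = (hi - lo) / 2 * bin_size_bp
    return x, y


near = heatstrip_position(10, 14)    # short-range contact
far = heatstrip_position(10, 110)    # long-range contact
print(near, far)
```

A contact spanning 100 bins sits 25 times higher than one spanning 4 bins, so a strip of fixed height can only show interactions up to a limited distance.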
Reviewed in Yardımcı & Noble, Genome Biology, 2017, http://epigenomegateway.wustl.edu/
Local Interactions
Washington University
Epigenome Browser
ARC DIAGRAM
Caveats

zooming is problematic, no context
Interaction & Navigation
http://higlass.io
http://higlass.io/app/?config=TKXaqsSIRvGEcw2dAUQvxg
2D Maps
Build a Hi-C Interaction Map Viewer
1D Tracks
Build a Genome Browser
Prioritization
Orient Users in the Visualization
Linked Views
Support Overview and Detail
Linked Views++
Support Exploration and Analysis
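The idea behind linked views can be sketched with a simple listener pattern: when the detail view is panned or zoomed, the overview highlights the visible region. The classes below are hypothetical and only illustrate the concept. They are not the HiGlass API.

```python
class OverviewView:
    """Shows the whole genome and highlights the region visible in the detail view."""

    def __init__(self, genome_length_bp):
        self.genome_length_bp = genome_length_bp
        self.highlight = None  # (start, end) as fractions of the overview width

    def show_viewport(self, start_bp, end_bp):
        self.highlight = (start_bp / self.genome_length_bp,
                          end_bp / self.genome_length_bp)


class DetailView:
    """Shows a zoomed-in region and notifies listeners on every navigation step."""

    def __init__(self):
        self.domain = None
        self._listeners = []

    def on_navigate(self, callback):
        self._listeners.append(callback)

    def pan_zoom(self, start_bp, end_bp):
        self.domain = (start_bp, end_bp)
        for callback in self._listeners:
            callback(start_bp, end_bp)


overview = OverviewView(genome_length_bp=3_000_000_000)
detail = DetailView()
detail.on_navigate(overview.show_viewport)  # link the two views
detail.pan_zoom(750_000_000, 1_500_000_000)
print(overview.highlight)
```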
Example 1a
Schwarzer et al., Nature, 2017
Example 1b
Schwarzer et al., Nature, 2017
Example 2
Forcato et al., Nature Methods, 2017
Many pattern instances but sparse distribution!
How can we
explore and compare
many local patterns in
this very large matrix?
HiPiler
http://hipiler.higlass.io
Challenges
• Detected by algorithms
• Occur frequently
• "Noisy" results
Goals
• Quality assessment
• Pattern stratification
• Pattern correlation
Points Blocks
• How does a specific pattern or the average pattern look?
• How variable and noisy are detected patterns?
• Are there subgroups among the patterns?
• How are patterns related to other data attributes?
• What does a pattern's neighborhood look like?
TECHNIQUES?
• Pan & Zoom

Kerpedjiev et al.: HiGlass
• Lenses / Multifocus

Rao and Card: Table Lens

Elmqvist et al.: Melange
• Abstraction / Aggregation

Dunne et al.: Motif Simplification

Elmqvist et al.: ZAME
• Small Multiples

Bach et al.: Multipiles
Cut the Matrix into Pieces!
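Cutting the matrix into pieces means extracting a small square "snippet" around each detected pattern. The sketch below is a minimal illustration with a hypothetical helper. It assumes a square matrix stored as nested lists and pads with zeros at the matrix border.

```python
def cut_snippet(matrix, center_row, center_col, half_size):
    """Return a square snippet of side 2 * half_size + 1 centered on one cell.

    Cells that fall outside the matrix are filled with 0.
    """
    n = len(matrix)
    snippet = []
    for r in range(center_row - half_size, center_row + half_size + 1):
        row = []
        for c in range(center_col - half_size, center_col + half_size + 1):
            inside = 0 <= r < n and 0 <= c < n
            row.append(matrix[r][c] if inside else 0)
        snippet.append(row)
    return snippet


# Toy 5 x 5 matrix whose cell values encode their own position (row * 10 + column).
toy = [[r * 10 + c for c in range(5)] for r in range(5)]
inner = cut_snippet(toy, center_row=2, center_col=3, half_size=1)
corner = cut_snippet(toy, center_row=0, center_col=0, half_size=1)
print(inner)
print(corner)
```

Once every pattern is a snippet of the same size, snippets can be laid out side by side or stacked regardless of where they came from in the matrix.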
HiPiler
1. FILTERING
Assess quality & separate signal from noise
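Filtering can be pictured as scoring every snippet and keeping the ones above a threshold. The score below (center value relative to the surrounding background) is a hypothetical measure chosen for illustration. It is not claimed to be one of the measures HiPiler provides.

```python
def center_enrichment(snippet):
    """Center cell divided by the mean of all other cells (plus 1 to avoid dividing by zero)."""
    size = len(snippet)
    mid = size // 2
    others = [
        snippet[r][c]
        for r in range(size)
        for c in range(size)
        if (r, c) != (mid, mid)
    ]
    background = sum(others) / len(others)
    return snippet[mid][mid] / (background + 1.0)


def filter_snippets(snippets, min_score):
    """Keep snippets scoring at least min_score, best first."""
    scored = sorted(
        ((center_enrichment(s), s) for s in snippets),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [s for score, s in scored if score >= min_score]


clear = [[1, 1, 1], [1, 18, 1], [1, 1, 1]]   # strong central signal
noisy = [[4, 5, 3], [6, 5, 4], [5, 3, 2]]    # no distinct center
kept = filter_snippets([noisy, clear], min_score=2.0)
print(kept)
```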
2. AGGREGATE
Stratify patterns and assess pattern variability
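Aggregating a pile of snippets can be sketched as a cell-wise summary: the mean shows the average pattern, and the standard deviation shows how variable the pattern is across the pile. The function below is a hypothetical illustration using only the Python standard library. It assumes all snippets have the same size.

```python
from statistics import mean, pstdev


def aggregate_pile(snippets):
    """Return (average, variability) matrices computed cell by cell over a pile."""
    size = len(snippets[0])
    average = [
        [mean([s[r][c] for s in snippets]) for c in range(size)]
        for r in range(size)
    ]
    variability = [
        [pstdev([s[r][c] for s in snippets]) for c in range(size)]
        for r in range(size)
    ]
    return average, variability


pile = [
    [[1, 0], [0, 3]],
    [[3, 0], [0, 5]],
]
average, variability = aggregate_pile(pile)
print(average)
print(variability)
```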
3. CONTEXT
Correlate patterns with one another
& with other pattern types
Pile Inspection
Attribute Correlations
Multidimensional Clustering
Dataset Comparison
HiGlass
Investigate local and global interactions
Small number of features at a time
Strong focus on local context
HiPiler
Investigate features across the whole map
View hundreds or thousands of features at a time
Weak support for context
Dynamic Aggregatable Insets
Lekschas et al., Work in Progress
Open Challenges
Integration with imaging data
Multi-contact data
Single-cell data
Visualization of temporal dynamics
Acknowledgements
Peter Kerpedjiev, PhD
HARVARD MEDICAL SCHOOL
Fritz Lekschas, MSc
HARVARD SCHOOL OF ENGINEERING & APPLIED SCIENCES
Funding provided by
NIH COMMON FUND (U01 CA200059)
NIH NATIONAL HUMAN GENOME RESEARCH INSTITUTE (R00 HG007583)
Acknowledgements
Peter Kerpedjiev
Fritz Lekschas
Nezar Abdennur
Benjamin Bach
Chuck McCallum
Kasper Dinkla
Hendrik Strobelt
Jacob M Luber
Scott B Ouellette
Alaleh Azhir
Nikhil Kumar
Jeewon Hwang
Danielle Nguyen
Burak H Alver
Job Dekker
Hanspeter Pfister
Leonid A Mirny
Peter J Park
Tools
HIGLASS
Demo Site: http://higlass.io
Code: https://github.com/hms-dbmi/higlass
Docker: https://hub.docker.com/r/gehlenborglab/higlass/
Preprint: https://doi.org/10.1101/121889
HIPILER
Demo Site: http://hipiler.higlass.io
Code: https://github.com/flekschas/hipiler
Preprint: https://doi.org/10.1101/123588
Paper: http://doi.org/10.1109/TVCG.2017.2745978, IEEE TVCG (2018)