Cytoscape Tutorial Session 2 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
2.
- Effective Visualization with Cytoscape
- Use Cytoscape with external data
analysis tools
- Cytoscape and The Web
Part 2: Agenda
3.
- This section is a bit conceptual
rather than practical, but it is
very important to understand
before creating actual data
visualizations
Part 2: Agenda
5.
Now, you know…
- Basic features of Cytoscape
- How to load network / table data
- Basic Analysis / Filtering
- Layout
- Edit Visual Styles
- Ready to create great visualizations!
10.
I’m not a designer…
- But learning basic principles of design
and data visualization is not so hard
- Creating a 10/10 visualization is difficult,
but 8/10 is a realistic goal for us
- Let’s avoid pitfalls!
11.
What is BAD Visualization?
- Lack of story
- What’s the point?
- Hard to understand
- Too many or too few
visual mappings
- Ugly
12.
Story (or Goal)
- Example:
- I want to show the changing levels
of gene expression for three time
points
- Assign gene expression profile
to the primary visual property in
your visualization
13.
[Figure: example gene network; node labels include GAL1, GAL4, GAL80, GCN4, and RAP1]
Map gene expression
values to color
Avoid using more colors in other
components (edge/label)
If necessary, map other data
into non-overlapping visual properties
(edge score to width)
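The mapping described on this slide can be sketched in a few lines of Python. This is only an illustration of a continuous mapping (expression value to node color, edge score to edge width), not how Cytoscape implements its Visual Styles. The expression values, the blue-to-red gradient, and the width range are all made-up assumptions.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def normalize(value, vmin, vmax):
    """Scale value into [0, 1], clamping anything outside [vmin, vmax]."""
    t = (value - vmin) / (vmax - vmin)
    return max(0.0, min(1.0, t))


def value_to_color(value, vmin, vmax, low=(0, 0, 255), high=(255, 0, 0)):
    """Map a number onto a two-color gradient and return '#RRGGBB'."""
    t = normalize(value, vmin, vmax)
    rgb = [round(lerp(lo, hi, t)) for lo, hi in zip(low, high)]
    return "#{:02X}{:02X}{:02X}".format(*rgb)


def score_to_width(score, smin, smax, wmin=1.0, wmax=8.0):
    """Map an edge score onto an edge width (assumed range: 1 to 8)."""
    return lerp(wmin, wmax, normalize(score, smin, smax))


# Hypothetical expression values for illustration only (not real measurements).
expression = {"GAL1": 2.0, "GAL4": -2.0, "GAL80": 0.0}
node_colors = {gene: value_to_color(v, -2.0, 2.0) for gene, v in expression.items()}

# Edge score goes to a different, non-overlapping channel: width.
edge_width = score_to_width(0.5, 0.0, 1.0)
```

Color carries exactly one attribute here, and the second attribute goes to width, so the two mappings never compete for the same visual channel.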
14.
“Cool” does not always
mean “Effective”
- This is what I’ve learned from my past experiences…
15.
Case Study:
3D Visualization
- Background:
- In the late 1990s, 3D graphics cards became
cheap enough for entry-level
workstations
- Many researchers made tons of 3D
graphics applications for data
visualization
22.
What was the problem?
… It would be more accurate
to say that visual space has
2.05 dimensions.
23.
Lessons Learned…
- Introduce an additional dimension / complexity to the
visualization only when it is necessary
- Animation, 3D, charts on nodes, etc.
- Use a minimal set of visual channels to make
the visualization understandable
- Define the story (or goal) before creating the actual
visualization
- Understand human perception
24.
Goal of Scientific Data
Visualization
- Help scientists to understand their data sets
- Tell a STORY
25.
- Just follow some simple principles
- Info-Graphics != Data Visualization
- Art/Design : Science
- Infographics 8:2
- Scientific Visualization 1:9
You Don’t Have to be a
Professional Designer
26.
What is Good Visualization?
http://www.visualcomplexity.com/vc/
27.
- One of the unfortunate trends in data-driven
life sciences is that they increasingly use
programmers to abstract data so that
mundane information looks visually appealing -
this is motivated by the desire to appear on the
cover of the glossy life sciences journals.
- Comment from Wired Magazine article “Circle of Life: The
Beautiful New Way to Visualize Biological Data”
http://www.wired.com/wiredscience/2013/11/wired-data-life-martin-krzywinski/
28.
An Extreme Example
(I’m not saying this is bad, but…)
30.
Don’t be Too Cool!
- Cool visualizations are sometimes useless for
scientists
- But still good for journal cover page…
- Balance coolness and effectiveness
- Think about audience (or users if it is
interactive)
31.
Visualizing Heterogeneous
Data In a Diagram is HARD
- Visualization itself is a research area
- You should learn about commonly used
techniques and principles from experts
32.
Human Interactome data from BioGRID visualized by Cytoscape
33.
Large Scale Visualizations
are Pointless in Many Cases
39.
Targeting the Audience
- Even a meaningless (but cool) visualization is
useful as an eye-catcher or journal cover
page
- When you need figures for your publication,
minimize the noise in your visualization and
keep it simple
40.
Data Visualization Tools
http://selection.datavisualization.ch/
65.
In Cytoscape
- Node Size / Edge Width
- Two strongest visual channels for
mapping your data
- Use these two for your important
data
- Automatic layout algorithms can be
applied only to a selected group of nodes
70.
In Cytoscape
- Node/Edge/Label Color
- Less accurate, but still useful
especially when you map to
continuous values
- Automatic layout algorithms can be
applied only to a selected group of nodes
75.
In Cytoscape
- Node/Edge/Label Transparency
- Use it to emphasize important regions
of the network
- Density of connections
- Use edge bundling for dense networks
78.
Avoid Data Overload
- Mapping too many attributes makes your
visualization awful!
- It is hard to see the overall trend if too
many channels are used in an image
81.
# of Visual
Properties is Limited
- Use them effectively
- Don’t use too many
in the same view
82.
[Figure: small example network; node labels include STMN1, SMARCA4, TUBB, HTT, and PPARG]
Start from Scratch
- If you are not sure whether you
need a decoration,
remove it
- Example: Node border,
edge arrow
- Even labels are not always
required!
86.
External Tools
- Biological data analysis is not simple!
- There is no one-size-fits-all tool
- Learn the de facto standard
tools to save time
87.
Network Data Analysis
- Analysis
  - Graph Analysis: NetworkX, igraph, Cytoscape
  - Python: Pandas, NumPy, SciPy
  - Excel
  - IPython
  - 3rd Party Apps: NetworkAnalyzer
- Visualization
  - Desktop: Gephi, Cytoscape, matplotlib
  - Web: Cytoscape.js, sigma.js, d3, NVD3, d3.chart, Google Charts
- Data Storage
  - Graph: Neo4j, GraphX
  - Document: MongoDB
  - Relational: MySQL
89.
Data Analysis Tools
Data Preparation / Analysis / Visualization
90.
Data Analysis Tools
- Languages / Platforms
- R + Bioconductor
- Python + Pandas
- MATLAB
- Excel
- Graph analysis library
- igraph
- NetworkX
91.
Data Visualization Tools
- Data visualization in web
browsers is becoming more
and more important…
- Cytoscape.js
- sigma.js
- D3.js
92.
- Need more analysis functions
- Cytoscape can perform network
analysis interactively, but does not have
a complete suite of network data analysis
tools
- These days, cutting-edge methods and
algorithms are implemented in Python
- Easy to implement, yet fast
(because of NumPy/SciPy)
- Batch analysis
- Visualize in web browsers
Why Multiple Tools?
93.
- Avoid reinventing the wheel
- igraph and NetworkX have a lot of
network analysis functions. Why
should we implement them again?
- Collaboration rather than competition
- General policy for our project
Why Multiple Tools?
94.
Glue for Applications
- There are two ways to use
external tools with Cytoscape
- Common file formats
- RESTful API for programmatic
access (Ongoing)
95.
- Use popular, standard, widely used data formats!
- GraphML (Recommended)
- CSV/TSV
- Not a network-specific format, but easy to process in
scripting languages and spreadsheets
File-Based Data Exchange
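As a rough sketch of the CSV/TSV route, the snippet below writes an edge table as tab-separated text with Python's standard `csv` module and reads it back. The column names (`source`, `target`, `score`) and the edge values are assumptions chosen for illustration; any spreadsheet or table importer that accepts delimited text should be able to read such a file.

```python
import csv
import io

# Hypothetical edge list: (source, target, score). Not real data.
edges = [("GAL4", "GAL1", 0.9), ("GAL80", "GAL4", 0.7)]


def write_edge_table(rows, stream, delimiter="\t"):
    """Write a header line plus one line per edge to a text stream."""
    writer = csv.writer(stream, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["source", "target", "score"])
    writer.writerows(rows)


# Write to an in-memory buffer; pass an open file object to write to disk.
buffer = io.StringIO()
write_edge_table(edges, buffer)
tsv_text = buffer.getvalue()

# Reading the table back shows how easily scripts can consume it.
parsed = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
```

Note that values come back as strings after parsing, so numeric columns such as `score` need an explicit conversion before analysis.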
96.
Realistic Example
- Prepare data in Python
- Load data from Bioconductor
- Calculate network statistics with
igraph
- Export networks and tables in
GraphML format
- Visualize it in Cytoscape
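The statistics and export steps of this workflow can be sketched as follows. To stay runnable without extra packages, this version computes node degree in plain Python and writes GraphML with `xml.etree.ElementTree` instead of calling igraph; in practice igraph or NetworkX would do both jobs. The edges are hypothetical, and the key id `d0` and graph id `G` are arbitrary choices.

```python
import xml.etree.ElementTree as ET
from collections import Counter

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

# Hypothetical undirected edges for illustration only.
edges = [("GAL4", "GAL1"), ("GAL4", "GAL7"), ("GAL80", "GAL4")]

# Network statistic: degree (number of edges touching each node).
degree = Counter()
for source, target in edges:
    degree[source] += 1
    degree[target] += 1


def tag(name):
    """Return a namespace-qualified GraphML tag name."""
    return f"{{{GRAPHML_NS}}}{name}"


def build_graphml(edge_list, node_degree):
    """Serialize nodes (with a 'degree' attribute) and edges as GraphML text."""
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(tag("graphml"))
    # Declare the node attribute before it is used in <data> elements.
    ET.SubElement(root, tag("key"), {
        "id": "d0", "for": "node", "attr.name": "degree", "attr.type": "int",
    })
    graph = ET.SubElement(root, tag("graph"), {"id": "G", "edgedefault": "undirected"})
    for name in sorted(node_degree):
        node = ET.SubElement(graph, tag("node"), {"id": name})
        data = ET.SubElement(node, tag("data"), {"key": "d0"})
        data.text = str(node_degree[name])
    for i, (source, target) in enumerate(edge_list):
        ET.SubElement(graph, tag("edge"), {"id": f"e{i}", "source": source, "target": target})
    return ET.tostring(root, encoding="unicode")


graphml_text = build_graphml(edges, degree)
```

Saving `graphml_text` to a `.graphml` file gives a network whose `degree` column travels with the nodes, so it is available for visual mappings after import.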
98.
Coming Soon…
- Programmatic access to
Cytoscape objects and
functions via REST
- /networks/ID/nodes/NODEID
- /apply/layout?network=ID
- We need your opinion!
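To make the two URL patterns above concrete, here is a small sketch that only builds the request URLs; it does not contact any server. The API was still in progress at the time of this talk, so the base URL (`http://localhost:8080`) and the helper names are purely hypothetical, and the final paths may differ.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL; the slides do not specify a host or port.
BASE_URL = "http://localhost:8080"


def node_url(network_id, node_id, base=BASE_URL):
    """Build a URL following the /networks/ID/nodes/NODEID pattern."""
    return f"{base}/networks/{quote(str(network_id))}/nodes/{quote(str(node_id))}"


def layout_url(network_id, base=BASE_URL):
    """Build a URL following the /apply/layout?network=ID pattern."""
    return f"{base}/apply/layout?{urlencode({'network': network_id})}"


# Example IDs are made up for illustration.
example_node = node_url(52, 101)
example_layout = layout_url(52)
```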
99.
Communication Bus
NDEx (DB)
Browser
Cytoscape Desktop
101.
- Prepare/integrate/analyze data
with R/Python or traditional
desktop applications
- Visualize & publish it as web apps
Trends in Data Visualization