- Effective Visualization with Cytoscape
- Use Cytoscape with external data
analysis tools
- Cytoscape and The Web
Part 2: Agenda
- This section is a bit conceptual
rather than practical, but it is
very important to understand
before creating actual data
visualizations
Now, you know…
- Basic features of Cytoscape
- How to load network / table data
- Basic Analysis / Filtering
- Layout
- Edit Visual Styles
- Ready to create great visualizations!
I’m not a designer…
- But learning basic principles of design
and data visualization is not so hard
- Creating a 10/10 visualization is difficult,
but 8/10 is a realistic goal for us
- Let’s avoid pitfalls!
What is BAD Visualization?
- Lack of story
- What’s the point?
- Hard to understand
- Too many or too few
visual mappings
- Ugly
Story (or Goal)
- Example:
- I want to show the changing levels
of gene expression for three time
points
- Assign gene expression profile
to the primary visual property in
your visualization
“Cool” does not always
mean “Effective”
- This is what I’ve learned from my past experiences…
Case Study:
3D Visualization
- Background:
- In the late 90s, 3D graphics cards became
cheap enough for entry-level
workstations
- Many researchers made tons of 3D
graphics applications for data
visualization
What was the problem?
… It would be more accurate
to say that visual space has
2.05 dimensions.
Lessons Learned…
- Introduce an additional dimension / complexity to the
visualization only when it is necessary
- Animation, 3D, charts on nodes, etc.
- Use a minimal set of visual channels to make
the visualization understandable
- Define the story (or goal) before creating the actual
visualization
- Understand human perception
Goal of Scientific Data
Visualization
- Help scientists to understand their data sets
- Tell a STORY
- Just follow some simple principles
- Infographics != Data Visualization
- Art/Design : Science
- Infographics 8:2
- Scientific Visualization 1:9
You Don’t Have to be a
Professional Designer
What is Good Visualization?
http://www.visualcomplexity.com/vc/
- One of the unfortunate trends in data-driven
life sciences is that they increasingly use
programmers to abstract data so that
mundane information looks visually appealing -
this is motivated by the desire to appear on the
cover of the glossy life sciences journals.
- Comment from Wired Magazine article “Circle of Life: The
Beautiful New Way to Visualize Biological Data”
http://www.wired.com/wiredscience/2013/11/wired-data-life-martin-krzywinski/
Don’t be Too Cool!
- Cool visualizations are sometimes useless for
scientists
- But still good for journal cover page…
- Balance coolness and effectiveness
- Think about audience (or users if it is
interactive)
Visualizing Heterogeneous
Data In a Diagram is HARD
- Visualization itself is a research area
- You should learn about commonly used
techniques and principles from experts
Targeting the Audience
- Even a meaningless (but cool) visualization is
useful as an eye-catcher or a journal cover
page
- When you need figures for your publication,
minimize the noise in your visualization and
keep it simple
In Cytoscape
- Node Size / Edge Width
- The two strongest visual channels for
mapping your data
- Use these two for your most important
data
- Automatic layout algorithms can be
applied only to a selected group of nodes
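As a rough illustration of mapping a data column to node size, here is a minimal sketch of a linear continuous mapping with clamping. The function name, the sample degrees, and the 20 to 60 size range are all hypothetical; in Cytoscape itself this kind of mapping is configured in the Style panel rather than written by hand.

```python
def linear_map(value, domain, output_range):
    """Linearly map value from domain (lo, hi) to output_range, clamping at both ends."""
    lo, hi = domain
    out_lo, out_hi = output_range
    if hi == lo:
        # Degenerate domain: every value gets the smallest size.
        return out_lo
    t = (value - lo) / (hi - lo)
    t = max(0.0, min(1.0, t))  # clamp values outside the domain
    return out_lo + t * (out_hi - out_lo)


# Hypothetical data: node degree for three genes.
degree = {"geneA": 1, "geneB": 5, "geneC": 9}
lo, hi = min(degree.values()), max(degree.values())

# Map degree to a node size between 20 and 60 (assumed size range).
node_size = {gene: linear_map(d, (lo, hi), (20.0, 60.0)) for gene, d in degree.items()}
```

Clamping keeps outliers from producing nodes that dominate the whole figure, which fits the advice to reserve size for the most important data.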
In Cytoscape
- Node/Edge/Label Color
- Less accurate, but still useful
especially when you map to
continuous values
- Automatic layout algorithms can be
applied only to a selected group of nodes
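A continuous color mapping can be sketched the same way. The example below assumes a hypothetical expression value centered on zero (for instance a log ratio) and a blue-white-red gradient; the `limit` parameter and the color choices are illustrative assumptions, not Cytoscape defaults.

```python
BLUE = (0, 0, 255)
WHITE = (255, 255, 255)
RED = (255, 0, 0)


def lerp_color(c0, c1, t):
    """Interpolate between two RGB tuples; t=0 gives c0, t=1 gives c1."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def expression_color(value, limit=2.0):
    """Map a value in [-limit, limit] to a blue-white-red hex color, clamping outside."""
    t = max(-1.0, min(1.0, value / limit))
    if t < 0:
        rgb = lerp_color(WHITE, BLUE, -t)  # negative values fade toward blue
    else:
        rgb = lerp_color(WHITE, RED, t)    # positive values fade toward red
    return "#{:02X}{:02X}{:02X}".format(*rgb)


# Hypothetical expression values for three genes.
colors = {gene: expression_color(v) for gene, v in {"geneA": -5.0, "geneB": 0.0, "geneC": 2.0}.items()}
```

A diverging gradient with white at the midpoint makes the sign of a value easy to read even though exact magnitudes are harder to judge from color than from size.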
In Cytoscape
- Node/Edge/Label Transparency
- Use it to emphasize important regions
of the network
- Density of connections
- Use edge bundling for dense networks
Avoid Data Overload
- Mapping too many attributes makes your
visualization awful!
- It is hard to see the overall trend if too
many channels are used in an image
External Tools
- Biological data analysis is not simple!
- There is no such thing as a one-size-fits-all tool
- Learn the de facto standard
tools to save time
Network Data Analysis
- Analysis: graph analysis (NetworkX, igraph, Cytoscape); Python (Pandas, NumPy, SciPy); Excel
- Visualization: desktop (Gephi, Cytoscape, matplotlib); web (Cytoscape.js, sigma.js, d3, NVD3, d3.chart, Google Charts)
- Data Storage: graph (Neo4j, GraphX); document (MongoDB); relational (MySQL)
- IPython
- 3rd Party Apps: NetworkAnalyzer
Data Analysis Tools
- Languages / Platforms
- R + Bioconductor
- Python + Pandas
- MATLAB
- Excel
- Graph analysis library
- igraph
- NetworkX
Data Visualization Tools
- Data visualization in web
browsers is becoming more
and more important…
- Cytoscape.js
- sigma.js
- D3.js
- Need more analysis functions
- Cytoscape can perform network
analysis interactively, but does not have
a complete suite of network data analysis
tools
- These days, cutting-edge methods and
algorithms are implemented in Python
- Easy to implement, yet fast
(because of NumPy/SciPy)
- Batch analysis
- Visualize in web browsers
Why Multiple Tools?
- Avoid reinventing the wheel
- igraph and NetworkX have lots of
network analysis functions. Why
should we implement them again?
- Collaboration rather than competition
- General policy for our project
Glue for Applications
- There are two ways to use
external tools with Cytoscape
- Common file formats
- RESTful API for programmatic
access (Ongoing)
- Use popular, standard, widely-used data formats!
- GraphML (Recommended)
- CSV/TSV
- Not a format, but easy to process in
scripting languages and spreadsheets
File-Based Data Exchange
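To make the file-based route concrete, here is a minimal sketch that turns a TSV edge list into GraphML using only the Python standard library. The column names (`source`, `target`, `weight`), the sample rows, and the function name are hypothetical. In practice a graph library such as NetworkX or igraph would normally handle the GraphML export.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical edge list; in practice this would be read from a .tsv file.
TSV_TEXT = "source\ttarget\tweight\nA\tB\t0.9\nB\tC\t0.4\n"

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def tsv_to_graphml(tsv_text):
    """Convert a tab-separated edge list into a GraphML document string."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    # Declare the edge attribute before it is used in <data> elements.
    ET.SubElement(root, "key", {
        "id": "weight", "for": "edge",
        "attr.name": "weight", "attr.type": "double",
    })
    graph = ET.SubElement(root, "graph", id="G", edgedefault="undirected")

    # Every endpoint that appears in the edge list becomes a node.
    node_ids = sorted({r["source"] for r in rows} | {r["target"] for r in rows})
    for node_id in node_ids:
        ET.SubElement(graph, "node", id=node_id)

    for r in rows:
        edge = ET.SubElement(graph, "edge", source=r["source"], target=r["target"])
        data = ET.SubElement(edge, "data", key="weight")
        data.text = r["weight"]

    return ET.tostring(root, encoding="unicode")


graphml_text = tsv_to_graphml(TSV_TEXT)
```

The resulting text can be saved with a `.graphml` extension and imported as a network. The `weight` column then arrives as an edge attribute that can drive edge width.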
Realistic Example
- Prepare data in Python
- Load data from Bioconductor
- Calculate network statistics with
igraph
- Export networks and tables in
GraphML format
- Visualize it in Cytoscape
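The statistics step of this workflow might look like the sketch below. It uses plain Python instead of igraph so that it runs without extra packages, and it assumes a simple undirected graph with no self-loops or duplicate edges. The edge list and function names are hypothetical. The output is a tab-separated node table of the kind that can be imported as node columns.

```python
import csv
import io
from collections import Counter

# Hypothetical undirected network.
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D")]


def degree_table(edges):
    """Count how many edges touch each node."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return dict(degree)


def density(edges, n_nodes):
    """Fraction of possible undirected edges that are present: 2m / (n(n-1))."""
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))


def node_table_tsv(degree):
    """Serialize the per-node statistics as a tab-separated table."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["name", "degree"])
    for name in sorted(degree):
        writer.writerow([name, degree[name]])
    return buf.getvalue()


degrees = degree_table(EDGES)
network_density = density(EDGES, len(degrees))
table = node_table_tsv(degrees)
```

With a graph library the hand-written degree count would be replaced by the library's own functions, but the overall shape stays the same: compute statistics outside Cytoscape, write them to a standard file, then map them to visual properties.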
Coming Soon…
- Programmatic access to
Cytoscape objects and
functions via REST
- /networks/ID/nodes/NODEID
- /apply/layout?network=ID
- We need your opinion!
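Since the API is still being designed, the following only sketches how a client might build URLs for the two resource paths listed above. The base URL is a placeholder assumption, the helper names are hypothetical, and no request is actually sent.

```python
from urllib.parse import quote, urlencode

# Placeholder base URL; the real host, port, and prefix are not specified here.
BASE_URL = "http://localhost:8080"


def node_url(network_id, node_id, base=BASE_URL):
    """Build the URL for one node: /networks/ID/nodes/NODEID."""
    return f"{base}/networks/{quote(str(network_id))}/nodes/{quote(str(node_id))}"


def layout_url(network_id, base=BASE_URL):
    """Build the URL for applying a layout: /apply/layout?network=ID."""
    return f"{base}/apply/layout?{urlencode({'network': network_id})}"


# Hypothetical identifiers.
example_node = node_url(52, 101)
example_layout = layout_url(52)
```

Addressing networks and nodes as URL paths would let any language with an HTTP client (Python, R, JavaScript) drive Cytoscape without a language-specific plugin.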