Keiichiro ONO 大野圭一朗
1st Bioinformatics Visualization Seminar (12/26/2018)
Building and maintaining visualization applications
in bioinformatics, in practice
Design, implementation, and maintenance of data
visualization applications for bioinformatics
About me…
Del Mar, CA
‣Keiichiro ONO
‣ Bioinformatics software engineer @ UC San Diego, Trey Ideker Lab
‣ Cytoscape Consortium member since 2005
‣ National Resource for Network Biology (NRNB)
‣ Data Visualization Japan
Our Projects
Cytoscape
Desktop
Open source platform for network analysis
and visualization
(2003 - )
Cytoscape.js:
Library for building
network visualization
web applications
Current main project:
From omics-data to
hierarchy to phenotype
A gene ontology inferred from molecular networks. Dutkowski J, Kramer M, Surma MA, Balakrishnan R, Cherry JM, Krogan NJ, Ideker T. Nature Biotechnology. 2013 Jan;31(1):38-45.
NeXO Web: the NeXO ontology database and visualization platform. Dutkowski J, Ono K, Kramer M, Yu M, Pratt D, Demchak B, Ideker T. Nucleic Acids Res. 2014 Jan;42(Database issue).
Using deep learning to model the hierarchical structure and function of a cell. Ma J, Yu MK, Fong S, Ono K, Sage E, Demchak B, Sharan R, Ideker T. Nat Methods. 2018 Mar 5. doi: 10.1038/nmeth.4627. PMID: 29505029
HiView:
Universal Browser for hierarchical data
Overview:
Design, implementation, and maintenance of
(long-lived) data visualization applications for
bioinformatics
15 years ago…
Cytoscape V1
2003
No Gmail
No Google Maps
2003 (from wikipedia)
Cytoscape V1
2003
• Java-Swing GUI application
• Chosen for multi-platform support
• Support for plugins
• Basic architecture was modular and “modern”
• Slow…
• Designed to handle a few hundred nodes and edges
2018:
Cytoscape V3.7
2018
• Still a Java-Swing GUI application
• With a lot of optimization and improvements
• Basic architecture is still the same - Visual Style, Filter, Plugins/Apps
• Polyglot - Core is still Java, but it provides a REST API for multi-language support
• MUCH faster
• Designed to visualize tens of thousands of nodes and edges
Hardest project so far:
Migration from v2.0 to 3.0
= Breaking API change
• Breaking API changes
• OK for us, but NOT for 3rd party developers!
• There were more than a few hundred 3rd party apps for v2.x
• Held many tutorial sessions and workshops for Plugin (a.k.a. App) developers
• With a lot of optimization and improvements
• Introduced OSGi - Modular Java
• Took us almost 5 years to complete the migration…
Migration from 2.x to 3.x
Lessons learned:
A breaking API change is extremely
expensive if the application has a
large 3rd party developer community
2019:
Cytoscape 4.0 or Cytoscape Cloud?
Extensibility in
Cytoscape
2003 → 2018:
Plugins → Apps → Scripting
→ REST API → Web Integration
Plugins / Apps
• Extensions / Add-ons written in Java
• Still acceptable, but a huge overhead
for bioinformaticians
Scripting
• Support for languages implemented on
top of JVM
• JRuby
• Jython
• JavaScript (Rhino)
• Not so successful due to the limitations
of the JVM scripting languages
REST API
• Language-agnostic
• Enables users to automate
their workflows
https://cytoscape.org/cytoscape-tutorials/presentations/advanced-automation-2018-sib.html#/
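As a minimal sketch of this kind of automation, the Python snippet below builds a small network in Cytoscape.js-style JSON and shows how it could be posted to a locally running Cytoscape over its REST API. The base URL `http://localhost:1234/v1` and the `networks` endpoint are assumptions about a default CyREST setup, and the helper names `build_cyjs_network` and `post_network` are hypothetical; check them against the API documentation for your Cytoscape version.

```python
import json
import urllib.request

# Assumed default CyREST address; adjust to your local setup.
BASE_URL = "http://localhost:1234/v1"


def build_cyjs_network(name, edges):
    """Build a Cytoscape.js-style JSON dict from (source, target) pairs."""
    node_ids = sorted({n for edge in edges for n in edge})
    return {
        "data": {"name": name},
        "elements": {
            "nodes": [{"data": {"id": n, "name": n}} for n in node_ids],
            "edges": [
                {"data": {"source": s, "target": t}} for s, t in edges
            ],
        },
    }


def post_network(network, base_url=BASE_URL):
    """POST the network to Cytoscape (requires Cytoscape to be running)."""
    request = urllib.request.Request(
        f"{base_url}/networks",
        data=json.dumps(network).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder node names, not real data.
network = build_cyjs_network("demo", [("A", "B"), ("B", "C")])
print(json.dumps(network, indent=2))
# With Cytoscape running locally, uncomment to send the network:
# print(post_network(network))
```

Because the client only speaks HTTP and JSON, the same workflow could be driven from R, JavaScript, or a notebook without touching the Java core.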
Web Integration
• Implement UI with web
technologies
• Use external web browser
as a new user interface
Why should we use
web technologies to build
complex data viz apps for biology?
• Standard platform for data visualization = Web browser
• Toolchain / Frameworks
• Documents
• Developers!
Toolchain / Frameworks
• The most popular modern frameworks for advanced data visualization are available
for web browsers
• Browsers are ubiquitous - from phones to workstations
Documents
• Books
• Papers
• Reference Implementation
• Source code
Developers
• It is hard to find good developers for (relatively) old
technology…
• Learn Java Swing in 2019…?
• It is important to attract young developers
• Otherwise, the community eventually dies
…and data analysis tools are
always available in R/Python
(This statement may be too strong, especially for Julia users, but in reality it is true from the sponsor’s point of view.)
• Workbench for data analysis and
visualization
• Notebook applications
• Jupyter Notebook
• R Markdown
• IDE-like applications
• Jupyter Lab
These are the tools for modern data
analysis/viz application developers
But we cannot trash the existing
application ecosystem!
Web Apps
Notebooks
Don’t trash, connect!
Pros and Cons of Loosely-Coupled
Multiple Client Application Approach
• Pros
• No rewriting
• From Python to Java, etc.
• Huge ecosystem
• Access to de-facto standard
tools
• No reinventing-the-wheel
• Cons
• UX
• Window management
• Modality
• Environment management
• Dependencies
There is no perfect solution,
but a polyglot, heterogeneous
environment is the standard today
(and we should support it!)
Problems in bioinformatics data
viz application development
• Need for custom data visualization apps
• BI tools, such as Tableau/Spotfire, are not always good enough for
researchers’ use cases
• Limited resources
• Even the biggest (academic) lab in the world is much smaller than the tech
giants…
Case Study:
Expanding
The Cytoscape Ecosystem
Or, implementing Cytoscape 4.0 without
implementing a Cytoscape 4.0 Java API
Goal:
Create a loosely-coupled collection
of tools for network biology
Web Apps for new experimental visualizations
Notebooks
Existing software suite
Extension for network visualization
1. New widgets/components for Jupyter Notebook / Lab
2. New browser for hierarchical data
3. Connect them to existing apps
Notebook Widgets
CyJupyter:
Cytoscape.js widget for
Jupyter Notebook
• Standard notebook widget
• Simple network rendering only
CyJupyterLab:
Cytoscape.js extension for
JupyterLab
• JupyterLab Extension
• Pure JavaScript (TypeScript)
module
• React-based
Why two versions?
• The JupyterLab notebook panel is not 100% compatible with Jupyter Notebook
• e.g. local search
• Not all Jupyter Notebook users are migrating to JupyterLab
• Some prefer the simplicity of the Notebook
JupyterLab
• IDE-like UX
• Better internal architecture
• Flexible API
• Best for new projects
Jupyter Notebook
• Simple, intuitive UI
• The de facto standard
• Limited interactivity
Which one should we use???
Jupyter Notebook /
Lab Quick Demo:
Designing New Data
Visualization for Biology
Basic Tools:
D3/Cyjs + React
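For hierarchical data of the kind HiView browses, D3's hierarchy layouts read a nested object whose child nodes sit under a `children` key by default. The sketch below shows one way to prepare such input in Python from a list of parent-child pairs before handing it to a web front end. It assumes the hierarchy is a strict tree with no cycles (an ontology where a term has several parents would need extra handling), and the function name `build_tree` and the term names are hypothetical placeholders.

```python
import json
from collections import defaultdict


def build_tree(root, parent_child_pairs):
    """Convert (parent, child) pairs into a nested {"name", "children"} dict.

    Assumes a tree: every term has one parent and there are no cycles.
    """
    children = defaultdict(list)
    for parent, child in parent_child_pairs:
        children[parent].append(child)

    def expand(term):
        node = {"name": term}
        kids = [expand(child) for child in children.get(term, [])]
        if kids:
            # Leaves carry no "children" key at all.
            node["children"] = kids
        return node

    return expand(root)


# Placeholder terms, not real ontology data.
pairs = [
    ("root", "term_a"),
    ("root", "term_b"),
    ("term_a", "term_a1"),
]
tree = build_tree("root", pairs)
print(json.dumps(tree, indent=2))
```

The printed JSON can be served to the browser and passed to `d3.hierarchy`, which keeps data preparation in the analysis language and rendering in the web stack.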
Observable:
Good tool for sketching
HiView 2.0
[Figure: HiView 2.0 screenshot. Visible labels include S:1, S:42, S:1320, nuclear part, nucleoplasm, cytosol, regionalization, voltage-gated potassium channel activity, and leak channel activity.]
HiView: Demo
Best practices for building complex
data viz application for biology
1. Avoid Not Invented Here Syndrome!
There is no need to
implement a bar chart library
from scratch using D3!!!
Use a library with a higher-level API
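To illustrate what a higher-level API can look like, the sketch below describes a bar chart as a declarative Vega-Lite specification instead of drawing rectangles and axes by hand. Vega-Lite is used here only as one example of such a library and is not named in the talk; the helper `bar_chart_spec`, the field names, and the sample records are hypothetical.

```python
import json


def bar_chart_spec(records, category_field, value_field):
    """Return a Vega-Lite spec (as a dict) for a simple bar chart."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": records},
        "mark": "bar",
        "encoding": {
            "x": {"field": category_field, "type": "nominal"},
            "y": {"field": value_field, "type": "quantitative"},
        },
    }


# Made-up sample records for illustration only.
records = [
    {"group": "A", "count": 3},
    {"group": "B", "count": 7},
    {"group": "C", "count": 5},
]
spec = bar_chart_spec(records, "group", "count")
print(json.dumps(spec, indent=2))
```

The whole chart is a dozen lines of data description; scales, axes, and layout are left to the library, which is the point of not reinventing them.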
2. Watch trends in frontend frameworks
Technology
cytoscape.js
State of JavaScript
https://2018.stateofjs.com/introduction
3. Don’t dig too deep:
Try to use a library with a high-level API
Example:
Uber Data Viz Stack
Summary:
• Maintaining a long-lived client application is HARD
• Always accept changes
• Technology is always changing
• Don’t stick to a language / platform
• Support popular / standard platforms

1st Bioinformatics Data Visualization Seminar @ RIKEN