Keiichiro Ono
UCSD Trey Ideker Lab
Cytoscape Core Team
Lab Meeting
Aug 4, 2015
Building Reproducible Network Data Analysis / Visualization Workflows
Problems We Are Trying to Solve
- Complex software stack for data analysis
- Setting up an environment for data analysis is not trivial, and it is time-consuming
- Python 3.x or 2.x/NumPy/SciPy/Cython Modules
- R/Bioconductor/packages
- OS version, etc.
- Automation
- Point-and-Click operations are not reproducible
- Applying different layouts to 100 networks by hand is possible, but ridiculous
- Sharing recipes (= common workflows) is hard
- Integration with external computing resources
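The automation problem above (many networks, many layouts) can be sketched as a loop over REST calls instead of point-and-click operations. This is only a minimal sketch under assumptions not stated on the slide: that cyREST is reachable at its default local address `http://localhost:1234/v1`, that a layout is applied with a GET to `/apply/layouts/{layout}/{network SUID}`, and that the SUIDs shown are placeholders. The helper names `layout_url`, `plan_layout_requests`, and `run_plan` are hypothetical.

```python
from itertools import product
from urllib.request import urlopen

# Assumption: cyREST listens on its default local port and serves the v1 API.
BASE_URL = "http://localhost:1234/v1"


def layout_url(layout_name, network_suid, base_url=BASE_URL):
    """Build the URL that asks cyREST to apply one layout to one network."""
    return f"{base_url}/apply/layouts/{layout_name}/{network_suid}"


def plan_layout_requests(network_suids, layout_names, base_url=BASE_URL):
    """Return one URL per (network, layout) pair, in a fixed order."""
    return [
        layout_url(layout, suid, base_url)
        for suid, layout in product(network_suids, layout_names)
    ]


def run_plan(urls, timeout=30):
    """Send each request; needs a running Cytoscape with cyREST installed."""
    for url in urls:
        with urlopen(url, timeout=timeout) as response:
            response.read()


# Hypothetical SUIDs; real ones would come from GET {BASE_URL}/networks.
plan = plan_layout_requests([52, 104], ["circular", "force-directed"])
for url in plan:
    print(url)
```

Building the plan separately from sending it keeps the recipe inspectable and shareable: the list of URLs can be reviewed or committed before `run_plan(plan)` is called against a live Cytoscape session.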
Goal: Reproducible, Scalable Dry Experiments
- Docker
- Data analysis environment in a portable container
- GitHub
- Source code sharing
- Jupyter Notebook
- Your electronic lab notebook
- cyREST
- RESTful API module for Cytoscape
Goal: Reproducible, Scalable Dry Experiments
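To make the cyREST item more concrete, here is a small sketch of what talking to a RESTful Cytoscape module from a notebook could look like using only the Python standard library. It assumes, beyond what the slide says, that cyREST accepts a Cytoscape.js-style JSON document via POST to `/v1/networks` on the default local port; the function names are hypothetical, and the request is only prepared, not sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:1234/v1"  # assumed default cyREST address


def to_cytoscape_json(name, edges):
    """Convert (source, target) pairs into a Cytoscape.js-style dict."""
    node_ids = sorted({node for edge in edges for node in edge})
    return {
        "data": {"name": name},
        "elements": {
            "nodes": [{"data": {"id": n}} for n in node_ids],
            "edges": [{"data": {"source": s, "target": t}} for s, t in edges],
        },
    }


def build_create_request(network, base_url=BASE_URL):
    """Prepare (but do not send) a POST that would create the network."""
    body = json.dumps(network).encode("utf-8")
    return Request(
        f"{base_url}/networks",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


network = to_cytoscape_json("toy network", [("a", "b"), ("b", "c")])
request = build_create_request(network)
print(request.get_method(), request.full_url)
```

With Cytoscape running, passing `request` to `urllib.request.urlopen` would submit it. The py2cytoscape package listed under Resources wraps calls of this kind, so in practice the raw HTTP layer may not be needed.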
Data Preparation -> Analysis -> Visualization
Scenario 1: Everything on your Workstation
Notebook Server
Your Jupyter Notebook
Scenario 2: Workstation + Cloud
Notebook Server
Your Jupyter Notebook
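One way to let the same notebook serve both scenarios is to avoid hard-coding where Cytoscape runs. The sketch below is an assumption about how that could be done, not something shown on the slides: the environment variable names `CYREST_HOST` and `CYREST_PORT` are hypothetical, the host name is a placeholder, and port 1234 is assumed to be the cyREST default.

```python
import os

DEFAULT_HOST = "localhost"
DEFAULT_PORT = 1234  # assumed cyREST default port


def cyrest_base_url(env=None):
    """Build the cyREST base URL from (hypothetical) environment variables."""
    env = os.environ if env is None else env
    host = env.get("CYREST_HOST", DEFAULT_HOST)
    port = int(env.get("CYREST_PORT", DEFAULT_PORT))
    return f"http://{host}:{port}/v1"


# Scenario 1: notebook server and Cytoscape share one workstation.
local_url = cyrest_base_url({})
# Scenario 2: the two run on different machines (placeholder host name).
remote_url = cyrest_base_url({"CYREST_HOST": "workstation.example.org"})
print(local_url)
print(remote_url)
```

Keeping the address in configuration means the notebook itself stays unchanged when the notebook server moves between a workstation and a cloud instance.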
Example: Community-Detection + Edge-Weighted Layout
Source Code: bit.ly/1P4LUFU
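The linked source is not reproduced here. As a rough illustration of the idea in the slide title, the sketch below shows one possible way to derive edge weights from community membership so that a weight-aware layout can group members of the same community. It is not the author's code: the membership table is a hand-written stand-in for the output of a community detection step, and the function name and weight values are hypothetical.

```python
def weight_edges(edges, membership, intra=10.0, inter=1.0):
    """Give edges inside a community a larger weight than edges between."""
    weighted = []
    for source, target in edges:
        same = membership[source] == membership[target]
        weighted.append((source, target, intra if same else inter))
    return weighted


# Two triangles joined by a single bridge edge (c, d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
# Hypothetical output of a community detection step (not shown here).
membership = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

weighted = weight_edges(edges, membership)
for source, target, weight in weighted:
    print(source, target, weight)
```

The resulting weights would be stored as an edge attribute and passed to a layout that takes edge weights into account. Whether a larger weight pulls nodes closer or pushes them apart depends on the layout algorithm and its settings, so the `intra` and `inter` values may need to be swapped.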
Demo
TODO
- Integration with Cyberinfrastructure (CI)
- R Wrapper
- https://github.com/tmuetze/Bioconductor_RCy3_the_new_RCytoscape
- More realistic workflows / pipelines
Resources
- cyREST
- http://apps.cytoscape.org/apps/cyrest
- py2cytoscape
- https://pypi.python.org/pypi/py2cytoscape
- RCy3
- https://github.com/tmuetze/Bioconductor_RCy3_the_new_RCytoscape
