Quick Introduction to Cytoscape for Undergraduates

Quick introduction to Cytoscape & tutorial for undergraduate students. 5/12/2014 @ UCSD

Published in: Science, Education, Technology

  1. 1. Biological Network Visualization with Cytoscape. Keiichiro Ono, Cytoscape Core Developer Team, UC San Diego, Trey Ideker Lab / National Resource for Network Biology. 5/12/2014, Workshop for Undergraduate Bioinformatics Club at UCSD
  2. 2. Sample Data Files: http://cl.ly/VTJs
  3. 3. Made with Cytoscape
  4. 4. Keiichiro Ono, Cytoscape Core Developer. Area of Interest: Data Integration & Visualization
  5. 5. Keiichiro Ono Computer Science Biology
  6. 6. Keiichiro Ono Computer Science
  7. 7. Keiichiro Ono: Data Visualization; Programming: Java, JavaScript, Python, R, etc.; Software Engineering; Web Development; Practitioner > Researcher
  8. 8. Outline • Part 1: Introduction to Cytoscape • What is Cytoscape? • Basic Features • Part 2: Hands-On Tutorial • Visualize gene expression values and network • Import data from public databases (optional)
  9. 9. What is Cytoscape?
  10. 10. An Open Source Platform for Biological Network Data Integration, Analysis and Visualization Cytoscape
  11. 11. Cytoscape - Open Source (LGPL) - Free for both commercial and academic use - Developed and maintained by universities, companies, and research institutions - De facto standard software in the biological network research community - Expandable by Apps - This is why Cytoscape is a Platform, not a simple desktop application
  12. 12. Protein-Protein Interactions (network figure; node labels: EP300 PPARG SMARCD3 STMN1 SMARCA4 OPTN ATP6V1C1 PSMD1 HTT PRNP HNRNPUL1 CCDC88A CLU HSP90AB1 SMARCD3 MAP4K4 MIF4GD USP11 MARCH6 TUBB EDF1 CHD8)
  13. 13. Directed Network KEGG Pathway (TCA Cycle) visualized by Cytoscape KGMLReader
  14. 14. Large-Scale Network Analysis and Visualization Human Interactome data from BioGRID visualized by Cytoscape
  15. 15. …But why do we need such a tool for biology?
  16. 16. C. elegans Interactome from BioGRID Database ?
  17. 17. Biological Networks - Do not tell us anything by themselves - Just a big hairball…
  18. 18. Module 1 Module 2
  19. 19. In other words…
  20. 20. Module 1 Need a tool to extract meaningful biological modules
  21. 21. Basic Use Case
  22. 22. Networks Public Interaction Databases
  23. 23. List of Genes
  24. 24. Other Data
  25. 25. Network Data Analysis Analysis Graph Analysis NetworkX igraph Cytoscape Python Pandas NumPy SciPy Excel Visualization Desktop Gephi Cytoscape matplotlib Web Cytoscape.js sigma.js d3 NVD3 d3.chart Google Charts Data Storage Graph Neo4j GraphX Document MongoDB Relational MySQL IPython 3rd Party Apps NetworkAnalyzer
  26. 26. Network Data Analysis Analysis Visualization Desktop Gephi Cytoscape matplotlib Web Cytoscape.js sigma.js d3 NVD3 d3.chart Google Charts Data Storage
  27. 27. Network Data Analysis Analysis Graph Analysis NetworkX igraph Cytoscape Python Pandas NumPy SciPy Excel Visualization IPython 3rd Party Apps NetworkAnalyzer
  28. 28. 3 Basic Steps of Data Visualization with Cytoscape
  29. 29. Data Integration
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
  <!-- Created by igraph -->
  <key id="degree" for="node" attr.name="degree" attr.type="double"/>
  <key id="betweenness" for="node" attr.name="betweenness" attr.type="double"/>
  <graph id="G" edgedefault="directed">
    <node id="n0">
      <data key="degree">79</data>
      <data key="betweenness">0</data>
    </node>
    <node id="n1">
      <data key="degree">9</data>
      <data key="betweenness">167</data>
    </node>
    <node id="n2">
      <data key="degree">18</data>
      <data key="betweenness">75</data>
    </node>
    <node id="n3">
      <data key="degree">8</data>
      <data key="betweenness">12</data>
    </node>
    <node id="n4">
      <data key="degree">26</data>
      <data key="betweenness">210</data>
    </node>
    <node id="n5">
      <data key="degree">29</data>
      <data key="betweenness">320</data>
    </node>
    <!-- the slide shows only this excerpt; remaining nodes and edges are not shown -->
  </graph>
</graphml>
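As a rough illustration of the data-integration step, node attributes stored in a GraphML file like the one on this slide can be read with Python's standard library. This is a minimal sketch and not part of the original slides: the embedded sample reuses only the first two nodes from the slide and is closed off so that it parses, and the function name `read_node_attributes` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# GraphML elements live in this XML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Shortened sample (assumption: two nodes copied from the slide, tags closed).
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="degree" for="node" attr.name="degree" attr.type="double"/>
  <key id="betweenness" for="node" attr.name="betweenness" attr.type="double"/>
  <graph id="G" edgedefault="directed">
    <node id="n0"><data key="degree">79</data><data key="betweenness">0</data></node>
    <node id="n1"><data key="degree">9</data><data key="betweenness">167</data></node>
  </graph>
</graphml>
"""


def read_node_attributes(graphml_text):
    """Return {node_id: {attribute_name: float_value}} for every <node>."""
    root = ET.fromstring(graphml_text)
    # Map each <key id="..."> to its human-readable attribute name.
    names = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", GRAPHML_NS)}
    table = {}
    for node in root.iterfind(".//g:node", GRAPHML_NS):
        table[node.get("id")] = {
            names[d.get("key")]: float(d.text)
            for d in node.findall("g:data", GRAPHML_NS)
        }
    return table


attributes = read_node_attributes(SAMPLE)
print(attributes["n1"])
```

The resulting dictionary is essentially a node table: one row per node, one column per attribute.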
  30. 30. Analysis
  31. 31. Visualization
  32. 32. Network Data Annotated Networks Attributes Analyzed Data
  33. 33. Apps
  34. 34. Cytoscape Apps - Extension programs that add new features to Cytoscape (formerly called Plugins) - Large App developer/user community - This is why Cytoscape is so successful in the life science community!
  35. 35. (As of 4/5/2014) APPS.CYTOSCAPE.ORG
  36. 36. Quick Overview of Apps: A travel guide to Cytoscape plugins. Rintaro Saito, Michael E Smoot, Keiichiro Ono, Johannes Ruscheinski, Peng-Liang Wang, Samad Lotia, Alexander R Pico, Gary D Bader, Trey Ideker (2012) Nature Methods 9 (11) p. 1069-1076
  37. 37. Tips for Learning Tools
  38. 38. Choose the Right Tool
  39. 39. Choose the Right Tool: Data Preparation, Analysis, Visualization
  40. 40. Data Visualization Tools http://selection.datavisualization.ch/
  41. 41. Data Visualization Tools http://selection.datavisualization.ch/
  42. 42. Data Visualization Tools http://selection.datavisualization.ch/
  43. 43. Tools • In some cases, you can finish the exact same task using different tools • Example: Data preparation (cleansing) • But if you choose the right tools, you can do it 100x faster than others. • ex: Re-formatting complex data sets • Excel vs Python Script • Some recommendations: • R/Bioconductor, Python/Pandas, Git/GitHub/Gist
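To make the "Excel vs Python Script" point concrete, here is a minimal sketch of one common re-formatting task: turning a wide expression table (one column per time point) into a long table (one row per gene and time point). It is not from the original slides; the column names and all values are made up for illustration, and `wide_to_long` is a hypothetical helper.

```python
import csv
import io

# Hypothetical tab-delimited input: one row per gene, one column per time point.
WIDE = "gene\tt0\tt1\tt2\nGAL1\t0.1\t1.5\t2.3\nGAL4\t-0.2\t0.0\t0.4\n"


def wide_to_long(text):
    """Convert a wide table into (gene, time_point, value) rows."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for record in reader:
        gene = record.pop("gene")  # what remains are the time-point columns
        for time_point, value in record.items():
            rows.append((gene, time_point, float(value)))
    return rows


long_rows = wide_to_long(WIDE)
for row in long_rows:
    print(*row, sep="\t")
```

Doing the same by hand in a spreadsheet means copying and pasting once per column; the script handles any number of genes and time points unchanged.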
  44. 44. Learning Tools = Saving Your Time
  45. 45. Hands-on: Introduction to Data Visualization with Cytoscape 50-60 min.
  46. 46. Data Visualization
  47. 47. Data Visualization - Goal: Help others understand your data - Emphasize what you want to tell - Use color, shape, and size of objects effectively! - Excellent resource for data visualization - Tamara Munzner’s Web Site: http://www.cs.ubc.ca/~tmm/
  48. 48. Today’s Goal
  49. 49. Story: I want to show gene expression changes over time as a network diagram
  50. 50. What is Great Visualization…? (Figure: a large yeast network whose nodes are labeled with systematic gene names such as YPL201C, YPL211W, and YML007W)
  51. 51. Design is complicated, because humans are complicated. Design is a process to avoid bad designs. Mike Bostock (New York Times Visualization Team. Creator of D3.js)
  52. 52. It is hard to generalize the design process, but we can avoid pitfalls by following some basic rules.
  53. 53. Avoid Chartjunk Edward Tufte http://en.wikipedia.org/wiki/File:Chartjunk-example.svg
  54. 54. Every pixel should carry information. Edward Tufte
  55. 55. Avoid Data Overload • Mapping too many attributes makes your visualization awful! • It is hard to see the overall trend of your data sets if too many channels are used in an image
  56. 56. “Great Artists Steal…”
  57. 57. (Figure: sample yeast gene network with gene-name node labels such as GAL1, GAL4, GAL80, MIG1, and GCN4)
  58. 58. (Figure: the yeast gene network from the previous slide, annotated) - Map gene expression values to color - Avoid using more colors in other components (edge/label) - If necessary, map other data into non-overlapping visual properties (edge score to width)
  59. 59. Part 1: Session File and Basic Navigation
  60. 60. Cytoscape 3.1 Desktop: Toolbar, Network Panel, Bird’s Eye View, Table Browser, Network Views
  61. 61. Table Browser: Local Column, Table Tabs, List Data (Values in [ ]), Shared Column
  62. 62. Session File - Snapshot of your workspace - Networks - Tables - Visual Styles - System Properties
  63. 63. Open a Session - Click folder icon - Or, File → Open
  64. 64. Exercise 1: Loading a session
  65. 65. Navigation - Pan: Middle-Click + Drag, or Command + Left-Click + Drag on Mac - Zoom - In: Mouse Wheel Up - Out: Mouse Wheel Down - Selection: Left-Click + Drag - Fit to Window - Selected region - Entire network
  66. 66. First Neighbors of Nodes: Ctrl+6
  67. 67. Create New Sub-Network From Selection: Ctrl+N
  68. 68. - Ctrl (Command on Mac) + G
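The first-neighbor selection and sub-network creation steps above can be pictured as two simple graph operations. The sketch below is only an illustration of the idea, not Cytoscape's implementation: it treats every edge as undirected, and the edge list and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy network given as (source, target) pairs.
EDGES = [("A", "B"), ("A", "C"), ("B", "D"), ("D", "E")]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def first_neighbors(adjacency, selected):
    """Return the selected nodes plus every node one step away from them."""
    result = set(selected)
    for node in selected:
        result |= adjacency[node]
    return result


def subnetwork(edges, nodes):
    """Keep only the edges whose two endpoints are both in `nodes`."""
    return [(s, t) for s, t in edges if s in nodes and t in nodes]


adjacency = build_adjacency(EDGES)
selection = first_neighbors(adjacency, {"A"})
sub_edges = subnetwork(EDGES, selection)
print(sorted(selection), sub_edges)
```

Repeating the first-neighbor step on the enlarged selection grows the region one hop at a time, which is a quick way to carve a readable piece out of a hairball.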
  69. 69. Part 2: Data Import
  70. 70. Network Data Formats - SIF - GML - XGMML - GraphML - BioPAX - PSI-MI - SBML - KGML (KEGG) - Excel - Text Table - CSV - Tab
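SIF is the simplest format in this list: each line holds a source node, an interaction type, and one or more target nodes, while a line with a single name declares an isolated node. The following is a minimal parsing sketch under that description, not Cytoscape's own reader; the gene names and the interaction types `pd` and `pp` are sample values, and `parse_sif` is a hypothetical helper.

```python
# Sample SIF text (assumption: illustrative content, tab-delimited).
SIF_TEXT = """\
GAL4\tpd\tGAL1\tGAL7
GAL80\tpp\tGAL4
MIG1
"""


def parse_sif(text):
    """Return (nodes, edges); edges are (source, interaction, target) tuples."""
    nodes, edges = set(), []
    for line in text.splitlines():
        # Use tabs when present so that names may contain spaces.
        fields = line.split("\t") if "\t" in line else line.split()
        if not fields:
            continue  # skip blank lines
        source = fields[0]
        nodes.add(source)
        if len(fields) >= 3:
            interaction = fields[1]
            for target in fields[2:]:  # one line may list several targets
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges


nodes, edges = parse_sif(SIF_TEXT)
print(sorted(nodes))
print(edges)
```

SIF carries no attributes beyond the interaction type, which is why expression values and annotations are loaded separately as tables.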
  71. 71. BRCA1 - NCBI Gene ID: 672 - On Chromosome 17 - GO Terms: DNA Repair, Cell Cycle, DNA Binding - Ensembl ID: ENSG00000012048
  72. 72. Data Tables for Cytoscape - Example: - Numeric - Gene expression profiles - Network statistics calculated in other applications, such as R - Confidence scores for edges - Text (or categorical) - GO annotation for genes - List of genes related to disease X - Targets for FDA approved drugs - Genes on KEGG Pathway Y - Clusters / group / community calculated in external programs - …
  73. 73. Your Data Sets - Anything saved as a table can be loaded into Cytoscape - Excel - Tab Delimited Document - CSV - As long as a proper mapping key is available, Cytoscape can map them to your networks.
  74. 74. Mapping Key in the Network Mapping Key in the Table
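Conceptually, mapping a table onto a network is a join on the mapping key: each table row is attached to the node whose key value matches. The sketch below illustrates that idea only and is not how Cytoscape implements it; the node names are taken from the yeast example in this deck, while the column names, the expression values, and the unmatched row `YXX000W` are invented for illustration.

```python
import csv
import io

# Node names in the network (the mapping key on the network side).
NETWORK_NODES = ["YPL201C", "YML007W", "YOR327C"]

# Hypothetical table; the "ORF" column is the mapping key on the table side.
TABLE = "ORF\texpression\nYPL201C\t1.8\nYML007W\t-0.7\nYXX000W\t0.3\n"


def map_table_to_nodes(node_names, table_text, key_column):
    """Attach each table row to the node whose name equals the row's key."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    by_key = {row[key_column]: row for row in reader}
    mapped = {}
    for name in node_names:
        row = by_key.get(name)
        # Nodes without a matching row simply get no attributes.
        mapped[name] = (
            {k: v for k, v in row.items() if k != key_column} if row else {}
        )
    return mapped


node_table = map_table_to_nodes(NETWORK_NODES, TABLE, "ORF")
print(node_table)
```

Rows whose key matches no node (here `YXX000W`) are dropped, and nodes with no matching row stay empty, so a mismatch in identifier type is the first thing to check when imported values do not show up.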
  75. 75. Exercise 2: Loading network and tables
  76. 76. Part 3: Visualization
  77. 77. Layouts
  78. 78. Automatic Layout - Choose a proper algorithm - Tree-like data: Hierarchical Layout - Scale-free network: Force-directed Layout - Circular process: Circular Layout - Tweak parameters if necessary
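An automatic layout is just a rule that assigns an (x, y) position to every node. The simplest case, a circular layout, can be sketched in a few lines. This is an illustrative sketch rather than Cytoscape's algorithm; the `radius` parameter and the function name are assumptions.

```python
import math


def circular_layout(node_names, radius=100.0):
    """Place nodes at equal angular steps on a circle centered at the origin."""
    count = len(node_names)
    positions = {}
    for index, name in enumerate(node_names):
        angle = 2 * math.pi * index / count
        positions[name] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


positions = circular_layout(["A", "B", "C", "D"])
for name, (x, y) in positions.items():
    print(f"{name}: ({x:.1f}, {y:.1f})")
```

Hierarchical and force-directed layouts follow the same contract (nodes in, coordinates out) but use the edges as well, which is why they suit tree-like and scale-free networks respectively.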
  79. 79. Manual Layout - Tweak result from automatic layout - Scale - Align - Rotate
  80. 80. Exercise 3: Apply layouts
  81. 81. Visual Style - Collection of mappings from Attributes to Visual Properties
  82. 82. Visual Styles - Defaults + Mappings - Expression values to node color - Gene function to node shape - Interaction detection method to edge line type - Confidence score to edge width
  83. 83. Core Idea: Data Controls The View
  84. 84. Data Controls The View • Photoshop / Illustrator • You control the pixels and objects on the display • Data Visualization Tools (including Cytoscape) • Data points are mapped to visual properties • Color • Size
  85. 85. Data Controls The View
  86. 86. Expression Values To Node Colors
  87. 87. Discrete Mapping Editor Continuous Mapping Editor
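The two kinds of mapping can be thought of as two small functions: a continuous mapping interpolates a numeric value into a visual property, and a discrete mapping looks a category up in a table. The sketch below is a simplified illustration, not Cytoscape's mapping editor: it uses a single two-color gradient, and the color choices, shape names, and gene-function categories are assumptions.

```python
def continuous_color(value, low, high, low_rgb=(0, 0, 255), high_rgb=(255, 0, 0)):
    """Linearly interpolate between two RGB colors; out-of-range values are clamped."""
    if high == low:
        raise ValueError("low and high must differ")
    fraction = min(1.0, max(0.0, (value - low) / (high - low)))
    channels = [round(a + (b - a) * fraction) for a, b in zip(low_rgb, high_rgb)]
    return "#{:02X}{:02X}{:02X}".format(*channels)


# Hypothetical discrete mapping from gene function to node shape.
SHAPE_BY_FUNCTION = {"kinase": "DIAMOND", "transcription factor": "TRIANGLE"}


def discrete_shape(category, default="ELLIPSE"):
    """Look up the shape for a category, falling back to the style default."""
    return SHAPE_BY_FUNCTION.get(category, default)


print(continuous_color(-2.0, -2.0, 2.0))  # lowest expression
print(continuous_color(2.0, -2.0, 2.0))   # highest expression
print(discrete_shape("kinase"), discrete_shape("unknown"))
```

The fallback in `discrete_shape` mirrors the "Defaults + Mappings" idea: anything the mapping does not cover takes the default value of the style.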
  88. 88. Exercise 4: Create New Visual Style
  89. 89. Part 4: Web Services (Optional)
  90. 90. Cytoscape Ecosystem
  91. 91. Dawn of Web-Based Visualization
  92. 92. Cytoscape Family - cytoscape.js: Library for web applications JS
  93. 93. Cytoscape 3.1.0
  94. 94. JS
  95. 95. JS
  96. 96. Cytoscape.js Network Visualization Library Running on Web Browsers
  97. 97. What is cytoscape.js? A JavaScript library for network visualization, not a web application! You need to write some code to use it in web browsers…
  98. 98. Cytoscape (desktop): Complete desktop application for network analysis and visualization - Written in Java - Expandable by Apps - For Users. Cytoscape.js: A JavaScript library for network visualization, not a web application! - Written in JavaScript - Expandable by Extensions - For Developers
  99. 99. Analysis Data Integration Cytoscape Desktop Cytoscape.js Visualization Minimal Analysis Cytoscape Web Desktop Layout Visual Style Visual Style Layout Visualization
  100. 100. Integration to Cytoscape New in Cytoscape 3.1.0: Export Networks and Visual Styles to Cytoscape.js Format JS
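Cytoscape.js describes a network as JSON in which every node and edge is an object with a `data` field; edges name their endpoints through `source` and `target`. The sketch below builds that basic `elements` structure from an edge list. It is an illustration only and does not reproduce the exact file written by the Cytoscape 3.1.0 export; the edge list, the edge `id` scheme, and the function name are assumptions.

```python
import json


def to_cytoscapejs_elements(edges):
    """Build a Cytoscape.js-style elements object from (source, target) pairs."""
    node_ids = sorted({node for source, target in edges for node in (source, target)})
    return {
        "nodes": [{"data": {"id": node_id}} for node_id in node_ids],
        "edges": [
            # Edge ids are made up here by joining the two endpoint ids.
            {"data": {"id": f"{source}-{target}", "source": source, "target": target}}
            for source, target in edges
        ],
    }


elements = to_cytoscapejs_elements([("GAL4", "GAL1"), ("GAL80", "GAL4")])
print(json.dumps({"elements": elements}, indent=2))
```

Because the result is plain JSON, a network prepared on the desktop can be handed to a web page without any Java on the browser side.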
  101. 101. Future
  102. 102. Cytoscape Cyberinfrastructure Internet Service 1 Service 2 NDEx (DB) Web Browser Cytoscape Desktop
  103. 103. Getting Help - Two Google Groups - cytoscape-discuss@googlegroups.com - cytoscape-helpdesk@googlegroups.com - ANY question is OK!
  104. 104. Further Readings
  105. 105. Further Readings • My presentation slides • http://www.slideshare.net/keiono • (This deck of slides will be uploaded tonight)
  106. 106. Further Readings 1 - Introduction to Network Biology
 - Shoemaker BA, Panchenko AR (2007) Deciphering Protein–Protein Interactions. Part I. Experimental Techniques and Databases. PLoS Comput Biol 3(3): e42. doi:10.1371/journal.pcbi.0030042
 - Shoemaker BA, Panchenko AR (2007) Deciphering Protein–Protein Interactions. Part II. Computational Methods to Predict Protein and Domain Interaction Partners. PLoS Comput Biol 3(4): e43. doi:10.1371/journal.pcbi.0030043
  107. 107. Further Readings 2
 - Overview of Cytoscape Apps (Plugins): Rintaro Saito, Michael E Smoot, Keiichiro Ono, Johannes Ruscheinski, Peng-Liang Wang, Samad Lotia, Alexander R Pico, Gary D Bader, Trey Ideker (2012) A travel guide to Cytoscape plugins. Nature Methods 9 (11) p. 1069-1076
 - Sample Protocol (based on 2.x): Cline, et al. Integration of biological networks and gene expression data using Cytoscape. Nature Protocols, 2, 2366-2382 (2007).
  108. 108. Further Readings 3 - Cytoscape Tutorial Booklet: Analysis and Visualization of Biological Networks with Cytoscape - http://www.rbvi.ucsf.edu/Outreach/Workshops/ISMBTutorial.pdf
  109. 109. 2014 Keiichiro Ono kono@ucsd.edu