Visual Analytics talk at ISMB2013

Published in: Education, Technology

  1. 1. Visual Analytics: The human back in the loop. Jan Aerts, Biodata Analysis and Visualization, Stadius Group, ESAT, Leuven University, Belgium. jan.aerts@esat.kuleuven.be, @jandot, http://orcid.org/0000-0002-6416-2717
  2. 2. hypothesis-driven -> data-driven • Scientific Research Paradigms (Jim Gray, Microsoft): 1st, 1,000s of years ago: empirical; 2nd, 100s of years ago: theoretical; 3rd, last few decades: computational; 4th, today: data exploration • hypothesis-driven: I have a hypothesis -> need to generate data to (dis)prove it • data-driven: I have data -> need to find hypotheses that I can test
  3. 3. What does this mean? • immense re-use of existing datasets • much of the initial analysis is exploratory in nature • biologically interesting signals may be too poorly understood to be analyzed in an automated fashion • visualization is very effective in facilitating human reasoning about complex data • automated algorithms often act as black boxes => biologists must have blind faith in the bioinformatician (and the bioinformatician in his/her own skills)
  4. 4. What is visualization? T. Munzner
  5. 5. Data visualization framework
  6. 6. Data visualization framework interactivity
  7. 7. Data visualization framework
  8. 8. Data visualization framework visual analytics infographics
  9. 9. “visual analytics”
  10. 10. Types of interaction (Yi et al., IEEE Transactions on Visualization and Computer Graphics, 2007) • select -> mark something as interesting • explore -> show me something else • reconfigure -> show me a different arrangement • encode -> show me a different representation • abstract/elaborate -> show me less/more detail • filter -> show me something conditionally • connect -> show me connected items
  11. 11. Visualization for biological hypothesis generation • example: eQTL data (IEEE BioVis visualization challenge 2011) • 500 patients (affected + non-affected) • 7500 SNPs; gene expression data for 15 genes • PLINK one-locus/two-locus
  12. 12. Aracari Ryo Sakai Bartlett C et al. BMC Bioinformatics (2012)
  13. 13. Reveal Jäger, G et al. Bioinformatics (2012)
  14. 14. HiTSee Bertini E et al. IEEE Symposium on Biological Data Visualization (2011)
  15. 15. Visualization for algorithm development • when do I know that my algorithm is “correct”? -> peek into the black box • [diagram labels: input, filter 1, filter 2, filter 3, output A, output B, output C]
  16. 16. A B C
  17. 17. A B C
  18. 18. A B C
  19. 19. Caleydo MatchMaker Lex A et al. IEEE Transactions on Visualization and Computer Graphics (2010)
  20. 20. Meander Pavlopoulos et al. Nucl Acids Res (2013) Georgios Pavlopoulos
  21. 21. ParCoord Boogaerts T et al. IEEE International Conference on Bioinformatics & Bioengineering (2012) Thomas Boogaerts Endeavour gene prioritization
  22. 22. Visualization for (live) interaction with analysis • alternating between visual and automatic methods -> continuous refinement and verification of preliminary results • misleading results are discovered at an early stage • leverages the user’s (biologist’s) insights • no black box
  23. 23. Cytoscape Smoot et al. Bioinformatics (2011)
  24. 24. Data filtering (visual parameter setting) TrioVis Ryo Sakai Sakai R et al. Bioinformatics (2013)
  25. 25. User-guided analysis Spark Nielsen et al. Genome Research (2012) clustering chromatin modification DNA methylation RNA-Seq data samples regions of interest
  26. 26. BaobabView (decision trees) van den Elzen S & van Wijk J. IEEE Conference on Visual Analytics Science and Technology (2011)
  27. 27. Galaxy Trackster Goecks J et al. Nature Biotechnology (2012)
  28. 28. Bret Victor - Ladder of abstraction
  29. 29. Many challenges remain • scalability (data processing + perception), uncertainty, “interestingness”, interaction, evaluation • infrastructure & architecture • fast imprecise answers with progressive refinement • incremental re-computation • steering computation towards data regions of interest
  30. 30. Acknowledgments • Bioinformatics Group at Stadius, Leuven University • in particular: Ryo Sakai, Georgios Pavlopoulos • visualization community for examples • Jeremy for Trackster video
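
The idea on slide 15 of peeking into the black box can be sketched in Python as follows. This is only an illustration and not code from the talk: the record fields (`quality`, `depth`), the thresholds, and the helper `run_with_snapshots` are all hypothetical. The point is that keeping the records that survive each filter makes every intermediate result available for inspection or plotting, instead of only the final output.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Stage = Tuple[str, Callable[[Record], bool]]


def run_with_snapshots(records: List[Record], stages: List[Stage]) -> Dict[str, List[Record]]:
    """Apply named filters in order and keep the surviving records after each stage."""
    snapshots = {"input": list(records)}
    current = snapshots["input"]
    for name, keep in stages:
        current = [r for r in current if keep(r)]
        snapshots[name] = current  # intermediate result, available for inspection
    return snapshots


# Hypothetical records and thresholds, chosen only for illustration.
records = [
    {"id": "v1", "quality": 50, "depth": 30},
    {"id": "v2", "quality": 10, "depth": 40},
    {"id": "v3", "quality": 45, "depth": 5},
    {"id": "v4", "quality": 60, "depth": 25},
]
stages = [
    ("filter 1", lambda r: r["quality"] >= 20),
    ("filter 2", lambda r: r["depth"] >= 10),
]

snapshots = run_with_snapshots(records, stages)
for name, rows in snapshots.items():
    print(f"{name}: {len(rows)} records -> {[r['id'] for r in rows]}")
```

Each snapshot could then be handed to a plotting tool, so that a filter which removes far more (or far fewer) records than expected is noticed at the stage where it happens.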

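Two of the challenges listed on slide 29, "fast imprecise answers with progressive refinement" and "incremental re-computation", can be illustrated with a minimal Python sketch. It is an assumption-laden toy, not a method from the talk: the function name `progressive_mean`, the chunk size, and the simulated data are hypothetical, and the running mean is updated with Welford's online algorithm so that no earlier values need to be revisited.

```python
import math
import random
from typing import Iterator, List, Tuple


def progressive_mean(values: List[float], chunk_size: int) -> Iterator[Tuple[int, float, float]]:
    """Yield (n_seen, mean, standard_error) after each chunk of values.

    The mean and variance are updated incrementally (Welford's algorithm),
    so each new value costs O(1) and earlier values are never re-read.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for i, x in enumerate(values, start=1):
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if i % chunk_size == 0 or i == len(values):
            stderr = math.sqrt(m2 / (n - 1) / n) if n > 1 else float("inf")
            yield n, mean, stderr


# Simulated data (hypothetical); a fixed seed keeps the run reproducible.
rng = random.Random(0)
data = [rng.gauss(5.0, 1.0) for _ in range(10_000)]

estimates = list(progressive_mean(data, chunk_size=1_000))
for n_seen, mean, stderr in estimates:
    print(f"after {n_seen:>6} values: mean = {mean:.3f} +/- {stderr:.3f}")
```

A user sees a rough estimate after the first chunk and can stop early or steer the computation elsewhere, while the uncertainty shrinks as more data is processed.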