Visual Analytics in Omics: why, what, how?

Presentation given at VisBio workshop in Bergen, Norway.


  1. 1. Visual Analytics in omics - why, what, how? Prof Jan Aerts STADIUS - ESAT, Faculty of Engineering, University of Leuven, Belgium Data Visualization Lab jan.aerts@esat.kuleuven.be jan@datavislab.org creativecommons.org/licenses/by-nc/3.0/
  2. 2. • What problem are we trying to solve? • What is Visual Analytics and how can it help? • How do we actually do this? • Some examples • Challenges 2
  3. 3. A. So what’s the problem? 3
  4. 4. hypothesis-driven -> data-driven
     Scientific Research Paradigms (Jim Gray, Microsoft)
       • 1st: 1,000s of years ago, empirical
       • 2nd: 100s of years ago, theoretical
       • 3rd: last few decades, computational
       • 4th: today, data exploration
     Hypothesis-driven: I have a hypothesis -> need to generate data to (dis)prove it.
     Data-driven: I have data -> need to find hypotheses that I can test.
  5. 5. What does this mean?
       • immense re-use of existing datasets
       • much of the initial analysis is exploratory in nature => what’s my hypothesis?
       • biologically interesting signals may be too poorly understood to be analyzed in an automated fashion
       • visualization is very effective in facilitating human reasoning about complex data
       • automated algorithms often act as black boxes => biologists must have blind faith in the bioinformatician (and the bioinformatician in his/her own skills)
  6. 6. Opening the black box [diagram labels: input, filter 1, filter 2, output A, filter 3, output B, output]
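One way to read the "opening the black box" diagram on slide 6 is that every intermediate result of a filter chain stays available for inspection (and plotting), not just the final output. The sketch below is an interpretation, not the presenter's code: the function `run_traced_pipeline`, the record fields `quality` and `depth`, and the thresholds are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]
Step = Tuple[str, Callable[[Record], bool]]


def run_traced_pipeline(
    records: List[Record], steps: List[Step]
) -> List[Tuple[str, List[Record]]]:
    """Apply each filter in turn and keep every intermediate result."""
    trace = [("input", list(records))]
    current = trace[0][1]
    for name, keep in steps:
        # Each stage only sees what survived the previous stage.
        current = [r for r in current if keep(r)]
        trace.append((name, current))
    return trace


if __name__ == "__main__":
    # Hypothetical records and thresholds, for illustration only.
    variants = [
        {"quality": 50.0, "depth": 30.0},
        {"quality": 10.0, "depth": 40.0},
        {"quality": 45.0, "depth": 3.0},
    ]
    steps = [
        ("filter 1: quality >= 20", lambda r: r["quality"] >= 20),
        ("filter 2: depth >= 10", lambda r: r["depth"] >= 10),
    ]
    for name, kept in run_traced_pipeline(variants, steps):
        print(f"{name}: {len(kept)} records")
```

Because the trace keeps the records themselves, each stage could be handed to a plotting tool to show what a filter removed, which is the kind of check a black-box pipeline does not allow.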
  7. 7. A B C 7
  8. 8. A B C 8
  9. 9. A B C 9
  10. 10. What’s my hypothesis? 10 Martin Krzywinski
  11. 11. 11 Martin Krzywinski
  12. 12. 12 Martin Krzywinski
  13. 13. B. What is Visual Analytics and how can it help? 13
  14. 14. 14
  15. 15. What is visualization? T. Munzner 15
  16. 16. What is visualization? T. Munzner cognition <=> perception cognitive task => perceptive task 16
  17. 17. Why do we visualize data?
        • record information
            • blueprints, photographs, seismographs, ...
        • analyze data to support reasoning
            • develop & assess hypotheses
            • discover errors in data
            • expand memory
            • find patterns (see Snow’s cholera map)
        • communicate information
            • share & persuade
            • collaborate & revise
  18. 18. pictorial superiority effect “information” “informa” “i” 65% 1% 72hr 18
  19. 19. Stevens’ psychophysical law = proposed relationship between the magnitude of a physical stimulus and its perceived intensity or strength 19
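Stevens' law on slide 19 is usually written as a power function, perceived magnitude = k · intensity^a, where the exponent a depends on the kind of stimulus. A minimal sketch follows. The exponent values in it are rough illustrative assumptions, not figures from the slides, and published estimates vary with the experimental setup.

```python
def perceived_magnitude(intensity: float, exponent: float, k: float = 1.0) -> float:
    """Stevens' power law: perceived magnitude = k * intensity ** exponent."""
    if intensity < 0:
        raise ValueError("intensity must be non-negative")
    return k * intensity ** exponent


def perceived_ratio(physical_ratio: float, exponent: float) -> float:
    """How much larger a stimulus looks when it is physical_ratio times larger."""
    return physical_ratio ** exponent


# Illustrative exponents only (assumed values; they differ between studies).
ASSUMED_EXPONENTS = {"length": 1.0, "area": 0.7, "brightness": 0.5}

if __name__ == "__main__":
    for channel, a in ASSUMED_EXPONENTS.items():
        ratio = perceived_ratio(2.0, a)
        print(f"doubling {channel}: perceived as about {ratio:.2f}x")
```

With an exponent below 1, doubling the stimulus is perceived as less than a doubling. This is one reason length tends to be a safer channel than area for encoding quantities.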
  20. 20. Accuracy of quantitative perceptual tasks Mackinlay what/where (qualitative) how much (quantitative) 20
  21. 21. Accuracy of quantitative perceptual tasks Mackinlay what/where (qualitative) how much (quantitative) 21
  22. 22. Accuracy of quantitative perceptual tasks Mackinlay “power of the plane” what/where (qualitative) how much (quantitative) 22
  23. 23. Pre-attentive vision = ability of the low-level human visual system to rapidly identify certain basic visual properties
        • some features “pop out”
        • used for:
            • target detection
            • boundary detection
            • counting/estimation
            • ...
        • visual system takes over => all cognitive power is available for interpreting the figure, rather than needing part of it for processing the figure
  24. 24. 24
  25. 25. 25
  26. 26. Limitations of pre-attentive vision
        1. Combining pre-attentive features does not always work => would need to resort to “serial search” (most channel pairs; all channel triplets), e.g. “is there a red square in this picture?”
        2. Speed depends on which channel is used (use one that is good for categorical data; see “accuracy”)
  27. 27. Gestalt laws - interplay between parts and the whole 27
  28. 28. Gestalt laws - interplay between parts and the whole • simplicity • proximity • similarity • connectedness • good continuation • common fate • familiarity • symmetry 28
  29. 29. Context affects perceptual tasks
  30. 30. C. How do we actually do this? 30
  31. 31. Talking to domain experts 31
  32. 32. Data visualization framework 32
  33. 33. Card sorting 33
  34. 34. Tools of the trade 34
  35. 35. Processing - http://processing.org • Java 35
  36. 36. D3 - http://d3js.org/ • JavaScript 36
  37. 37. Vega - https://github.com/trifacta/vega/wiki • HTML + JSON 37
  38. 38. To use Vega
        • Create the JSON file
        • Create the index.html
        • Run “python -m SimpleHTTPServer” (Python 2; with Python 3, run “python -m http.server”)
        • Go to http://127.0.0.1:8000/index.html
        • Get help at https://github.com/trifacta/vega/wiki
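The steps on slide 38 can be sketched as one small Python 3 script that writes both files and serves them locally. This is a sketch under assumptions: it targets a current Vega 5 and vega-embed setup loaded from a public CDN, not the 2013 Vega release the slide refers to. The file name `spec.json`, the example data, and the helpers `write_vega_site` and `serve` are made up for illustration.

```python
import functools
import http.server
import json
from pathlib import Path

# Minimal Vega 5 bar chart specification (example data is made up).
VEGA_SPEC = {
    "$schema": "https://vega.github.io/schema/vega/v5.json",
    "width": 300,
    "height": 150,
    "padding": 5,
    "data": [{"name": "table", "values": [
        {"category": "A", "amount": 28},
        {"category": "B", "amount": 55},
        {"category": "C", "amount": 43},
    ]}],
    "scales": [
        {"name": "x", "type": "band", "range": "width", "padding": 0.1,
         "domain": {"data": "table", "field": "category"}},
        {"name": "y", "type": "linear", "range": "height", "nice": True,
         "domain": {"data": "table", "field": "amount"}},
    ],
    "axes": [
        {"orient": "bottom", "scale": "x"},
        {"orient": "left", "scale": "y"},
    ],
    "marks": [{
        "type": "rect",
        "from": {"data": "table"},
        "encode": {"enter": {
            "x": {"scale": "x", "field": "category"},
            "width": {"scale": "x", "band": 1},
            "y": {"scale": "y", "field": "amount"},
            "y2": {"scale": "y", "value": 0},
        }},
    }],
}

# Page that loads Vega from a CDN and renders spec.json into the #vis element.
INDEX_HTML = """<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <script src="https://cdn.jsdelivr.net/npm/vega@5"></script>
    <script src="https://cdn.jsdelivr.net/npm/vega-lite@5"></script>
    <script src="https://cdn.jsdelivr.net/npm/vega-embed@6"></script>
  </head>
  <body>
    <div id="vis"></div>
    <script>vegaEmbed("#vis", "spec.json");</script>
  </body>
</html>
"""


def write_vega_site(directory) -> Path:
    """Create spec.json and index.html in the given directory."""
    site = Path(directory)
    site.mkdir(parents=True, exist_ok=True)
    (site / "spec.json").write_text(json.dumps(VEGA_SPEC, indent=2), encoding="utf-8")
    (site / "index.html").write_text(INDEX_HTML, encoding="utf-8")
    return site


def serve(directory, port: int = 8000) -> None:
    """Serve the directory over HTTP until interrupted (Ctrl+C)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    with http.server.ThreadingHTTPServer(("127.0.0.1", port), handler) as httpd:
        httpd.serve_forever()
```

Calling `serve(write_vega_site("vega_demo"))` and then opening http://127.0.0.1:8000/index.html should show the chart. A local server is needed because browsers generally refuse to load `spec.json` from a `file://` page.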
  39. 39. D. Examples 39
  40. 40. HiTSee Bertini E et al. IEEE Symposium on Biological Data Visualization (2011) 40
  41. 41. Aracari Ryo Sakai Bartlett C et al. BMC Bioinformatics (2012) 41
  42. 42. Meander Pavlopoulos et al. Nucl Acids Res (2013) 42 Georgios Pavlopoulos
  43. 43. ParCoord Boogaerts T et al. IEEE International Conference on Bioinformatics & Bioengineering (2012) Thomas Boogaerts Endeavour gene prioritization 43
  44. 44. Data filtering (visual parameter setting) TrioVis Ryo Sakai Sakai R et al. Bioinformatics (2013) 44
  45. 45. User-guided analysis: Spark. Nielsen et al. Genome Research (2012) [figure labels: clustering, chromatin modification, DNA methylation, RNA-Seq, data samples, regions of interest] 45
  46. 46. Bret Victor - Ladder of abstraction 46
  47. 47. E. Challenges 47
  48. 48. Many challenges remain
        • scalability (data processing + perception), uncertainty, “interestingness”, interaction, evaluation
        • infrastructure & architecture
        • fast imprecise answers with progressive refinement
        • incremental re-computation
        • steering computation towards data regions of interest
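Two of the challenges on slide 48, "fast imprecise answers with progressive refinement" and "incremental re-computation", can be illustrated with a running estimate that is updated chunk by chunk instead of being recomputed from scratch. This is a toy sketch, not a method from the presentation: the function `progressive_mean` and the simulated data are hypothetical. Early estimates are only representative when the values arrive in random order.

```python
import random
from typing import Iterable, Iterator, Tuple


def progressive_mean(
    values: Iterable[float], chunk_size: int = 1000
) -> Iterator[Tuple[int, float]]:
    """Yield (values seen so far, current mean estimate) after every chunk."""
    count = 0
    total = 0.0
    for value in values:
        # Incremental update: only the running sum and count are kept.
        count += 1
        total += value
        if count % chunk_size == 0:
            yield count, total / count
    # Final, exact answer if the last chunk was incomplete.
    if count and count % chunk_size != 0:
        yield count, total / count


if __name__ == "__main__":
    random.seed(0)
    # Simulated measurements, already in random order.
    data = [random.gauss(10.0, 2.0) for _ in range(10_500)]
    for seen, estimate in progressive_mean(data, chunk_size=2_500):
        print(f"after {seen:>6} values: mean is about {estimate:.3f}")
```

A visualization could redraw after every yielded estimate, so the analyst sees a rough answer immediately and can stop, or steer the computation elsewhere, before the full dataset has been processed.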
  49. 49. Thank you • Georgios Pavlopoulos • Ryo Sakai • Thomas Boogaerts • Data Visualization Lab (datavislab.org) • Erik Duval • Andrew Vande Moere 49
