Visual Analytics in Omics - why, what, how?

5 Comments
  • Slides 19 to 28 were probably the most useful, serving as a reminder of what we should keep in mind while making a visualisation. Beyond that, it was very interesting to see how data visualisation helps with making and validating hypotheses in an expert environment :)
  • Besides the 'accuracy of quantitative perceptual tasks', we believe 'perceptual scalability' is also a key take-away message, or, more specifically, the change of the mantra from 'overview first, then zoom and filter, details on demand' to 'analyze first, show results, then zoom and filter, details on demand'. Maybe this isn't that applicable to the Multimedia course right now, given the limited sizes of the datasets, but it will be useful for all of us in the future, when we will be dealing with much larger data and can apply some of the rules from the slides.
  • Indeed, as my colleagues stated, we liked the fact that we saw some things again; we were immediately discussing how well we had applied certain principles! The examples were really interesting, although not directly relevant to what we are doing, as we only have a small amount of data (football results).

    Pieterjan (topija.wordpress.com)
  • I agree with Adriaan: we saw the slide on 'accuracy of perceptual tasks' (slide 19) at the beginning of the course. Now, after creating our own visualizations, it is very interesting to look back on it. I also think this presentation really complements what we've seen so far in the mume course, since it focuses more on visualizations for domain experts. Related to that, I really liked the multitude of examples of visualizations: each of them clearly made the task of reasoning about (often complex) data a lot more intuitive.
  • With regard to the #mume13 course at the K.U.Leuven, where you gave this presentation last week, it was probably most useful to be reminded about the 'accuracy of quantitative perceptual tasks' (slides 19 and following). (I believe most members of the audience took this as the key take-away message.)
    What was extra interesting for our project specifically were the notes on computational scalability. Some of these aspects, although not nearly as extreme as when working with big data (I can imagine), are already of concern to us: various data sources that we need to 'unify', the interactivity suffering from computational lag, ... and some of the solution methods that were suggested on slide 60.
    Either way, I found it a very interesting presentation, delivered in an enthusiastic and engaging fashion. Thank you!
Transcript

  • 1. Visual Analytics in omics - why, what, how? Prof Jan Aerts, STADIUS - ESAT, Faculty of Engineering, University of Leuven, Belgium. Data Visualization Lab. jan.aerts@esat.kuleuven.be / jan@datavislab.org / creativecommons.org/licenses/by-nc/3.0/
  • 2. • What problem are we trying to solve? • What is Visual Analytics and how can it help? • How do we actually do this? • Some examples • Challenges 2
  • 3. A. What’s the problem? 3
  • 4. hypothesis-driven -> data-driven. Scientific Research Paradigms (Jim Gray, Microsoft): 1st (1,000s of years ago): empirical; 2nd (100s of years ago): theoretical; 3rd (last few decades): computational; 4th (today): data exploration. I have a hypothesis -> need to generate data to (dis)prove it. I have data -> need to find hypotheses that I can test. 4
  • 5. What does this mean? • immense re-use of existing datasets • biologically interesting signals may be too poorly understood to be analyzed in automated fashion • much of initial analysis is exploratory in nature => what's my hypothesis? => searching for unknown unknowns • automated algorithms often act as black boxes => biologists must have blind faith in the bioinformatician (and the bioinformatician in his/her own skills) 5
  • 6. For domain expert: what’s my hypothesis? Martin Krzywinski 7
  • 7. For developer and domain expert: opening the black box. input, filter 1, filter 2, filter 3, output A, output B, output C 8
  • 8. B. What is Visual Analytics and how can it help? 9
  • 9. Our research interest: visual design + interaction design + backend 10
  • 10. What is visualization? visualization of simulations; in situ visualization of real-world structures 11
  • 11. What is visualization? T. Munzner 12
  • 12. What is visualization? cognition <=> perception cognitive task => perceptive task T. Munzner 13
  • 13. Why do we visualize data? • record information • blueprints, photographs, seismographs, ... • analyze data to support reasoning • develop & assess hypotheses • discover errors in data • expand memory • find patterns (see Snow's cholera map) • communicate information • share & persuade • collaborate & revise 14
  • 14. Sedlmair et al. IEEE Transactions on Visualization and Computer Graphics. 2012
  • 15. The strength of visualization
  • 16. Pictorial superiority effect: of information presented verbally, only about 10% is remembered after 72 hours; add a picture and recall rises to about 65%. 17
  • 17. Stevens' psychophysical law = proposed relationship between the magnitude of a physical stimulus and its perceived intensity or strength 18
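The formula itself is not written out in the transcript; as a reminder (the standard formulation, not taken from the slide), Stevens' law states that perceived intensity grows as a power of stimulus magnitude:

    % Stevens' power law: perceived intensity \psi as a function of
    % physical stimulus magnitude I, with constant k and exponent a
    \psi(I) = k \, I^{a}

The exponent a depends on the stimulus: roughly 1 for judged length, around 0.7 for area, and well below 1 for brightness, which is one reason the position and length encodings ranked on the following slides are read more accurately than area or colour.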
  • 18. Accuracy of quantitative perceptual tasks how much (quantitative) what/where (qualitative) Mackinlay 19
  • 19. Accuracy of quantitative perceptual tasks how much (quantitative) what/where (qualitative) Mackinlay 20
  • 20. Accuracy of quantitative perceptual tasks how much (quantitative) what/where (qualitative) “power of the plane” Mackinlay 21
  • 21. Pre-attentive vision = ability of low-level human visual system to rapidly identify certain basic visual properties • some features “pop out” • used for: • target detection • boundary detection • counting/estimation • ... • visual system takes over => all cognitive power available for interpreting the figure, rather than needing part of it for processing the figure 22
  • 22. 23
  • 23. 24
  • 24. Limitations of preattentive vision 1. Combining pre-attentive features does not always work => would need to resort to “serial search” (most channel pairs; all channel triplets), e.g. is there a red square in this picture? 2. Speed depends on which channel (use one that is good for categorical data) 25
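To make the pop-out versus serial-search contrast concrete, here is a minimal browser sketch (illustrative only, not from the slides; the function name, element sizes and item counts are made up). In 'popout' mode the red-square target differs from the distractors on a single colour channel and is found pre-attentively; in 'conjunction' mode it differs only in the colour x shape combination, which forces serial search:

    // Pop-out vs. conjunction-search demo (illustrative sketch).
    // Run in any plain HTML page.
    function drawSearchDisplay(mode, n = 80) {
      const container = document.createElement('div');
      container.style.width = '400px';
      const targetIndex = Math.floor(Math.random() * n);
      for (let i = 0; i < n; i++) {
        const item = document.createElement('span');
        item.style.display = 'inline-block';
        item.style.width = '14px';
        item.style.height = '14px';
        item.style.margin = '4px';
        let colour, shape;
        if (i === targetIndex) {
          colour = 'red'; shape = 'square';      // the target: a red square
        } else if (mode === 'popout') {
          colour = 'blue'; shape = 'square';     // distractors differ in colour only
        } else {
          // conjunction: distractors are red circles OR blue squares, so
          // neither colour nor shape alone identifies the target
          if (Math.random() < 0.5) { colour = 'red'; shape = 'circle'; }
          else { colour = 'blue'; shape = 'square'; }
        }
        item.style.background = colour;
        item.style.borderRadius = shape === 'circle' ? '50%' : '0';
        container.appendChild(item);
      }
      document.body.appendChild(container);
    }

    drawSearchDisplay('popout');       // target "pops out" immediately
    drawSearchDisplay('conjunction');  // finding it requires serial search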
  • 25. Gestalt laws - interplay between parts and the whole 26
  • 26. Gestalt laws - interplay between parts and the whole • simplicity • familiarity • proximity • symmetry • similarity • connectedness • good continuation • common fate 27
  • 27. Bret Victor - Ladder of Abstraction 28
  • 28. For domain expert: what’s my hypothesis? Martin Krzywinski 29
  • 29. Martin Krzywinski 30
  • 30. Martin Krzywinski 31
  • 31. For developer and domain expert: opening the black box. input, filter 1, filter 2, filter 3, output A, output B, output C 32
  • 32. B A C 33
  • 33. B A C 34
  • 34. B A C 35
  • 35. C. How do we actually do this? 36
  • 36. Talking to domain experts 37
  • 37. Data visualization framework 38
  • 38. Card sorting 39
  • 39. Tools of the trade 40
  • 40. Processing - http://processing.org • java 41
  • 41. D3 - http://d3js.org/ • javascript 42
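The slide only names the tool; as a flavour of what D3 code looks like, here is a minimal bar-chart sketch (illustrative; the data values and pixel sizes are made up, it assumes the D3 library is already loaded in the page, and it sticks to calls that work across D3 versions):

    // Minimal D3 bar chart sketch (illustrative, not from the slides).
    const values = [4, 8, 15, 16, 23, 42];   // made-up data
    const barWidth = 30, chartHeight = 100;

    const svg = d3.select('body')
      .append('svg')
      .attr('width', values.length * barWidth)
      .attr('height', chartHeight);

    svg.selectAll('rect')
      .data(values)
      .enter()                                  // one rect per data point
      .append('rect')
      .attr('x', (d, i) => i * barWidth)
      .attr('y', d => chartHeight - d * 2)      // scale value to pixels by hand
      .attr('width', barWidth - 2)
      .attr('height', d => d * 2)
      .attr('fill', 'steelblue');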
  • 42. Vega - https://github.com/trifacta/vega/wiki • html + json 43
  • 43. D. Examples Data exploration Data filtering User-guided analysis 44
  • 44. Data exploration HiTSee Bertini E et al. IEEE Symposium on Biological Data Visualization (2011)
  • 45. Aracari Bartlett C et al. BMC Bioinformatics (2012) Ryo Sakai 46
  • 46. Reveal Jäger, G et al. Bioinformatics (2012)
  • 47. Meander Pavlopoulos et al. Nucl Acids Res (2013) Georgios Pavlopoulos 48
  • 48. ParCoord Endeavour gene prioritization Boogaerts T et al. IEEE International Conference on Bioinformatics & Bioengineering (2012) Thomas Boogaerts 49
  • 49. Sequence logo
  • 50. Seagull
  • 51. subgroup similarity difference
  • 52. Data filtering (visual parameter setting) TrioVis Sakai R et al. Bioinformatics (2013) Ryo Sakai 54
  • 53. User-guided analysis clustering regions of interest Spark Nielsen et al. Genome Research (2012) data samples chromatin modification DNA methylation RNA-Seq 55
  • 54. BaobabView decision trees van den Elzen S & van Wijk J. IEEE Conference on Visual Analytics Science and Technology (2011)
  • 55. E. Challenges 57
  • 56. Many challenges remain • scalability (data processing + perception), uncertainty, “interestingness”, interaction, evaluation • infrastructure & architecture • fast imprecise answers with progressive refinement • incremental re-computation • steering computation towards data regions of interest 58
  • 57. Computational scalability • speed • preprocessing big data: mapreduce = batch • interactivity: max 0.3 sec lag! • size • multiple data resolutions => data size increase • not all resolutions necessary for all data regions: steer computation to regions of interest
  • 58. • Options: • distribute visualization calculations over cluster • distributing scala/spark or other “real-time” mapreduce paradigm • functional programming paradigm? • lazy evaluation and smart preprocessing: only calculate what’s needed => generic framework
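One way to stay within the "max 0.3 sec lag" interactivity budget from the previous slide, and to provide the "fast imprecise answers with progressive refinement" listed under the remaining challenges, is to compute in small chunks, yield to the event loop, and redraw as partial results arrive. A minimal sketch (the function name, chunk size and render callback are hypothetical choices for illustration):

    // Progressive refinement sketch (illustrative): process a large array
    // in chunks so the UI stays responsive, rendering partial results
    // after every chunk instead of blocking until the full pass is done.
    function progressiveSum(data, render, chunkSize = 10000) {
      let index = 0;
      let partialSum = 0;
      function step() {
        const end = Math.min(index + chunkSize, data.length);
        for (; index < end; index++) {
          partialSum += data[index];
        }
        render(partialSum, index / data.length);  // show the imprecise answer so far
        if (index < data.length) {
          setTimeout(step, 0);                    // yield to the event loop, then continue
        }
      }
      step();
    }

    // Usage: update a (hypothetical) chart or label as the result refines.
    const bigArray = Float64Array.from({ length: 1e6 }, Math.random);
    progressiveSum(bigArray, (sum, fraction) =>
      console.log(`~${Math.round(fraction * 100)}% done, running sum ${sum.toFixed(1)}`));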
  • 59. Perceptual scalability • “overview first, then zoom and filter, details on demand”: breaks down with very big datasets • “analyze first, show results, then zoom and filter, details on demand” => need to identify regions of interest and “interestingness features” • identify higher-level structure in data (e.g. clustering, dimensionality reduction) -> use these to guide user
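To make "analyze first, show results" concrete: rather than rendering every data point, one can first score fixed-size windows of the data with some 'interestingness' measure and surface only the top-ranked regions to the user. A minimal sketch (the function name, window size and the variance-based score are illustrative choices, not from the slides):

    // "Analyze first, show results" sketch (illustrative): score fixed-size
    // windows of a signal by variance and return the top-k regions of
    // interest, which the visualization would then show first.
    function topRegionsOfInterest(signal, windowSize = 100, k = 5) {
      const regions = [];
      for (let start = 0; start + windowSize <= signal.length; start += windowSize) {
        const window = signal.slice(start, start + windowSize);
        const mean = window.reduce((a, b) => a + b, 0) / window.length;
        const variance = window.reduce((a, b) => a + (b - mean) ** 2, 0) / window.length;
        regions.push({ start, end: start + windowSize, score: variance });
      }
      // highest-variance windows are treated as the most "interesting"
      return regions.sort((a, b) => b.score - a.score).slice(0, k);
    }

    // Usage: only these regions would be rendered in the overview.
    const signal = Array.from({ length: 10000 }, (_, i) =>
      Math.sin(i / 50) + (i > 4000 && i < 4200 ? Math.random() * 5 : 0));
    console.log(topRegionsOfInterest(signal));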
  • 60. Thank you • Georgios Pavlopoulos • Ryo Sakai • Thomas Boogaerts • Toni Verbeiren • Data Visualization Lab (datavislab.org) • Erik Duval • Andrew Vande Moere 62