Humanizing bioinformatics

In this talk, I explain the need for basic visualization know-how in bioinformatics.

Published in: Design, Technology, Education


  1. 1. Humanizing bioinformatics. Jan Aerts, Assistant Professor - ESAT/SCD, BioData Analysis & Visualization, Faculty of Engineering, Leuven University, jan.aerts@esat.kuleuven.be, @jandot
  2. 2. whoami Leuven
  3. 3. whoami Wageningen
  4. 4. whoami Roslin
  5. 5. whoami Hinxton
  6. 6. whoami Leuven
  7. 7. why “humanizing bioinformatics”?
  8. 8. what I’ll talk about: scientific research paradigms - big & complex data - what about the user? - data visualization
  9. 9. scientific research throughout time
  10. 10. Science Paradigms (Jim Gray): 1st, 1,000s of years ago, empirical; 2nd, 100s of years ago, theoretical; 3rd, last few decades, computational; 4th, today, data exploration
  11. 11. Science Paradigms (Jim Gray): 1st, 1,000s of years ago, empirical; 2nd, 100s of years ago, theoretical; 3rd, last few decades, computational (computational biology); 4th, today, data exploration (bioinformatics)
  12. 12. ever bigger datasets, ever more complicated mining algorithms
  13. 13. case in point: genome sequencing
  14. 14. why do we sequence?
  15. 15. transcriptionally active sites, protein-DNA interactions, alternative splicing, gene expression, variation discovery, copy number variation, miRNA expression & discovery
  16. 16. single nucleotide polymorphisms: coverage, reads, polymorphisms, gene model
  17. 17. structural variation Robberecht et al, 2010 Molecular Biology of the Cell, 4th Edition
  18. 18. Human Genome Project
  19. 19. automate, automate, automate
  20. 20. HGP: 15 years, $3 billion, tens of labs => 1 genome; now: 1 week, $5000, 1 technician => 1 genome
  21. 21. genome sequencing throughput Mardis, 2010
  22. 22. genome sequencing throughput: “next-generation” sequencing platforms (Mardis, 2010)
  23. 23. NHGRI
  24. 24. Metzker et al, 2010
  25. 25. big throughput => big data
  26. 26. advanced data structures
  27. 27. advanced data mining: support vector machine, recursive feature elimination, manifold learning, adaptive cascade sharing trees
  28. 28. “Dammit Jim, I’m a doctor, not a bioinformatician!” Christophe Lambert
  29. 29. “Dammit Jim, I’m a doctor, not a bioinformatician!” We’re alienating the user... too much data; blind trust (?) in bioinformatician
  30. 30. but... what’s the question? what parameters should I use? can I trust this output? I can’t wrap my head around this...
  31. 31. what’s the question? 4th paradigm question -> hypothesis -> generate data
  32. 32. what’s the question? 4th paradigm question -> hypothesis -> generate data generate data -> see what we can do with it
  33. 33. Gene interaction data: “A regulates B”
  34. 34. what parameters should I use?
  35. 35. peak
  36. 36. but is this?
  37. 37. van de Wiel et al, 2010
  38. 38. T. Voet
  39. 39. can I trust this output? data filtering putative mutations filter 1 filter 2 filter 3 A B C different settings for filters
  40. 40. A B C
  41. 41. A B C. State of the art: run many filter pipelines and take intersection
  42. 42. What we should have found... B A C
  43. 43. different algorithms for finding the same thing
  44. 44. I can’t wrap my head around this... too much (?) info
  45. 45. treatment plan for cancer patients: heterogeneous datasets, multiple abstraction levels, multiple sources, multiple formats; patient/clinical data, population/family data, tissue samples, MR/CT/X-ray, pathways, gene expression data; collaborative data examination: pathologist, geneticist, biologist
  46. 46. researcher is lost...
  47. 47. data visualization
  48. 48. “... the use of computer-supported, interactive, visual representations of data to amplify cognition” (S Card, J Mackinlay & B Shneiderman) “... computer-based visualization systems providing visual representations of datasets intended to help people carry out some task more effectively.” (T Munzner)
  49. 49. cognitive task => perceptive task
  50. 50. n = 11
           I               II              III               IV
          x       y       x       y       x       y       x       y
       10.0   8.04     10.0   9.14     10.0   7.46      8.0   6.58
        8.0   6.95      8.0   8.14      8.0   6.77      8.0   5.76
       13.0   7.58     13.0   8.74     13.0  12.74      8.0   7.71
        9.0   8.81      9.0   8.77      9.0   7.11      8.0   8.84
       11.0   8.33     11.0   9.26     11.0   7.81      8.0   8.47
       14.0   9.96     14.0   8.10     14.0   8.84      8.0   7.04
        6.0   7.24      6.0   6.13      6.0   6.08      8.0   5.25
        4.0   4.26      4.0   3.10      4.0   5.39     19.0  12.50
       12.0  10.84     12.0   9.13     12.0   8.15      8.0   5.56
        7.0   4.82      7.0   7.26      7.0   6.42      8.0   7.91
        5.0   5.68      5.0   4.74      5.0   5.73      8.0   6.80
      mean x = 9.0, variance x = 11.0, mean y = 7.5, variance y = 4.12, correlation x & y = 0.816, regression line: y = 3 + 0.5x
  51. 51. exploration explanation
  52. 52. exploration explanation; pictorial superiority effect: “information” 72hr “informa” “i” 65% 1%
  53. 53. exploration explanation J van Wijk
  54. 54. exploration explanation
  55. 55. some of the principles (taken from T Munzner): know your visual encodings; power of the plane; danger of depth; eyes beat memory; overview, zoom and filter, details on demand; overview, zoom and filter, details on demand; overview, zoom and filter, details on demand; overview, zoom and filter, details on demand; overview, zoom and filter, details on demand; ...
  56. 56. visual encoding channels: position on common scale, position on unaligned scale, 2D size, 3D size (Mackinlay)
  57. 57. “power of the plane”: position on common scale, position on unaligned scale, 2D size, 3D size
  58. 58. examples of sub-optimal encoding
  59. 59. Florence Nightingale
  60. 60. Florence Nightingale
  61. 61. Don’t believe everything you see
  62. 62. networks... <sigh>
  63. 63. same network (Martin Krzywinski)
  64. 64. different networks! (Martin Krzywinski)
  65. 65. 3D, anyone?
  66. 66. 3D, anyone? occlusion, interaction complexity, perspective distortion, text legibility
  67. 67. Gene interaction data: “A regulates B”
  68. 68. regulator, workhorse, manager
  69. 69. “lie factor” = (size of effect shown in graphic) / (size of effect in data)
  70. 70. Humanizing bioinformatics
  71. 71. Humanizing bioinformatics: there and back again; put the user back in the loop!
  72. 72. Thank you
  73. 73. Acknowledgments • graphics creators • Tamara Munzner • Martin Krzywinski
  74. 74. Image attributions ... got lost ... If you find something that’s yours, let me know!
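
Slide 50 lists four small datasets followed by one shared set of summary statistics. The table is commonly known as Anscombe's quartet. A minimal Python sketch of that point follows: it recomputes the statistics for each dataset from the numbers as transcribed on the slide. The names `QUARTET`, `summarize` and `SUMMARIES` are illustrative and not part of the talk.

```python
from math import sqrt
from statistics import mean, variance

# x values shared by datasets I, II and III, as transcribed from slide 50.
X_SHARED = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]

# Each entry maps a dataset label to its (x values, y values).
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                     7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                      6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                       6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 19.0, 8.0, 8.0, 8.0],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.80]),
}


def summarize(xs, ys):
    """Return the summary statistics quoted on the slide for one dataset."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # least-squares slope of y on x
    return {
        "mean_x": mx,
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "mean_y": my,
        "var_y": variance(ys),
        "corr": sxy / sqrt(sxx * syy),  # Pearson correlation
        "slope": slope,
        "intercept": my - slope * mx,
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four rows of numbers printed are nearly identical, which is why the differences between the datasets only show up once they are plotted: the cognitive task of comparing tables becomes a perceptive one (slide 49).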
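
Slides 39 to 42 describe running several filter pipelines on the same putative mutations and keeping only the intersection, then contrast that with "what we should have found". The sketch below illustrates the idea with Python sets. The variant identifiers, the three call sets and the `TRUTH` set are made up for illustration, and `consensus_calls` and `support_counts` are hypothetical helper names, not functions from any pipeline mentioned in the talk.

```python
from collections import Counter


def consensus_calls(call_sets):
    """Calls reported by every pipeline (the plain intersection)."""
    return set.intersection(*(set(calls) for calls in call_sets))


def support_counts(call_sets):
    """How many pipelines reported each call."""
    return Counter(call for calls in call_sets for call in set(calls))


# Hypothetical outputs of three filter pipelines with different settings.
CALLS_A = {"v1", "v2", "v3", "v4"}
CALLS_B = {"v2", "v3", "v5"}
CALLS_C = {"v3", "v4", "v6"}

# Hypothetical set of mutations that are really there.
TRUTH = {"v2", "v3", "v4"}

CONSENSUS = consensus_calls([CALLS_A, CALLS_B, CALLS_C])
MISSED = TRUTH - CONSENSUS  # real mutations the intersection throws away
SUPPORT = support_counts([CALLS_A, CALLS_B, CALLS_C])

if __name__ == "__main__":
    print("consensus:", sorted(CONSENSUS))
    print("missed by consensus:", sorted(MISSED))
    print("support per call:", dict(sorted(SUPPORT.items())))
```

In this toy setup the intersection keeps one call and drops two real ones, which is the kind of gap a user can only judge if the per-pipeline support is shown rather than hidden behind a single filtered list.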
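
Slide 69 gives the "lie factor" as the size of the effect shown in the graphic divided by the size of the effect in the data. The slide does not say how "size of effect" is measured; the sketch below assumes the relative change between two values, which is how Tufte defines it. The function name `lie_factor` and the example numbers are illustrative.

```python
def relative_change(first, second):
    """Size of effect as relative change: |second - first| / |first|."""
    if first == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect in the data.

    A value near 1 means the graphic represents the data faithfully;
    values well above 1 exaggerate the effect, values below 1 understate it.
    """
    shown = relative_change(graphic_first, graphic_second)
    actual = relative_change(data_first, data_second)
    if actual == 0:
        raise ValueError("lie factor is undefined when the data do not change")
    return shown / actual


if __name__ == "__main__":
    # Hypothetical example: the data double (10 -> 20), but the drawn
    # bars grow from 2 cm to 8 cm, i.e. they quadruple in length.
    print(lie_factor(2.0, 8.0, 10.0, 20.0))
```

For the hypothetical bars above, the graphic shows a 300% increase for a 100% increase in the data, so the lie factor is 3.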
