
Cabitza - Algorithms and Explanations@NYU Extended Version

A longer version of the slideware shown at the Algorithms and Explanations Workshop organized at the School of Law of the NYU, NYC, NY, USA, 27-28 April 2017.


  1. 1. Federico Cabitza, BSc, MEng, PhD Orthopedics Institute Galeazzi, Milano, Italy University of Milano-Bicocca, Milano, Italy
  2. 2. Federico Cabitza, BSc, MEng, PhD Orthopedics Institute Galeazzi, Milano, Italy University of Milano-Bicocca, Milano, Italy
  3. 3. © Katsuhiro Otomo what do we see when we look at machine learning applied to real life cases?
  4. 4. © Katsuhiro Otomo what do we see when we look at machine learning applied to real life cases? I see human yearning for certainty.
  5. 5. © Katsuhiro Otomo what do we see when we look at machine learning applied to real life cases? I see human yearning for certainty. Dominique El Bez Mar 23, 2017
  6. 6. © Katsuhiro Otomo what do we see when we look at machine learning applied to real life cases? I see human yearning for certainty. Nov 3, 2016
  7. 7. CREATIVE ANTINOMIES
  8. 8. ACCURACY VS. EXPLAINABILITY
  9. 9. […] the neural nets outperformed traditional methods such as logistic regression by a wide margin (the neural net had AUC=0.86 compared to 0.77 for logistic regression) [...]. Although the neural nets were the most accurate models, after careful consideration they were considered too risky for use on real patients and logistic regression was used instead. Why? Caruana R, Lou Y, Gehrke J, Koch P, Sturm M, Elhadad N. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015; 1721-1730. ACCURACY VS. EXPLAINABILITY
  10. 10. David Gunning (DARPA); Aaron M. Bornstein (Princeton Neuroscience Institute)
  11. 11. • Ribeiro, M. T., Singh, S., & Guestrin, C. (2016, August). Why Should I Trust You?: Explaining the Predictions of Any Classifier. In KDD 2016, the Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135-1144). ACM. * Caruana, R., et al. (2015). Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1721-1730). ACM. David Gunning (DARPA); Aaron M. Bornstein (Princeton Neuroscience Institute)
  12. 12. See also Lipton, Z. C. (2016). The mythos of model interpretability. arXiv preprint arXiv:1606.03490. ACCURACY VS. EXPLAINABILITY
  13. 13. • Interpretability (similar to plausibility and justifiability) is about humans understanding why. For this, you do not need to open the black box, but only to get the f. • Scrutability (aka transparency) regards humans seeing into the box, opening it: understanding what (is there inside). • Comprehensibility, ultimately, regards humans understanding how. See also Lipton, Z. C. (2016). The mythos of model interpretability. arXiv preprint arXiv:1606.03490. ACCURACY VS. EXPLAINABILITY
  14. 14. This is typical of the data scientist. It has a didactic valence and can strengthen the end-user’s self-confidence and autonomy, but it probably has low impact on (medical) decision and interpretation. ACCURACY VS. EXPLAINABILITY • Interpretability (similar to plausibility and justifiability) is about humans understanding why. For this, you do not need to open the black box, but only to get the f. • Scrutability (aka transparency) regards humans seeing into the box, opening it: understanding what (is there inside). • Comprehensibility, ultimately, regards humans understanding how.
  15. 15. This is obtained when the MD can look at the dimensions of the model and at the discriminative “regions”. ACCURACY VS. EXPLAINABILITY • Interpretability (similar to plausibility and justifiability) is about humans understanding why. For this, you do not need to open the black box, but only to get the f. • Scrutability (aka transparency) regards humans seeing into the box, opening it: understanding what (is there inside). • Comprehensibility, ultimately, regards humans understanding how.
  16. 16. This is what most MDs want and need to justify their final decision. ACCURACY VS. EXPLAINABILITY • Interpretability (similar to plausibility and justifiability) is about humans understanding why. For this, you do not need to open the black box, but only to get the f. • Scrutability (aka transparency) regards humans seeing into the box, opening it: understanding what (is there inside). • Comprehensibility, ultimately, regards humans understanding how.
  17. 17. ACCURACY VS. EXPLAINABILITY • Interpretability (similar to plausibility and justifiability) is about humans understanding why. For this, you do not need to open the black box, but only to get the f. • Scrutability (aka transparency) regards humans seeing into the box, opening it: understanding what (is there inside). • Comprehensibility, ultimately, regards humans understanding how. However, does the decision of a human doctor always have to be interpreted by others?
  18. 18. * Citron, D. (2007). "Technological Due Process," Washington University Law Review, vol. 85: 1249-1313. ACCURACY VS. EXPLAINABILITY • Interpretability (similar to plausibility and justifiability) is about humans understanding why. For this, you do not need to open the black box, but only to get the f. • Scrutability (aka transparency) regards humans seeing into the box, opening it: understanding what (is there inside). • Comprehensibility, ultimately, regards humans understanding how. Not really! The doctor has to be credible. The higher goal is the credibility of the hybrid agency (the socio-technical system), which includes the human and all of its fs.*
  19. 19. A step back…
  20. 20. Machine Learning Given a training set {(x1,t1), . . . ,(xn,tn)}, a learner produces a model f. Given a test example x, this model produces a prediction y = f(x). Let t be the true value of the predicted variable for the test example x. A loss function L(t, y) measures the cost of predicting y when the true value is t. The goal of learning can be stated as producing a model with the smallest possible loss; i.e., a model that minimizes the average L(t, y) over all xs, with each example weighted by its probability. Y = f(X) DEEP-SHALLOW | SYMBOLIC-SUBSYMBOLIC | GENERATIVE-DISCRIMINATIVE | SUPERVISED-UNSUPERVISED
  21. 21. Machine Learning Given a training set {(x1,t1), . . . ,(xn,tn)}, a learner produces a model f. Given a test example x, this model produces a prediction y = f(x). Let t be the true value of the predicted variable for the test example x. A loss function L(t, y) measures the cost of predicting y when the true value is t. The goal of learning can be stated as producing a model with the smallest possible loss; i.e., a model that minimizes the average L(t, y) over all xs, with each example weighted by its probability. Y = f(X) DEEP-SHALLOW | SYMBOLIC-SUBSYMBOLIC | GENERATIVE-DISCRIMINATIVE | SUPERVISED-UNSUPERVISED
  22. 22. Machine Learning Given a training set {(x1,t1), . . . ,(xn,tn)}, a learner produces a model f. Given a test example x, this model produces a prediction y = f(x). Let t be the true value of the predicted variable for the test example x. A loss function L(t, y) measures the cost of predicting y when the true value is t. The goal of learning can be stated as producing a model with the smallest possible loss; i.e., a model that minimizes the average L(t, y) over all xs, with each example weighted by its probability. Y = f(X) DEEP-SHALLOW | SYMBOLIC-SUBSYMBOLIC | GENERATIVE-DISCRIMINATIVE | SUPERVISED-UNSUPERVISED How reliable are the Ys and the Xs?
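The definition on this slide can be made concrete in a few lines of code. This is a minimal sketch of picking the model with the smallest average loss, under the slide's notation; the toy training set, the squared loss, and the two candidate models are invented for illustration only.

```python
# Sketch of the Y = f(X) framing: a learner picks the model f that
# minimizes the average loss L(t, y) over the training examples.

def squared_loss(t, y):
    """L(t, y): cost of predicting y when the true value is t."""
    return (t - y) ** 2

def empirical_risk(f, examples, loss=squared_loss):
    """Average loss of model f over a set of (x, t) pairs."""
    return sum(loss(t, f(x)) for x, t in examples) / len(examples)

# Toy training set {(x1, t1), ..., (xn, tn)} and two candidate models.
train = [(1, 2.1), (2, 3.9), (3, 6.2)]
f_a = lambda x: 2 * x          # candidate: y = 2x
f_b = lambda x: x + 1          # candidate: y = x + 1

# "Learning" here = choosing the candidate with the smallest empirical risk.
best = min((f_a, f_b), key=lambda f: empirical_risk(f, train))
```

In practice the minimization runs over a parametrized family of models rather than two fixed candidates, but the objective (minimize average L(t, y)) is the same.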
  23. 23. Machine Learning Y = f(X)
  24. 24. Machine Learning Y = f(X) kappa = 0.56 kappa = 0.54
  25. 25. Y = f(X) kappa = 0.56 ± 0.11 Klaus Krippendorff (1932), Gregory Bateson Professor for Cybernetics, Language, and Culture at the University of Pennsylvania, Philadelphia, USA. Krippendorff, K. (2004). Reliability in Content Analysis: Some Common Misconceptions and Recommendations. Human Communication Research, 30(3), 411-433. kappa = 0.56 kappa = 0.54
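The kappa figures on these slides (0.56, 0.54) measure inter-rater agreement corrected for chance. A hedged sketch of Cohen's kappa, implemented from its standard definition; the two toy raters below are invented, not taken from the study behind the slides.

```python
# Cohen's kappa: observed agreement corrected for the agreement
# two raters would reach by chance alone.

from collections import Counter

def cohens_kappa(rater1, rater2):
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)

# Two raters labelling the same 10 cases (toy data).
r1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "neg", "pos"]
r2 = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "pos"]
```

Here the raw agreement is 0.8, but half of it is expected by chance, so kappa comes out at 0.6: a "substantial but imperfect" ceiling that any f trained on these labels inherits.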
  26. 26. Machine Learning Given a training set {(x1,t1), . . . ,(xn,tn)}, a learner produces a model f. Given a test example x, this model produces a prediction y = f(x). Let t be the true value of the predicted variable for the test example x. A loss function L(t, y) measures the cost of predicting y when the true value is t. The goal of learning can be stated as producing a model with the smallest possible loss; i.e., a model that minimizes the average L(t, y) over all xs, with each example weighted by its probability. Y = f(X) DEEP-SHALLOW | SYMBOLIC-SUBSYMBOLIC | GENERATIVE-DISCRIMINATIVE | SUPERVISED-UNSUPERVISED ML is useless when the experts’ agreement on the right y is already high; when agreement on the y is low, it can even be harmful!
  27. 27. DATA VS. PHENOMENA
  28. 28. Data are just a way to represent facts, that is to depict how we perceive and understand them, thus to stabilize what we deem more relevant and neglect the rest. DATA VS. PHENOMENA
  29. 29. Data are just a way to represent facts, that is to depict how we perceive and understand them, thus to stabilize what we deem more relevant and neglect the rest. There is subjectivity in how we collect data, how we shape them and how we project our desire of objectivity in them. In so doing, the biases of people are not filtered off, but rather ratified in a more or less surreptitious manner. DATA VS. PHENOMENA
  30. 30. Data sets are usually given as either bitmaps or two-dimensional tables.
  31. 31. The columns represent features or dimensions; rows represent the cases.
  32. 32. However, facts change over time and we “crystallize” them at an arbitrary place in time.
  33. 33. Thus, we just datafy a small portion of the reality… (the “Field of Experience”)
  34. 34. Engineers wish the box to be full (dataset is complete), but completeness IS NOT reliability. E.g., PROMs: • Has the patient understood the item? • Has the patient been sincere? • Has her condition been sufficiently stable?
  35. 35. Moreover: there are assumptions in how we datafy our facts. For instance: • missing at random, • negligible information bias (i.e., measurement, classification, selection, observer bias), • features are independent (low or no multicollinearity) and identically distributed (IID), • standard distributions.
  36. 36. Bilgic M.: "in reality, the data is not missing at random. The fact that the cholesterol level is missing for a patient actually can be a very useful information: it could mean the test was not ordered on purpose, which could actually mean it is suspected to be either irrelevant for this task or it is assumed to be normal." Moreover: there are assumptions in how we datafy our facts. For instance: • missing at random, • negligible information bias (i.e., measurement, classification, selection, observer bias), • features are independent (low or no multicollinearity) and identically distributed (IID), • standard distributions.
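The Bilgic remark above suggests that *whether* a value is missing can itself carry signal. One simple, common way to keep that signal instead of silently imputing it away is an explicit missingness indicator; the patient records, the field names, and the default value below are invented for illustration only.

```python
# Keep the "missing on purpose?" signal by pairing each imputed value
# with an indicator feature, rather than imputing and forgetting.

patients = [
    {"age": 64, "cholesterol": 210},
    {"age": 41, "cholesterol": None},   # test not ordered: possibly on purpose
    {"age": 58, "cholesterol": 185},
]

def featurize(record, default_chol=200):
    missing = record["cholesterol"] is None
    return {
        "age": record["age"],
        "cholesterol": default_chol if missing else record["cholesterol"],
        "cholesterol_missing": int(missing),   # the indicator keeps the signal
    }

rows = [featurize(p) for p in patients]
```

A learner can then weigh `cholesterol_missing` on its own, e.g. discovering that "not ordered" correlates with "clinician assumed normal".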
  37. 37. Moreover: there are assumptions in how we datafy our facts. For instance: • missing at random, • negligible information bias (i.e., measurement, classification, selection, observer bias), • features are independent (low or no multicollinearity) and identically distributed (IID), • standard distributions. Information bias encompasses observational, measurement and classification bias. It is also about the problems of any overarching taxonomy*: temporal rigidity, dimensional paradox (either too narrow in scope or too unwieldy), disciplinary partiality and classifier subjectivity. It can also mean treating as accurate data that are intrinsically uncertain. The impact of this bias on ML training/test data sets has not yet been sufficiently considered. * Bowker, G., & Star, S. L. (1999). Sorting things out: Classification and its consequences.
  38. 38. Deming: “In God we trust; all others must bring data.” OK, once we’ve got the data, what can go wrong?
  39. 39. BIAS BY NEGLECT / MISSING VARIABLES CONFOUNDERS (LURKING VARIABLES & SIMPSON’S PARADOX) SELECTION/SAMPLE BIAS (MISSING CASES OR BIAS BY OMISSION, ALSO UNBALANCED TARGET) DATA LEAKAGE SURROGATE VARIABLES UNDER-SAMPLING BIAS
  40. 40. BIAS BY NEGLECT / MISSING VARIABLES CONFOUNDERS (LURKING VARIABLES & SIMPSON’S PARADOX) SELECTION/SAMPLE BIAS (MISSING CASES OR BIAS BY OMISSION, ALSO UNBALANCED TARGET) DATA LEAKAGE SURROGATE VARIABLES UNDER-SAMPLING BIAS “Computer interpretations of electrocardiograms recorded 1 minute apart were significantly (grossly) different in 4 of 10 cases.” Spodick, D. H., & Bishop, R. L. (1997). Computer treason: intraobserver variability of an electrocardiographic computer system. The American journal of cardiology, 80(1), 102-103.
  41. 41. BIAS BY NEGLECT / MISSING VARIABLES CONFOUNDERS (LURKING VARIABLES & SIMPSON’S PARADOX) SELECTION/SAMPLE BIAS (MISSING CASES OR BIAS BY OMISSION, ALSO UNBALANCED TARGET) DATA LEAKAGE SURROGATE VARIABLES UNDER-SAMPLING BIAS Paxton et al.: “Learning and evaluating early prediction systems using observational data alone encounters three main challenges: 1) Incomplete Observation; 2) Selection Bias; 3) Confounding Medical Interventions (CMIs, which are interventions performed by the caregivers that will affect the risk of the outcome of interest)”. Paxton, C., Saria, S., & Niculescu-Mizil, A. (2013). Developing predictive models using electronic medical records: challenges and pitfalls. In AMIA.
  42. 42. BIAS BY NEGLECT / MISSING VARIABLES CONFOUNDERS (LURKING VARIABLES & SIMPSON’S PARADOX) SELECTION/SAMPLE BIAS (MISSING CASES OR BIAS BY OMISSION, ALSO UNBALANCED TARGET) DATA LEAKAGE OBSERVER VARIABILITY/ INFORMATION BIAS LABEL BIAS* SURROGATE VARIABLES UNDER-SAMPLING BIAS Frénay & Verleysen: “label noise consists of mislabeled instances [...] it is an important issue in classification, with many potential negative consequences.” * Frénay, B., & Verleysen, M. (2014). Classification in the presence of label noise: a survey. IEEE Transactions on Neural Networks and Learning Systems, 25(5), 845-869.
  43. 43. BIAS BY NEGLECT / MISSING VARIABLES CONFOUNDERS (LURKING VARIABLES & SIMPSON’S PARADOX) SELECTION/SAMPLE BIAS (MISSING CASES OR BIAS BY OMISSION, ALSO UNBALANCED TARGET) DATA LEAKAGE SURROGATE VARIABLES UNDER-SAMPLING BIAS OBSERVER VARIABILITY/ INFORMATION BIAS LABEL BIAS* Label bias is not due to fallible raters or observers. It is intrinsic to some ambiguous phenomena, like most in medicine.
  44. 44. Epistemic sclerosis Y = f(X)
  45. 45. Epistemic sclerosis A cognitive bias by which people begin to think that the relationship between X and Y is true and stable, although it is likely neither. Y = f(X)
  46. 46. Epistemic sclerosis A cognitive bias by which people begin to think that the relationship between X and Y is true and stable. Y = f(X)
  47. 47. Epistemic sclerosis Y = f(X)
  48. 48. Beware that the explainability of ML algorithms can even reinforce this bias! Epistemic sclerosis
  49. 49. NO LEDGE VS. KNOWLEDGE
  50. 50. NO LEDGE VS. KNOWLEDGE ML allows for the automation of many tasks that in the past required knowledge: learning, trying, erring, practicing, knowledge sharing. Automation can lead users to become over-reliant on it and “to tend to over-accept computer output”* [so as to] “reduce vigilance in information seeking and processing”**. This phenomenon is called “automation bias” and its impact on human knowledge-related practices is still to be studied in the case of ML aids outperforming human experts. * Goddard, K., Roudsari, A., & Wyatt, J. C. (2012). Automation bias: a systematic review of frequency, effect mediators, and mitigators. Journal of the American Medical Informatics Association, 19(1), 121-127. ** Lyell, D., & Coiera, E. (2016). Automation bias and verification complexity: a systematic review. Journal of the American Medical Informatics Association, ocw105.
  51. 51. AUTOMATION BIAS Automation bias occurs when human decisions are not based on a thorough analysis of all available information but are strongly biased by the automatically generated advice. • Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors: The Journal of the Human Factors and Ergonomics Society, 52(3), 381-410.
  52. 52. AUTOMATION BIAS Automation bias occurs when human decisions are not based on a thorough analysis of all available information but are strongly biased by the automatically generated advice. • Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors: The Journal of the Human Factors and Ergonomics Society, 52(3), 381-410. Mosier and Skitka (1996) defined automation bias as resulting from people’s using the outcome of the decision aid “as a heuristic replacement for vigilant information seeking and processing”
  53. 53. AUTOMATION BIAS Automation bias occurs when human decisions are not based on a thorough analysis of all available information but are strongly biased by the automatically generated advice. • Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors: The Journal of the Human Factors and Ergonomics Society, 52(3), 381-410. Mosier and Skitka (1996) defined automation bias as resulting from people’s using the outcome of the decision aid “as a heuristic replacement for vigilant information seeking and processing” “Automation bias occurs in both naive and expert participants, cannot be prevented by training or instructions, and can affect decision making in individuals as well as in teams.” *
  54. 54. AUTOMATION BIAS Errors of omission, when users do not respond to a critical situation, because they do not perceive it as such (no alert) or because they no longer know what to do. Errors of commission, when users follow specific recommendations or directives provided by an aid even when they go against available facts (cf. overdiagnosis).
  55. 55. AUTOMATION BIAS Possible causes (from Parasuraman & Manzey, 2010): Humans are “cognitive misers” (Wickens & Hollands, 2000). Perceived trust of humans in automated aids as powerful agents with superior analysis capability (Lee & See, 2004). “Social loafing”: humans consider aids as members of their group and reduce their own effort because they are made partly redundant on a given task (Karau & Williams, 1993).
  56. 56. from a Paul Stevenson picture (https://goo.gl/o79xnF) How to move forward?
  57. 57. from a Paul Stevenson picture (https://goo.gl/o79xnF) How to move forward? Different validation More Interaction
  58. 58. Effective use of health technology depends on knowing its limits.* John Mandrola, MD Mandrola J., (2017, 03/29) Fake Atrial Fibrillation — A Growing Patient-Safety Issue. URL: http://www.drjohnm.org/2017/03/fake-atrial-fibrillation-a-growing-patient-safety-issue/. Last Accessed 04/03/2017. Archived at: http://archive.is/sGMtO Different validation
  59. 59. IN LABORATORIO VS. IN LABORE
  60. 60. ML models are always evaluated in terms of error rates or performance figures, i.e., in the virtual domain of symbolic computation. But not all mistakes are equally wrong. IN LABORATORIO VS. IN LABORE
  61. 61. ML models are always evaluated in terms of error rates or performance figures, i.e., in the virtual domain of symbolic computation. But not all mistakes are equally wrong. From: https://en.wikipedia.org/wiki/Receiver_operating_characteristic (ROC plot: x-axis False Positive Rate; y-axis Recall, True Positive Rate) IN LABORATORIO VS. IN LABORE
  62. 62. ML models are always evaluated in terms of error rates or performance figures, i.e., in the virtual domain of symbolic computation. But not all mistakes are equally wrong. From: https://en.wikipedia.org/wiki/Receiver_operating_characteristic AVOIDING FALSE NEGATIVES AVOIDING FALSE POSITIVES IN LABORATORIO VS. IN LABORE
  63. 63. ML models are always evaluated in terms of error rates or performance figures, i.e., in the virtual domain of symbolic computation. But not all mistakes are equally wrong. From: https://en.wikipedia.org/wiki/Receiver_operating_characteristic AVOIDING FALSE NEGATIVES AVOIDING FALSE POSITIVES IN LABORATORIO VS. IN LABORE
  64. 64. ML models are always evaluated in terms of error rates or performance figures, i.e., in the virtual domain of symbolic computation. But not all mistakes are equally wrong. From: https://en.wikipedia.org/wiki/Syndrome_Without_A_Name AVOIDING FALSE NEGATIVES AVOIDING FALSE POSITIVES IN LABORATORIO VS. IN LABORE NOT TO MENTION REAL BORDER-LINE CASES, AND SWANS (SYNDROMES WITHOUT A NAME). WHAT’S THE BEST COMPROMISE IN THESE CASES?
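The point of these slides, that not all mistakes are equally wrong, can be operationalized: instead of picking a decision threshold for raw accuracy, pick it for expected cost, with a false negative (missed disease) priced higher than a false positive (unnecessary workup). A minimal sketch; the scores, labels, and cost ratio below are invented for illustration only.

```python
# Choose a classification threshold by expected cost, not raw accuracy.

def expected_cost(threshold, scored_cases, c_fn, c_fp):
    """scored_cases: (model_score, true_label) pairs; label 1 = diseased."""
    cost = 0
    for score, label in scored_cases:
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 0:
            cost += c_fn   # false negative: missed case
        elif label == 0 and predicted == 1:
            cost += c_fp   # false positive: unnecessary workup
    return cost

cases = [(0.9, 1), (0.7, 1), (0.6, 0), (0.4, 1), (0.3, 0), (0.1, 0)]
# With false negatives priced 10x higher, the best threshold shifts downward.
best_t = min([0.2, 0.5, 0.8],
             key=lambda t: expected_cost(t, cases, c_fn=10, c_fp=1))
```

With these numbers the low threshold 0.2 wins (cost 2, all false positives) over 0.8 (cost 20, two missed cases): the "best compromise" the slide asks about is a cost judgment, not a purely statistical one.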
  65. 65. AI VS. IA
  66. 66. To our knowledge, all articles compare ML classifiers with human diagnosticians. Have you ever read a paper where the comparison regards the accuracy of human physicians with and without the ML aid? AI VS. IA
  67. 67. To our knowledge, all articles compare ML classifiers with human diagnosticians. Have you ever read a paper where the comparison regards the accuracy of human physicians with and without the ML aid? The collective imaginary is about Artificial Intelligence (AI) against us, not about human Intelligence Augmentation (IA) thanks to the machines. AI VS. IA
  68. 68. To our knowledge, all articles compare ML classifiers with human diagnosticians. Have you ever read a paper where the comparison regards the accuracy of human physicians with and without the ML aid? The collective imaginary is about Artificial Intelligence (AI) against us, not about human Intelligence Augmentation (IA) thanks to the machines. AI VS. IA Toh, M. (2017, 03/02) Google and IBM: We Want Artificial Intelligence to Help You, Not Replace You. Fortune. URL: http://fortune.com/2017/03/02/google-ibm-artificial-intelligence/. Last Accessed on 04/02/2017. Archived at: http://archive.is/FLoMO
  69. 69. To our knowledge, all articles compare ML classifiers with human diagnosticians. Have you ever read a paper where the comparison regards the accuracy of human physicians with and without the ML aid? The collective imaginary is about Artificial Intelligence (AI) against us, not about human Intelligence Augmentation (IA) thanks to the machines. AI VS. IA IBM advertisement, December 1951 THE AMBIGUITY OF THE TERM “EXTRA”
  70. 70. CYBORG VS. CYBORK However, we are not envisioning a new medical centaur, half human and half computational. We rather believe that automation must be limited to automating productive work, while humans must be left free to enrich their mutual relationships (e.g. sentimental work*). It is not a new human being to come (cyborg) but rather a new organization of human work (cybork). * Strauss, A., Fagerhaugh, S., Suczek, B., & Wiener, C. (1982). Sentimental work in the technologized hospital. Sociology of Health & Illness, 4(3), 254-278.
  71. 71. More Interaction John Mandrola, MD Almost anyone can implant a pacemaker or do an ablation; the hard part of medicine is to align the right treatment with the goals of the person*. * Mandrola J., (2017, 03/29) Fake Atrial Fibrillation — A Growing Patient-Safety Issue. URL: http://www.drjohnm.org/2017/03/fake-atrial-fibrillation-a-growing-patient-safety-issue/. Last Accessed 04/03/2017. Archived at: http://archive.is/sGMtO
  72. 72. DECISION VS. CHOICE ML models are always evaluated in terms of error rates or performance figures, i.e., in the virtual domain of symbolic computation. ML models applied to medicine should instead be evaluated in terms of value, i.e., the ratio between benefits (cost savings, throughput, clinical outcome measures) and costs.
  73. 73. DECISION VS. CHOICE In converting dichotomous (or anyway discrete) decisions into situated action, we make choices.
  74. 74. DECISION VS. CHOICE In converting dichotomous (or anyway discrete) decisions into situated action, we make choices. In so doing, MDs must cope with patients’ fears, preferences and expectations, as well as with other policy- and context-related constraints (e.g., warfarin: which side effect do we dread more, stroke or haemorrhage?).
  75. 75. DECISION VS. CHOICE In converting dichotomous (or anyway discrete) decisions into situated action, we make choices. In so doing, MDs must cope with patients’ fears, preferences and expectations, as well as with other policy- and context-related constraints (e.g., warfarin: which side effect do we dread more, stroke or haemorrhage?). We need to recognize the intrinsic uncertainty of human existence, and of the work unfolding around its events, like medicine*. In this respect, medicine should still be seen as the art of choice, rather than a science of decision. * Simpkin, A. L., & Schwartzstein, R. M. (2016). Tolerating uncertainty—the next medical revolution?. New England Journal of Medicine, 375(18), 1713-1715.
  76. 76. DECISION VS. CHOICE In converting dichotomous (or anyway discrete) decisions into situated action, we make choices. In so doing, MDs must cope with patients’ fears, preferences and expectations, as well as with other policy- and context-related constraints (e.g., warfarin: which side effect do we dread more, stroke or haemorrhage?). We need to recognize the intrinsic uncertainty of human existence, and of the work unfolding around its events, like medicine*. In this respect, medicine should still be seen as the art of choice, rather than a science of decision. Hence the importance of looking at the whole interaction and of cultivating a prudent socio-technical stance among all the stakeholders.
  77. 77. POWER IS NOTHING WITHOUT CONTROL.
  78. 78. POWER IS NOTHING WITHOUT CONTROL. ACCURACY INTERACTION
  79. 79. POWER IS NOTHING WITHOUT CONTROL. ACCURACY INTERACTION INTERACTIVE MACHINE LEARNING Machine learning with a human in the learning loop, observing the result of learning and providing input meant to improve the learning outcome. Brad Knox, http://iml.media.mit.edu/ (see also Holzinger 2016)
  80. 80. POWER IS NOTHING WITHOUT CONTROL. ACCURACY INTERACTION INTERACTIVE MACHINE LEARNING More interaction with the learner (data scientists often refer to feature engineering) See also: Holzinger, A. (2016). Interactive machine learning for health informatics: when do we need the human-in-the-loop?. Brain Informatics, 3(2), 119-131.
  81. 81. INTERACTIVE MACHINE LEARNING More interaction with the learner More interaction with the model The need to let the medical specialist tweak the model and see how it represents the data, how it performs in predictive tasks and how it can evolve. See also: Sacha, D., Sedlmair, M., Zhang, L., Lee, J. A., Weiskopf, D., North, S., & Keim, D. (2016, August). Human-centered machine learning through interactive visualization. ESANN.
  82. 82. INTERACTIVE MACHINE LEARNING More interaction with the learner More interaction with the model More interaction with the people The need to open up the “loop” : the Machine Learning Decision Support System as a socio-technical system where radiologists, referring doctors, data scientists, and engineers interact to make sense of predictions and yield value into medical practice.
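The "human in the learning loop" idea named in these slides can be sketched as uncertainty sampling: the learner asks the human for labels of the cases it is least sure about, then refits. Everything below is invented for illustration, not from any of the cited systems: the 1-D threshold "model", the simulated expert, and the query budget.

```python
# Minimal human-in-the-loop sketch: the model queries the case closest to
# its current decision boundary, the "expert" labels it, the model refits.

class ThresholdModel:
    def __init__(self):
        self.threshold = 0.5
    def fit(self, labelled):
        # Refit: midpoint between the highest 0-case and the lowest 1-case.
        zeros = [x for x, y in labelled if y == 0]
        ones = [x for x, y in labelled if y == 1]
        if zeros and ones:
            self.threshold = (max(zeros) + min(ones)) / 2
    def predict(self, x):
        return 1 if x >= self.threshold else 0

def expert(x):
    return 1 if x >= 0.42 else 0   # the "true" rule only the human knows

model, labelled = ThresholdModel(), []
pool = [i / 10 for i in range(10)]          # unlabelled cases 0.0 .. 0.9
for _ in range(3):
    # Query the most uncertain case: the one nearest the boundary.
    query = min(pool, key=lambda x: abs(x - model.threshold))
    pool.remove(query)
    labelled.append((query, expert(query)))
    model.fit(labelled)                      # retrain with the correction
```

After three queries the boundary has moved from the arbitrary 0.5 toward the expert's 0.42, using only three labels instead of ten: the interaction, not the data volume, does the work.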
  83. 83. Machine learning as a technology upon data. But what did Plato write on the technology of data, 2387 years ago?
  84. 84. Machine learning as a technology upon data. But what did Plato write on the technology of data, 2387 years ago? This is Theuth, the ancient Egyptian god of (among other things) logismòs (λογισμός), i.e., the art of computing and reasoning by numbers and symbols (grammata).
  85. 85. because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them. You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of competence, not true competence, for they will read many things without instruction and will therefore seem to know many things, when they are for the most part ignorant and hard to get along with, since they are not expert, but only appear to be expert. Plato Phaedrus 274e-275b “This invention, O king,” said Theuth, “will make the Egyptians wiser and will improve their memories; for it is an elixir of memory and competence that I have discovered.” But Thamus replied, “Most ingenious Theuth, one man has the ability to beget arts, but the ability to judge of their usefulness or harmfulness to their users belongs to another; and now you, who are the father of letters, have been led by your affection to ascribe to them a power the opposite of that which they really possess. For this invention will produce forgetfulness in the minds of those who learn to use it,
  86. 86. because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them. You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of competence, not true competence, for they will read many things without instruction and will therefore seem to know many things, when they are for the most part ignorant and hard to get along with, since they are not expert, but only appear to be expert.δοξόσοφοι ἀντὶ σοφῶν βλάβης τε καὶ ὠφελίας μνήμης τε γὰρ καὶ σοφίας φάρμακον ηὑρέθη Plato Phaedrus 274e-275b “This invention, O king,” said Theuth, “will make the Egyptians wiser and will improve their memories; for it is an elixir of memory and competence that I have discovered.” But Thamus replied, “Most ingenious Theuth, one man has the ability to beget arts, but the ability to judge of their usefulness or harmfulness to their users belongs to another; and now you, who are the father of letters, have been led by your affection to ascribe to them a power the opposite of that which they really possess. For this invention will produce forgetfulness in the minds of those who learn to use it, πίστιν
  87. 87. because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them. You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of competence, not true competence, for they will read many things without instruction and will therefore seem to know many things, when they are for the most part ignorant and hard to get along with, since they are not expert, but only appear to be expert. Plato Phaedrus 274e-275b “This invention, O king,” said Theuth, “will make the Egyptians wiser and will improve their memories; for it is an elixir of memory and competence that I have discovered.” But Thamus replied, “Most ingenious Theuth, one man has the ability to beget arts, but the ability to judge of their usefulness or harmfulness to their users belongs to another; and now you, who are the father of letters, have been led by your affection to ascribe to them a power the opposite of that which they really possess. For this invention will produce forgetfulness in the minds of those who learn to use it, [...] you might think written words spoke as if they had intelligence, but if you question them, wishing to know about their sayings, they always say only one and the same thing.σημαίνει μόνον ταὐτὸν ἀεί
  88. 88. because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them. You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of competence, not true competence, for they will read many things without instruction and will therefore seem to know many things, when they are for the most part ignorant and hard to get along with, since they are not expert, but only appear to be expert. Plato Phaedrus 274e-275b “This invention, O king,” said Theuth, “will make the Egyptians wiser and will improve their memories; for it is an elixir of memory and competence that I have discovered.” But Thamus replied, “Most ingenious Theuth, one man has the ability to beget arts, but the ability to judge of their usefulness or harmfulness to their users belongs to another; and now you, who are the father of letters, have been led by your affection to ascribe to them a power the opposite of that which they really possess. For this invention will produce forgetfulness in the minds of those who learn to use it, Thus, explainability alone may not be enough: we also need a greater awareness of automation bias (above all epistemic sclerosis), the adoption of an evidence- and value-oriented assessment of ML, and a shift of focus from computation to interaction.
  89. 89. Does the current use of ML power enable such a revolution or rather hinder it?
  90. 90. THANKS! Thanks to Carlo Batini, Angela Locoro, Gian Franco Gensini, Camilla Alderighi and, above all, Raffaele Rasoini for their valuable comments and advice on these points. For any comments, please write me a message on ResearchGate, LinkedIn or email: cabitza @ disco.unimib.it
  91. 91. These slides are not intended nor produced to be made publicly available. However, I share them with a limited readership (those who possess the Web address where the slides are stored) in an effort to disseminate scientific ideas and reflections, with no intended aim of economic gain, and to get feedback and comments on those ideas, which I express under a CC-BY 4.0 license. I took the greatest possible care to identify all image copyright holders correctly. However, if I have omitted to do so in some individual instances, I would be most grateful if these copyright holders would inform me forthwith. It is my policy to immediately remove, upon notification and identification, any specific image displayed on this website for which the copyright holder deems the fair use (see below) cannot be claimed. Upon request, I will remove the content immediately and update the content accordingly. FAIR USE NOTICE These slides may contain copyrighted (©) material the use of which has not always been specifically authorized by the copyright owner. Such material is made available to advance understanding of ecological, political, historical, human rights, economic, artistic expression, democracy, scientific, moral, ethical, and social justice issues, etc. It is believed that this constitutes a ’fair use’ of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, this material is distributed without profit to those who have expressed a prior general interest in receiving similar information for research and educational purposes. ** COPYRIGHT NOTICE** In accordance with Title 17 U.S.C. Se
