Machine Reasoning at A2I2, Deakin University

  1. Machine reasoning @A2I2. A/Prof Truyen Tran; Vuong Le, Thao Le, Hung Le, Dung Nguyen, Tin Pham. Deakin University, Geelong, April 2020 (slides dated 24/06/2020). @truyenoz truyentran.github.io truyen.tran@deakin.edu.au letdataspeak.blogspot.com goo.gl/3jJ1O0 linkedin.com/in/truyen-tran
  2. Agenda: The problem; Dual-process theories; Visual reasoning; Neural memories; Social reasoning; The road ahead
  3. Source: PwC AI trajectory
  4. "Among the most challenging scientific questions of our time are the corresponding analytic and synthetic problems: • How does the brain function? • Can we design a machine which will simulate a brain?" -- Automata Studies, 1956.
  5. What does the brain do? Perceive the world. Conceptualize. Reason about things. Imagine the future. Plan what to do. Act. Learn. Think about self. Have emotion. Be conscious. Love. Be happy and pursue happiness. Be ethical. Socialize. Which ones are Turing computational?
  6. 24/06/2020 6 #REF: A Cognitive Architecture Approach to Interactive Task Learning, John E. Laird, University of Michigan
  7. 1960s-1990s: Hand-crafting rules, domain-specific, logic-based. High in reasoning. Can't scale. Fails on unseen cases. 1990s-present: Machine learning, general purpose, statistics-based. Low in reasoning. Needs lots of data. Less adaptive. Little explanation. 2020s-2030s: Learning + reasoning, general purpose, human-like. Has contextual and common-sense reasoning. Requires less data. Adapts to change. Explainable. Photo credit: DARPA
  8. Many facets of reasoning Analogical Relational Inductive Deductive Abductive Judgemental Causal Legal Scientific Moral Social Visual Lingual Medical Musical Problem solving Theorem proving One-shot learning Zero-shot learning Counterfactual
  9. Learning vs Reasoning. Learning is to improve itself by experiencing ~ acquiring knowledge & skills. Reasoning is to deduce knowledge from previously acquired knowledge in response to a query (or cues). Early theories of intelligence (a) focus solely on reasoning, and (b) assume learning can be added separately and later (Khardon & Roth, 1997). Does learning precede reasoning, or do the two interact? Can reasoning be learnt? Are learning and reasoning indeed different facets of the same mental/computational process? Khardon, Roni, and Dan Roth. "Learning to reason." Journal of the ACM (JACM) 44.5 (1997): 697-725. (Dan Roth; ACM Fellow; IJCAI John McCarthy Award)
  10. Bottou's vision. Reasoning is not necessarily achieved by making logical inferences. There is continuity between algebraically rich inference and connecting together trainable learning systems. Central to reasoning are composition rules that guide the combination of modules to address new tasks. "When we observe a visual scene, when we hear a complex sentence, we are able to explain in formal terms the relation of the objects in the scene, or the precise meaning of the sentence components. However, there is no evidence that such a formal analysis necessarily takes place: we see a scene, we hear a sentence, and we just know what they mean. This suggests the existence of a middle layer, already a form of reasoning, but not yet formal or logical." Bottou, Léon. "From machine learning to machine reasoning." Machine learning 94.2 (2014): 133-149.
  11. Learning to reason, formal def 24/06/2020 11 Khardon, Roni, and Dan Roth. "Learning to reason." Journal of the ACM (JACM) 44.5 (1997): 697-725. E.g., given a video f, determines if the person with the hat turns before singing.
  12. A testbed: Visual QA. Question: What color is the thing with the same size as the blue cylinder? Predicted answer: blue. • The network guessed the most common color in the image. • Linguistic bias. • Requires multi-step reasoning: find blue cylinder ➔ locate other object of the same size ➔ determine its color (green).
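The three-step chain on this slide can be sketched as a tiny symbolic program. This is only an illustration: the scene below, the attribute names, and the helper functions are hypothetical and not taken from the talk. It contrasts the multi-step answer with the "most common color" shortcut that the slide says the network fell back on.

```python
# Hypothetical scene: each object is a dict of attributes (not from the slides).
scene = [
    {"shape": "cylinder", "color": "blue", "size": "large"},
    {"shape": "sphere", "color": "green", "size": "large"},
    {"shape": "cube", "color": "blue", "size": "small"},
    {"shape": "cube", "color": "blue", "size": "small"},
]


def find(objects, **attrs):
    """Step 1: select the objects that match every given attribute value."""
    return [o for o in objects if all(o[k] == v for k, v in attrs.items())]


def same_size_as(objects, anchor):
    """Step 2: select the other objects that share the anchor's size."""
    return [o for o in objects if o is not anchor and o["size"] == anchor["size"]]


def answer_color(objects):
    """Chain the steps: find blue cylinder, locate same-size object, read its color."""
    anchor = find(objects, shape="cylinder", color="blue")[0]
    target = same_size_as(objects, anchor)[0]
    return target["color"]  # Step 3


def majority_color(objects):
    """The shortcut: ignore the question and return the most common color."""
    colors = [o["color"] for o in objects]
    return max(set(colors), key=colors.count)


reasoned = answer_color(scene)
shortcut = majority_color(scene)
print(reasoned, shortcut)
```

On this toy scene the chained program and the shortcut disagree, which is the failure mode the slide points at.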
  13. 13Adapted from (Aditya et al., 2019) Events, Activities Objects, Regions, Trajectories Scene Elements: Volumes, 3D Surfaces Image Elements: Regions, Edges, Textures Knowledge about shapes Qualitative spatial reasoning Relations between objects Commonsense knowledge Relations between objects Why Visual QA?
  14. Agenda: The problem; Dual-process theories; Visual reasoning; Neural memories; Social reasoning; The road ahead
  15. Dual-process system with memories 15 System 1: Intuitive System 1: Intuitive System 1: Intuitive • Fast • Implicit/automatic • Pattern recognition • Multiple System 2: Analytical • Slow • Deliberate/rational • Careful analysis • Single, sequential Facts Semantics Events and relational associations Working space – temporal buffer Single
  16. 24/06/2020 16 Evans, Jonathan St BT. "Dual-processing accounts of reasoning, judgment, and social cognition." Annu. Rev. Psychol. 59 (2008): 255-278.
  17. System 2 Holds hypothetical thought Decoupling from representation Working memory size is not essential. Its attentional control is. Computation selection? 24/06/2020 17
  18. Better: A dual tri-process theory Stanovich, K. E. (2009). Distinguishing the reflective, algorithmic, and autonomous minds: Is it time for a tri-process theory. In two minds: Dual processes and beyond, 55-88. 24/06/2020 18 Photo credit: mumsgrapevine
  19. Agenda: The problem; Dual-process theories; Visual reasoning; Neural memories; Social reasoning; The road ahead
  20. Visual recognition is a (solvable) special case 20 Image courtesy: https://dcist.com/ What is this? a cat R-CNN, YOLO,… Object recognition Object detection Where are objects, and what are they?
  21. But Image QA is challenging. (CLEVR, Johnson et al., 2017) (Q) How many objects are either small cylinders or metal things? (Q) Are there an equal number of large things and metal spheres? (GQA, Hudson et al., 2019) (Q) What is the brown animal sitting inside of? (Q) Is there a bag to the right of the green door?
  22. And Video QA is even more challenging 22(Data: TGIF-QA)
  23. 23 Reasoning Computer Vision Natural Language Processing Machine learning Visual Reasoning Visual QA is a great testbed
  24. Reasoning over visual modality • Context:  Videos/Images: compositional data  Hierarchy of visual information: objects, actions, attributes, relations, commonsense knowledge.  Relation of abstract objects: spatial, temporal, semantic aspects. • Research questions:  How to represent object/action relations in images and videos?  How to reason with object/action relations? 24
  25. A simple VQA framework that works surprisingly well 25 Visual representation Question What is the brown animal sitting inside of? Textual representation Fusion Predict answer Answer box CNN Word embedding + LSTM
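The pipeline on this slide (visual representation, textual representation, fusion, answer prediction) can be sketched in a few lines. This is a minimal sketch under loud assumptions: random vectors stand in for CNN features, an average of random word vectors stands in for word embedding + LSTM, fusion is an element-wise product (the slide does not say which fusion is used), and the answer vocabulary is hypothetical. Nothing here is trained.

```python
import math
import random

random.seed(0)
DIM = 4
ANSWERS = ["box", "basket", "chair"]  # hypothetical answer vocabulary


def rand_vec(n):
    return [random.uniform(-1, 1) for _ in range(n)]


# Visual representation: stand-in for CNN image features.
visual = rand_vec(DIM)

# Textual representation: stand-in for word embedding + LSTM
# (here simply the average of random word vectors).
question = "what is the brown animal sitting inside of".split()
embed = {w: rand_vec(DIM) for w in sorted(set(question))}
textual = [sum(embed[w][i] for w in question) / len(question) for i in range(DIM)]

# Fusion: element-wise product of the two representations (an assumption).
fused = [v * t for v, t in zip(visual, textual)]

# Answer prediction: one linear layer followed by a softmax over answers.
W = [rand_vec(DIM) for _ in ANSWERS]
logits = [sum(w_i * f_i for w_i, f_i in zip(w, fused)) for w in W]
m = max(logits)
exps = [math.exp(l - m) for l in logits]
probs = [e / sum(exps) for e in exps]
predicted = ANSWERS[probs.index(max(probs))]
print(predicted, probs)
```

With untrained weights the predicted answer is arbitrary; the point is only the data flow from two modalities through a single fusion step to a classifier.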
  26. 24/06/2020 26 A Prelim Dual-System Model Le, Thao Minh, Vuong Le, Svetha Venkatesh, and Truyen Tran. "Neural Reasoning, Fast and Slow, for Video Question Answering." IJCNN (2020).
  27. Separate reasoning process from perception 27  Video QA: inherent dynamic nature of visual content over time.  Recent success in visual reasoning with multi-step inference and handling of compositionality. System 1: visual representation System 2: High- level reasoning
  28. 24/06/2020 28 Separate reasoning process from perception (2)
  29. System 1: Clip-based Relation Network. For $k = 2, 3, \dots, K$, where $h_\phi$ and $g_\theta$ are linear transformations with parameters $\phi$ and $\theta$, respectively, for feature fusion. Why temporal relations? • Situate an event/action in relation to events/actions in the past and formulate hypotheses on future events. • Long-range sequential modeling.
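The slide's equation did not survive in this transcript, so the following is only a sketch of one common relation-network form that matches the surviving text: for each order k, apply a linear map g_theta to every temporally ordered subset of k clips, sum the results, then apply a linear map h_phi. The subset enumeration, the concatenation, the dimensions, and the random parameters are all assumptions, not the authors' exact formulation.

```python
import random
from itertools import combinations

random.seed(0)
D = 3  # clip feature size (hypothetical)
K = 3  # largest relation order (hypothetical)


def linear(weights, x):
    """Apply a linear transformation given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Four clip feature vectors in temporal order (stand-ins for real clip features).
clips = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(4)]


def k_clip_relation(clips, k, g_theta, h_phi):
    """Sum g_theta over all ordered k-subsets of clips, then apply h_phi."""
    total = [0.0] * D
    for subset in combinations(clips, k):  # combinations keeps temporal order
        concat = [x for clip in subset for x in clip]
        g_out = linear(g_theta, concat)
        total = [t + g for t, g in zip(total, g_out)]
    return linear(h_phi, total)


relations = {}
for k in range(2, K + 1):
    g_theta = rand_matrix(D, k * D)  # separate parameters per order k
    h_phi = rand_matrix(D, D)
    relations[k] = k_clip_relation(clips, k, g_theta, h_phi)

print(relations)
```

Because the subsets span clips that may be far apart in time, each order k captures relations beyond adjacent clips, which is one way to read the slide's "long-range sequential modeling" point.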
  30. System 2: MAC Net 24/06/2020 30 Hudson, Drew A., and Christopher D. Manning. "Compositional attention networks for machine reasoning." ICLR 2018.
  31. Results on SVQA dataset. Column groups: Integer Comparison (More, Equal, Less); Attribute Comparison (Color, Size, Type, Dir, Shape); Query (Color, Size, Type, Dir, Shape).
      Model      Exist  Count | More   Equal  Less  | Color  Size   Type   Dir    Shape | Color  Size   Type   Dir    Shape | All
      SA(S)      51.7   36.3  | 72.7   54.8   58.6  | 52.2   53.6   52.7   53.0   52.3  | 29.0   54.0   55.7   38.1   46.3  | 43.1
      TA-GRU(T)  54.6   36.6  | 73.0   57.3   57.7  | 53.8   53.4   54.8   55.1   52.4  | 22.0   54.8   55.5   41.7   42.9  | 44.2
      SA+TA      52.0   38.2  | 74.3   57.7   61.6  | 56.0   55.9   53.4   57.5   53.0  | 23.4   63.3   62.9   43.2   41.7  | 44.9
      CRN+MAC    72.8   56.7  | 84.5   71.7   75.9  | 70.5   76.2   90.7   75.9   57.2  | 76.1   92.8   91.0   87.4   85.4  | 75.8
  32. Results on TGIF-QA dataset.
      Model      Action  Trans.  FrameQA  Count
      ST-TP      62.9    69.4    49.5     4.32
      Co-Mem     68.2    74.3    51.5     4.10
      PSAC       70.4    76.9    55.7     4.27
      CRN+MAC    71.3    78.7    59.2     4.23
  33. 24/06/2020 33 A System 1 for Video Thao Minh Le, Vuong Le, Svetha Venkatesh, and Truyen Tran, “Hierarchical conditional relation networks for video question answering”, CVPR'20. System 1: visual representation System 2: High-level reasoning
  34. Conditional Relation Network Unit. Problems: • Lack of a generic mechanism for modeling the interaction of multimodal inputs. • Flaws in temporal relations of breaking local motion in videos. Inputs: • An array of $n$ objects • Conditioning feature. Output: • An array of $m$ objects
  35. Hierarchical Conditional Relation Networks 35
  36. Results. TGIF-QA dataset:
      Model      Action  Trans.  FrameQA  Count
      ST-TP      62.9    69.4    49.5     4.32
      Co-Mem     68.2    74.3    51.5     4.10
      PSAC       70.4    76.9    55.7     4.27
      HME        73.9    77.8    53.8     4.02
      HCRN       75.0    81.4    55.9     3.82
      MSVD-QA and MSRVTT-QA datasets: (figure not reproduced in this transcript)
  37. 24/06/2020 37 A System 2 for Relational Reasoning Thao Minh Le, Vuong Le, Svetha Venkatesh, and Truyen Tran, “Dynamic Language Binding in Relational Visual Reasoning”, IJCAI’20. System 1: visual representation System 2: High-level reasoning
  38. Reasoning with structured representation of spatial relation. • Key insight: Reasoning is chaining of relational predicates to arrive at a final conclusion → Needs to uncover spatial relations, conditioned on query → Chaining is query-driven → Objects/language need binding → Object semantics is query-dependent → Everything is end-to-end differentiable
  39. 39 Language-binding Object Graph Network for VQA
  40. 40 Language-binding Object Graph Unit (LOG)
  42. Results in CLEVR dataset 42
  43. Comparison on the CLEVR dataset at different data fractions (figure not reproduced in this transcript). Performance comparison on VQA v2 subset of long questions:
      Model           Action
      XNM(Objects)    43.4
      MAC(ResNet)     40.7
      MAC(Objects)    45.5
      HCRN(Objects)   46.8
  45. Agenda: The problem; Dual-process theories; Visual reasoning; Neural memories; Social reasoning; The road ahead
  46. “[…] one cannot hope to understand reasoning without understanding the memory processes […]” (Thompson and Feeney, 2014) 24/06/2020 46
  47. Dual-process system with memories 47 System 1: Intuitive System 1: Intuitive System 1: Intuitive • Fast • Implicit/automatic • Pattern recognition • Multiple System 2: Analytical • Slow • Deliberate/rational • Careful analysis • Single, sequential Facts Semantics Events and relational associations Working space – temporal buffer Single
  48. From psychology & cognitive science. Reasoning and memory are studied separately, but: • Analytic reasoning requires working memory • Spatial reasoning requires place memory • Inferential reasoning requires episodic memory. Dual-processes in memory: • Fast retrieval/recognition (using similarity as criteria) • Slow item recollection. Dual-processes in reasoning: • Induction as fast (using similarity as criteria) • Deduction as slow
  49. Reasoning as memory Expert reasoning was enabled by a large long-term memory, acquired through experience Working memory for analytic reasoning  WM capacity predicts IQ  Higher order cognition = creating & manipulating relations  representation of premises, temporarily stored in WM  WM is a system to support information binding to a coordinate system  Reasoning as deliberative hypothesis testing  memory-retrieval based hypothesis generation Knowledge structures support reasoning Fuzzy-trace theory: gist ( intuition) and verbatim memory ( analytic processing) Memory is critical for episodic future thinking (mental simulation) 24/06/2020 49
  50. 24/06/2020 50 #REF: A Cognitive Architecture Approach to Interactive Task Learning, John E. Laird, University of Michigan Knowledge graphs Planning, RL NTM Attention, RL RLUnsupervised CNN Episodic memory
  51. 24/06/2020 51 Learning a Turing machine  Can we learn a (neural) program that learns to program from data? Visual reasoning is a specific program of two inputs (visual, linguistic)
  52. Neural Turing machine (NTM) A memory-augmented neural network (MANN) A controller that takes input/output and talks to an external memory module. Memory has read/write operations. The main issue is where to write, and how to update the memory state. All operations are differentiable. Source: rylanschaeffer.github.io
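A minimal sketch of the read and write operations the slide refers to, following the standard NTM formulation: content-based addressing by cosine similarity and softmax, a weighted read, and an erase-then-add write. The memory contents, the sharpness value `beta`, and the erase/add vectors are illustrative assumptions, and location-based addressing is left out.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def address(memory, key, beta=5.0):
    """Content-based addressing: softmax over scaled cosine similarities."""
    scores = [beta * cosine(row, key) for row in memory]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def read(memory, w):
    """Read vector: attention-weighted sum of memory rows."""
    cols = len(memory[0])
    return [sum(w[i] * memory[i][j] for i in range(len(memory))) for j in range(cols)]


def write(memory, w, erase, add):
    """Write: each row is partially erased, then receives the add vector."""
    return [
        [memory[i][j] * (1.0 - w[i] * erase[j]) + w[i] * add[j] for j in range(len(add))]
        for i in range(len(memory))
    ]


memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w = address(memory, key=[0.0, 1.0, 0.0])
r = read(memory, w)
memory2 = write(memory, w, erase=[1.0, 1.0, 1.0], add=[0.5, 0.5, 0.5])
print(w, r)
```

Every step is a smooth function of the weights `w`, which is what lets "where to write" be learned by gradient descent rather than chosen by a discrete rule.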
  53. MANN for reasoning Three steps:  Store data into memory  Read query, process sequentially, consult memory  Output answer Behind the scene:  Memory contains data & results of intermediate steps  LOGNet does the same, memory consists of object representations Drawbacks of current MANNs:  No memory of controllers  Less modularity and compositionality when query is complex  No memory of relations  Much harder to chain predicates. 24/06/2020 53 Source: rylanschaeffer.github.io
  54. 24/06/2020 54 Universal Neural Turing Machines Hung Le, Truyen Tran, Svetha Venkatesh, “Neural stored-program memory”, ICLR'20.
  55. Computing devices vs neural counterparts FSM (1943) ↔ RNNs (1982) PDA (1954) ↔ Stack RNN (1993) TM (1936) ↔ NTM (2014) The missing piece: A memory to store programs Neural stored-program memory UTM/VNA (1936/1945) ↔ NUTM--ours (2019)
  56. NUTM = NTM + NSM
  57. Question answering (bAbI dataset) Credit: hexahedria
  58. Neural Stored-Program Architecture (NSA) -- Work in progress Flexible programs are essential for intelligence Cognitive science: Meta-level process controls object-level process Instance-specific program vs Task-specific program In QA, program is query-specific  On-demand program synthesis Idea: A program to generate programs Previous works: Fast weights, HyperNetworks, Neural Module Networks, Adaptive Mixture of Experts Current work: Programs are generated by combining basis sub- programs  The hope: Basis programs span the space of all possible programs 24/06/2020 58
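The idea of generating a query-specific program by combining basis sub-programs can be sketched as a softmax-weighted mixture of stored weight matrices. This is an illustrative assumption about the mechanism, not the NSA implementation: the two basis matrices, their keys, and the dot-product scoring are all hypothetical.

```python
import math

# Two hypothetical basis sub-programs, each a 2x2 weight matrix.
basis_programs = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity: pass the input through
    [[0.0, 1.0], [1.0, 0.0]],  # swap: exchange the two input features
]
# One hypothetical key per basis program, matched against the query.
program_keys = [[1.0, 0.0], [0.0, 1.0]]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def synthesize(query):
    """Mix the basis programs with attention weights derived from the query."""
    scores = [sum(k * q for k, q in zip(key, query)) for key in program_keys]
    alpha = softmax(scores)
    program = [
        [sum(a * B[i][j] for a, B in zip(alpha, basis_programs)) for j in range(2)]
        for i in range(2)
    ]
    return alpha, program


def run(program, x):
    """Execute the synthesized program as a linear layer."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in program]


alpha, program = synthesize(query=[10.0, 0.0])
y_identity = run(program, [3.0, 7.0])
_, swap_program = synthesize(query=[0.0, 10.0])
y_swap = run(swap_program, [3.0, 7.0])
print(y_identity, y_swap)
```

The same input produces different outputs depending on the query, which is the sense in which the program is synthesized on demand rather than fixed per task.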
  59. NSA (cont.) 24/06/2020 59 The memory regularization
  60. 24/06/2020 60 Neural relational memory Hung Le, Truyen Tran, Svetha Venkatesh, “Self-attentive associative memory”, ICML'20.
  61. Failures of current MANNs for reasoning Relational representation is NOT stored  Can’t reuse later in the chain A single memory of items and relations  Can’t understand how relational reasoning occurs The memory-memory relationship is coarse since it is represented as either dot product, or weighted sum. 24/06/2020 61
  62. Self-attentive associative memories (SAM) Learning relations automatically over time 24/06/2020 62
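To make the contrast with dot products concrete, the sketch below compares a dot product, which collapses the relation between two items into one number, with an outer product, which keeps every pairwise feature interaction and can also serve as an associative store. This only illustrates the outer-product idea; it is not the SAM operator itself, and the vectors are made up.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def outer(u, v):
    return [[a * b for b in v] for a in u]


def matvec(M, x):
    return [dot(row, x) for row in M]


a = [1.0, 2.0]
b = [3.0, 4.0]

coarse = dot(a, b)  # a single scalar summarises the whole relation
rich = outer(a, b)  # a matrix keeps each feature-by-feature interaction
trace = rich[0][0] + rich[1][1]  # the dot product is just the matrix trace

# Associative storage: bind a value to a unit-length key with an outer product,
# then recall the value by multiplying the memory with the key.
key = [1.0, 0.0]
value = [5.0, 6.0]
memory = outer(value, key)
recalled = matvec(memory, key)
print(coarse, rich, recalled)
```

The relation matrix can itself be stored and reused later in a reasoning chain, which a scalar similarity cannot support.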
  63. Multi-step reasoning over graphs 24/06/2020 63
  64. Reasoning on text 24/06/2020 64
  65. Agenda: The problem; Dual-process theories; Visual reasoning; Neural memories; Social reasoning; The road ahead
  66. Contextualized recursive reasoning Thus far, QA tasks are straightforward and objective: Questioner: I will ask about what I don’t know. Answerer: I will answer what I know. Real life can be tricky, more subjective: Questioner: I will ask only questions I think they can answer. Answerer 1: This is what I think they want from an answer. Answerer 2: I will answer only what I think they think I can.  We need Theory of Mind to function socially. 24/06/2020 66
  67. 24/06/2020 67 Theory of Mind is the ability to attribute mental states -- beliefs, intents, desires, emotions, knowledge, etc. - to oneself and to others. Wikipedia, 09/04/2020 Source: religious studies project
  68. 24/06/2020 68 Theory of mind in guilt aversion MARL Dung Nguyen, Truyen Tran, Svetha Venkatesh, “Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning”, ICLR’20 Workshop on Bridging AI and Cognitive Science. Credit: Office of Public Affairs
  69. Social dilemma: Stag Hunt games. Difficult decision: individual outcomes (selfish) or group outcomes (cooperative). • Hunt Stag together (both cooperative): both have more meat. • Each hunts Hare alone (both selfish): both have less meat. • One hunts Stag (cooperative), the other hunts Hare (selfish): only the one who hunts Hare has meat. Human evidence: self-interested but considerate of others (cultures vary). Idea: belief-based guilt aversion. • One experiences loss if one lets the other down. • Necessitates Theory of Mind: reasoning about the other's mind.
  70. Theory of Mind Agent with Guilt Aversion (ToMAGA). Update Theory of Mind: • Predict whether the other's behaviour is cooperative or uncooperative • Update the zero-order belief (what the other will do) • Update the first-order belief (what the other thinks about me). Guilt Aversion: • Compute the expected material reward of the other based on Theory of Mind • Compute the psychological rewards, i.e. "feeling guilty" • Reward shaping: subtract the expected loss of the other.
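The dilemma can be made concrete with a payoff table and a check for pure-strategy Nash equilibria. The payoff numbers are hypothetical; they were chosen only to respect the ordering described on the slide (hunting Stag together pays most, a lone Stag hunter gets nothing, Hare always pays a little).

```python
STAG, HARE = "stag", "hare"
ACTIONS = [STAG, HARE]

# payoff[(action_1, action_2)] = (reward to player 1, reward to player 2)
# Hypothetical values, not taken from the talk.
payoff = {
    (STAG, STAG): (4, 4),  # both cooperative: both have more meat
    (STAG, HARE): (0, 1),  # only the Hare hunter has meat
    (HARE, STAG): (1, 0),
    (HARE, HARE): (1, 1),  # both selfish: both have less meat
}


def is_nash(a1, a2):
    """True if neither player gains by unilaterally switching action."""
    p1, p2 = payoff[(a1, a2)]
    best1 = all(payoff[(alt, a2)][0] <= p1 for alt in ACTIONS)
    best2 = all(payoff[(a1, alt)][1] <= p2 for alt in ACTIONS)
    return best1 and best2


equilibria = [(a1, a2) for a1 in ACTIONS for a2 in ACTIONS if is_nash(a1, a2)]
print(equilibria)
```

Both the cooperative and the selfish outcome are stable under these payoffs, so a purely self-interested learner can settle on either one; that is why beliefs about the other player matter.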
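The guilt-aversion step can be sketched as reward shaping on top of a Stag Hunt payoff table. This is a simplified reading of the slide, not the ToMAGA implementation: the payoffs, the guilt-sensitivity parameter, and the exact guilt formula (the shortfall between what the other expected to earn and what it actually earned) are assumptions.

```python
STAG, HARE = "stag", "hare"

# Hypothetical material payoffs, indexed by (my_action, other_action).
MY_PAYOFF = {(STAG, STAG): 4.0, (STAG, HARE): 0.0, (HARE, STAG): 1.0, (HARE, HARE): 1.0}
OTHER_PAYOFF = {(STAG, STAG): 4.0, (STAG, HARE): 1.0, (HARE, STAG): 0.0, (HARE, HARE): 1.0}


def expected_payoff_other(first_order_belief, other_action):
    """What the other expects to earn, given its belief that I will hunt Stag.

    first_order_belief is my estimate of what the other thinks about me:
    the probability it assigns to my cooperating.
    """
    p = first_order_belief
    return p * OTHER_PAYOFF[(STAG, other_action)] + (1.0 - p) * OTHER_PAYOFF[(HARE, other_action)]


def shaped_reward(my_action, other_action, first_order_belief, guilt_sensitivity=1.0):
    """Material reward minus guilt for letting the other down."""
    material = MY_PAYOFF[(my_action, other_action)]
    expected = expected_payoff_other(first_order_belief, other_action)
    actual = OTHER_PAYOFF[(my_action, other_action)]
    guilt = max(0.0, expected - actual)  # only a shortfall causes guilt
    return material - guilt_sensitivity * guilt


# The other hunts Stag and is 90% sure that I will too.
defect = shaped_reward(HARE, STAG, first_order_belief=0.9)
cooperate = shaped_reward(STAG, STAG, first_order_belief=0.9)
print(defect, cooperate)
```

Under these assumed numbers, defecting on a partner who expects cooperation becomes worse than cooperating, so the shaped reward pushes the learner toward the Stag outcome.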
  72. The benefit of ToM1 model in guilt aversion. ● The color shows the probability of following cooperative behaviour after 500 interactions. ● The x-axis (y-axis) represents the probability that the first (second) player will follow a cooperative strategy at the beginning. Panels: GA agents without ToM; ToMAGA.
  73. Deep RL ToMAGA in Grid-world Stag Hunt Games. Case 1: Agents are initialised near the Stag → learning to hunt Stag is easier. Case 2: Agents are initialised near the Hare → learning to catch Hare is easier.
  74. Agenda: The problem; Dual-process theories; Visual reasoning; Neural memories; Social reasoning; The road ahead
  75. Recall: Many facets of reasoning Analogical Relational Inductive Deductive Abductive Judgemental Causal Legal Scientific Moral Social Visual Lingual Medical Musical Problem solving Theorem proving One-shot learning Zero-shot learning Counterfactual
  76. Yet to be solved … Common-sense reasoning Reasoning as program synthesis with callable, reusable modules Systematicity, aka systematic generalization Knowledge-driven VQA, knowledge as semantic memory  Differentiable neural-symbolic systems for reasoning Visual dialog  Active question asking Higher-order thought (e.g., self-awareness and consciousness)  Need a better prior for reasoning 24/06/2020 76
  77. 24/06/2020 77 Basic circuits Better priors Systematicity Distal events Multi-source Graph reasoning Meta-reasoning Neural-symbolic Clinical reasoning Long-form video Free-energy principle Lifelong reasoning Curriculum planning Temporal reasoning Social reasoning Moral reasoning Program synthesis Memory Recursion Knowledge-driven Emotion Cognitive architecture AGI Visual Turing test
  78. It is easy to get lost looking at the current state of research… 24/06/2020 78 Vietnam News AAAI’20
  79. Rich Sutton’s Bitter Lesson 24/06/2020 79 “The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. ” “The two methods that seem to scale arbitrarily in this way are search and learning.” http://www.incompleteideas.net/IncIdeas/BitterLesson.html
  80. In search for new basic operators. • Neuron as feature detector → SENSOR, FILTER • Multiplicative gates → AND gate, Transistor, Resistor • Attention mechanism → SWITCH gate • Memory + forgetting → Capacitor + leakage • Skip-connection → Short circuit • Computational graph → Circuit • Compositionality → Modular design • .. • Reasoning → Programmable hardware (e.g., FPGA)?
  81. Goertzel, B., Iklé, M., & Wigmore, J. (2012). The architecture of human-like general intelligence. In Theoretical Foundations of Artificial General Intelligence (pp. 123-144). Atlantis Press, Paris.
  82. AGI 24/06/2020 82 From: Alexey Potapov, "Cross-Paradigm AGI: How Cognitive, Deep, Probabilistic and Universal AI can Contribute to Each Other“, AGI’17, Melbourne, Australia
  83. 24/06/2020 83 Thank you Truyen Tran @truyenoz truyentran.github.io truyen.tran@deakin.edu.au letdataspeak.blogspot.com goo.gl/3jJ1O0 linkedin.com/in/truyen-tran We’re recruiting for PhD. Full scholarships available. URL: truyentran.github.io/scholarship.html