Multiple representations talk, Middlesex University. February 23, 2018
Multiple representations and visual mental imagery in artificial cognitive systems
Reader in Cognitive Science
Department of Psychology
February 23, 2018
Outline of the talk
Visual mental imagery
Cognitive architectures & the Common Model of Cognition
Multiple representations in cognitive architectures
It’s been 26 years...
1989–1992 Philosophy & AI
Enfield (Ponders End)
Three AI staff
Lots of Prolog
Quite a bit of Pop-11
A bit of context and a caveat...
PhD, Uni. Birmingham
Postdoc, Uni. Nottingham
(Peter Cheng & Nigel
University of Huddersfield
Reasoning with external
This talk relates to ongoing
cognition’ project with Peter
Paper at recent AAAI
workshop ‘A Standard
Model of the Mind’ (Peebles
& Cheng, 2017)
Still a work in progress and
my thinking is not fully
Modern human cognition is multi-representational
External (task environment) representations:
Languages (natural and formal)
Menus and tool bars in computer applications
Specialised abstract notation systems in academic and
Internal mental representations
Abstract, ‘amodal’, descriptive propositional representations
Depictive representations grounded in perception
Preserve explicitly information about topological and geometric
relations among problem components.
Information format, operators, information indexing methods,
heuristics and goal structures can differ considerably with
alternative representations (Larkin & Simon, 1987).
Imagine a square with sides of one unit. At opposite corners of
the square add circles with radii of 2/3 of a unit centred on the
corners. Do the two circles:
1. overlap?
2. just touch?
3. not touch?
1. “According to Pythagoras’ theorem, the length of the diagonal
between the two opposite corners is the square root of 2”.
2. “That’s about 1.4 units so if we divide that by 2, the centre of
the square is about 0.7 units from each corner”.
3. “The radius of each circle is about 0.67 units, so neither
circle’s perimeter will reach the centre of the square”.
4. “Therefore the circles do not touch”.
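The propositional solution above reduces to a single comparison between the diagonal and the combined radii. A minimal sketch in Python follows; the function name `circles_relation` and the tolerance `tol` are illustrative assumptions, not part of the talk.

```python
import math

def circles_relation(side, radius, tol=1e-9):
    """Classify two equal circles centred on opposite corners of a square."""
    # Distance between the two centres is the diagonal (Pythagoras' theorem).
    centre_distance = math.hypot(side, side)
    # Two equal circles meet when twice the radius spans that distance.
    reach = 2 * radius
    if abs(reach - centre_distance) <= tol:
        return "just touch"
    return "overlap" if reach > centre_distance else "do not touch"

print(circles_relation(1.0, 2 / 3))  # do not touch
print(circles_relation(1.0, 1.0))    # overlap
```

With a unit square, the circles would just touch only at a radius of half the diagonal, about 0.707 units.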
1. “I’m imagining a square and I can
see that circles with unit radius will
definitely overlap; in fact they
intersect each other at the other
corners of the square”.
2. “Now I’m imagining circles with radii of 1/2
and I can see that they clearly
don’t meet. In fact the circles cross
the mid-point of each side of the
square and curve away from the
3. “Now I’m thinking of circles with
radii of 2/3. It’s hard to be certain
how big they should be, but they
seem to just touch each other”.
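The imagery-based solution can be loosely mimicked by drawing both circles on a grid and inspecting the result. The sketch below is only an illustration under stated assumptions: the grid size, the drawing window from -1 to 2 units, and the names `rasterise_disc` and `shared_cells` are all hypothetical, and no claim is made that human imagery works this way.

```python
def rasterise_disc(cx, cy, radius, size=300, lo=-1.0, hi=2.0):
    """Return a size x size boolean grid; True where a cell centre lies in the disc."""
    step = (hi - lo) / size
    grid = []
    for i in range(size):
        y = lo + (i + 0.5) * step          # y coordinate of this row's cell centres
        row = []
        for j in range(size):
            x = lo + (j + 0.5) * step      # x coordinate of this cell centre
            row.append((x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2)
        grid.append(row)
    return grid

def shared_cells(a, b):
    """Count the cells that are filled in both grids."""
    return sum(p and q for ra, rb in zip(a, b) for p, q in zip(ra, rb))

# Circles centred on opposite corners (0, 0) and (1, 1) of the unit square.
for radius in (1.0, 0.5, 2 / 3):
    first = rasterise_disc(0.0, 0.0, radius)
    second = rasterise_disc(1.0, 1.0, radius)
    print(radius, shared_cells(first, second))
```

A count of zero shows that the circles do not overlap, but it cannot by itself separate 'just touch' from 'not touch'. That depends on whether the grid is fine enough to show the gap between the circles, which is about 0.08 units for radii of 2/3.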
Visual mental imagery and visual working memory
Two solutions rely upon different mental representations
Declarative and mathematical
Visuo-spatial and exploiting imagery in the mind’s eye.
Visual Mental Imagery (VMI). “Representations that produce
the experience of seeing in the absence of visual input”
Pylyshyn: All thoughts, including VMI, are propositional.
Kosslyn: VMI is an internal, non-perceptual visual experience
caused either by recollecting or conceptualising something.
VMI are structurally analogous to visual representations, and
are caused, at least in part, by psychological processes
shared with the visual system.
VMI has a functional role in planning (e.g., simulating actions,
particularly when potential costs of error are high).
Processes involved in using visual mental imagery
Generation (from knowledge in LTM)
Inspection, scanning (attention)
Transformation and manipulation
Restructuring and reinterpretation
Composition (e.g., intersection, union, subtraction)
What form of internal representation allows these
computational processes to be carried out efficiently?
Symbolic/numerical or array-based?
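To make the array-based option concrete, an image can be treated as a grid of cells and the imagery processes as operators over that grid. The sketch below shows one transformation (translation) and the three composition operators under that assumption. All helper names are hypothetical and are not taken from any architecture discussed in this talk.

```python
def from_text(rows):
    """Build a boolean grid from strings, where '#' marks a filled cell."""
    return [[ch == "#" for ch in row] for row in rows]

def to_text(grid):
    """Render a boolean grid back into strings for inspection."""
    return ["".join("#" if cell else "." for cell in row) for row in grid]

def union(a, b):
    return [[p or q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def intersection(a, b):
    return [[p and q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def subtraction(a, b):
    return [[p and not q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def translate(grid, dx, dy):
    """Shift filled cells dx columns right and dy rows down, dropping any that leave the grid."""
    rows, cols = len(grid), len(grid[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dy, c + dx
            if grid[r][c] and 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = True
    return out

square = from_text(["##..", "##..", "....", "...."])
shifted = translate(square, 1, 1)
print(to_text(intersection(square, shifted)))  # ['....', '.#..', '....', '....']
print(to_text(union(square, shifted)))         # ['##..', '###.', '.##.', '....']
print(to_text(subtraction(square, shifted)))   # ['##..', '#...', '....', '....']
```

Each operator here is a single pass over the cells. The same operations on a symbolic description would need explicit geometric reasoning about edges and corners.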
More general questions
How can information from different senses, at different levels
of abstraction, be fluidly used in decision making?
What functional role does specialised spatial and visual
processing play in cognition?
How are spatial, visual and abstract symbolic representations
and processes integrated?
What forms of representation are required (necessary and
sufficient) to support human-level capabilities and
Do visual and spatial cognition (and visual imagery) demand
non-symbolic, depictive representational formats and
Originated in the 1950s but became an active research programme in the 1980s.
Cognitive science – differs from mainstream “narrow” AI and
traditional “divide and conquer” approach of experimental
Theories of the core, immutable structures and processes of
the human cognitive system.
Aim: general, human level intelligence modelling human
cognition and performance – broad applicability to wide range
Addresses fundamental question of how cognitive, perceptual,
and motor processes interact and integrate to produce
complex, real-world behaviour.
Not simply theoretical constructs but actual running software
systems, often with vision and motor control.
An emerging standard model
Several cognitive architectures in existence (20–30)
Two dominant: ACT-R and Soar (both approx. 30 years old).
Much consolidation and convergence over last decade.
‘Common Model of Cognition’ (Laird, Lebiere & Rosenbloom,
Symbolic approaches to spatial reasoning
Architectures come from the symbolic AI tradition
Often ad hoc mechanisms not intrinsic to the architecture
Most use only descriptive representations
CogSketch (Forbus, Usher, Lovett, Lockwood & Wetzel, 2011)
Diagram Representation System (DRS) (Chandrasekaran,
Kurup, Banerjee, Josephson & Winkler, 2004)
ACT-R (Peebles, 2013; Peebles & Cheng, 2003)
[Figure: Plant CO2 Uptake as a function of Plant Type and Treatment]
Attempts at array-based representations
Some architectures have explored non-symbolic, array-based
Computation with Multiple Representations (CaMeRa) model
(Tabachneck-Schijf, Leonardo & Simon, 1997)
Retinotopic Reasoning (R2) architecture
Aims to model the computational properties of mental imagery
(Kunda, McGreggor & Goel, 2013).
Based, in large part, on array-based (non-symbolic)
representations and operators.
Successfully applied to: (a) Raven’s Progressive Matrices
test, (b) Embedded Figures test, (c) Block Design test, and (d)
Paper Folding test.
Retinotopic Reasoning (R2) architecture
Similar to CaMeRa (Tabachneck-Schijf et al., 1997)
Similarities in these non-symbolic approaches
Employ representations consisting of two-dimensional arrays
Operators to manipulate array objects.
Forward and backward connections to higher-level (numerical,
symbolic) representations (CaMeRa, Soar/SVS)
Important in that they allow cognitive modelling of processes
akin to those used in visual mental imagery.
None explicitly address the issue of multiple representation
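The forward and backward connections between array and symbolic levels can be illustrated with a toy round trip: a symbolic description is depicted as an array, and a symbolic description is then recovered from the array. This is a minimal sketch with hypothetical names (`Rect`, `depict`, `describe`). It does not reproduce the mechanisms of CaMeRa or Soar/SVS.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    """Symbolic description of a filled rectangle, in grid cells."""
    col: int
    row: int
    width: int
    height: int

def depict(rect, rows, cols):
    """Symbolic to array: draw the rectangle into a rows x cols boolean grid."""
    return [
        [rect.row <= r < rect.row + rect.height
         and rect.col <= c < rect.col + rect.width
         for c in range(cols)]
        for r in range(rows)
    ]

def describe(grid):
    """Array to symbolic: return the bounding rectangle of the filled cells, or None."""
    filled = [(r, c) for r, row in enumerate(grid)
              for c, cell in enumerate(row) if cell]
    if not filled:
        return None
    top = min(r for r, _ in filled)
    bottom = max(r for r, _ in filled)
    left = min(c for _, c in filled)
    right = max(c for _, c in filled)
    return Rect(col=left, row=top, width=right - left + 1, height=bottom - top + 1)

original = Rect(col=2, row=1, width=3, height=2)
image = depict(original, rows=6, cols=8)
print(describe(image) == original)  # True
```

The round trip is lossless here only because the shape is an axis-aligned rectangle. For other shapes `describe` returns just a bounding box, which shows how much each direction of the connection can preserve.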
Processes involved in using multiple representations
Initial selection of representations
Coordination of simultaneous representations
Switching asynchronously between representations
Distribution of task information between representations,
across task sub-goals and time
Understanding computational costs of each representation
Potential for cognitive off-loading
User’s familiarity with each representation
Compatibility of different representations
Metacognitive knowledge and processes
Monitoring and control processes to handle the selection and
monitoring of, transitions between, and integration of different
Meta-level information about the characteristics of different
representational formats (e.g., level of precision afforded,
ease of computation, suitability for a given problem etc.).
Use–and be able to choose between–alternative
representations within the same modality (e.g., different types
ACT-R (Anderson, 2007) purely symbolic
Visual/Spatial information represented as numbers and
Insufficient to model visual mental imagery
Array module currently under development
Implications for cognitive models and artificial
1. Models must incorporate alternative problem representations.
2. Must incorporate some form of meta-cognitive monitoring and
control processes to handle the selection and monitoring of,
transitions between, and integration of different
3. Must be able to incorporate meta-level information about the
characteristics of different representational formats (e.g., level
of precision afforded, ease of computation, suitability for a
given problem etc.).
4. Must also be able to incorporate–and be able to choose
between–alternative representations within the same modality.
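As a rough illustration of points 2 and 3, a model could store meta-level information about each representational format and use it to select one for the task at hand. The sketch below assumes just two attributes, precision and cost. The class, the function, and all numeric values are hypothetical placeholders, not empirical estimates or part of any existing architecture.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RepresentationProfile:
    """Meta-level information about one representational format (hypothetical)."""
    name: str
    precision: float  # smallest difference the format can resolve, in task units
    cost: float       # relative computational cost of using the format

def select_representation(profiles, required_precision):
    """Return the cheapest format that is precise enough, or None if none is."""
    adequate = [p for p in profiles if p.precision <= required_precision]
    return min(adequate, key=lambda p: p.cost, default=None)

# Illustrative values only.
profiles = [
    RepresentationProfile("mental image", precision=0.1, cost=1.0),
    RepresentationProfile("propositional arithmetic", precision=0.001, cost=5.0),
]

# Coarse judgement: the cheaper imagery format suffices.
print(select_representation(profiles, required_precision=0.5).name)
# Fine judgement (e.g., a gap of about 0.08 units): imagery is too imprecise.
print(select_representation(profiles, required_precision=0.08).name)
```

Under these assumed values the selector switches from imagery to arithmetic once the required precision falls below what the image can resolve, as in the circles problem earlier in the talk.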
Anderson, J. R. (2007). How can the human mind occur in the
physical universe? New York, NY: Oxford University Press.
Chandrasekaran, B., Kurup, U., Banerjee, B., Josephson, J. R. &
Winkler, R. (2004). An architecture for problem solving with
diagrams [Lecture notes in artificial intelligence 2980]. In A.
Blackwell, K. Marriott & A. Shimojima (Eds.), Diagrammatic
representation and inference (pp. 235–256). Berlin:
Forbus, K., Usher, J., Lovett, A., Lockwood, K. & Wetzel, J. (2011).
CogSketch: Sketch understanding for cognitive science
research and for education. Topics in Cognitive Science,
Kunda, M., McGreggor, K. & Goel, A. K. (2013). A computational
model for solving problems from the Raven’s Progressive
Matrices intelligence test using iconic visual representations.
Cognitive Systems Research, 22, 47–66.
Laird, J. E., Lebiere, C. & Rosenbloom, P. S. (2017). A standard
model of the mind: Toward a common computational
framework across artificial intelligence, cognitive science,
neuroscience, and robotics. AI Magazine, 38(4).
Larkin, J. H. & Simon, H. A. (1987). Why a diagram is (sometimes)
worth ten thousand words. Cognitive Science, 11, 65–100.
Lathrop, S. D., Wintermute, S. & Laird, J. E. (2011). Exploring the
functional advantages of spatial and visual cognition from an
architectural perspective. Topics in Cognitive Science, 3(4),
Peebles, D. (2013). Strategy and pattern recognition in expert
comprehension of 2×2 interaction graphs. Cognitive
Systems Research, 24, 43–51.
Peebles, D. & Cheng, P. C.-H. (2003). Modeling the effect of task
and graphical representation on response latency in a graph
reading task. Human Factors, 45, 28–45.
Peebles, D. & Cheng, P. C.-H. (2017, September 11). Multiple
representations in cognitive architectures. In AAAI Fall
Symposium 2017: “A Standard Model of the Mind”.
FS-17-01–FS-17-05. American Association for the
Advancement of Artificial Intelligence. Washington, VA.
Tabachneck-Schijf, H. J. M., Leonardo, A. M. & Simon, H. A.
(1997). CaMeRa: A computational model of multiple
representations. Cognitive Science, 21, 305–350.
Wintermute, S. (2012). Imagery in cognitive architecture:
Representation and control at multiple levels of abstraction.
Cognitive Systems Research, 19, 1–29.