ABSTRACT
Many researchers assume it is obvious what vision (e.g. in humans) is for, i.e. what functions it has, leaving only the problem of explaining how those functions are fulfilled. So they postulate mechanisms and try to show how those mechanisms can produce the required effects, and in some cases also try to show that the postulated mechanisms exist in humans and other animals and perform the postulated functions. The main point of this presentation is that it is far from obvious what vision is for, and J.J. Gibson's main achievement was drawing attention to some of the functions that other researchers had ignored. I'll present some of the other work, show how Gibson extends and improves on it, and then point out how much more there is to the functions of vision and other forms of perception than even Gibson had noticed.
In particular, much vision research, unlike Gibson's, ignores vision's function in on-line control and in the perception of continuous processes; and nearly all of it, including Gibson's work, ignores meta-cognitive perception, the perception of possibilities and constraints on possibilities, and the associated role of vision in reasoning. Unless we understand that, we cannot understand how biological mechanisms arising from the requirements of being embodied in a rich, complex and changing 3-D environment underpin human mathematical capabilities, including the ability to reason about topology and Euclidean geometry.
Last updated: 1st March 2014, 10 June 2015 (additional links)
Music Information Retrieval (MIR) is a rapidly growing field, driven by the exploding popularity of streaming services such as Spotify, Pandora and SoundCloud. At that scale, detailed knowledge about your content is a necessity. In this lecture we outline the major tasks in MIR, such as genre and mood detection, artist classification, music similarity, and fingerprinting. Our main focus is on how Machine Learning and, as a recent trend, Deep Learning can be applied to the audio signal to tackle those problems.
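To make the "Machine Learning over the audio signal" idea concrete, here is a minimal sketch (my illustration, not from the lecture) of genre classification from MFCC features; the file names, labels and classifier choice are assumptions.

```python
# Minimal sketch: summarise each clip by the mean/std of its MFCCs and train a
# standard classifier. Paths and labels ("blues_01.wav", etc.) are placeholders.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def mfcc_features(path, sr=22050, n_mfcc=13):
    """Load an audio file and summarise it as mean/std of its MFCC frames."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical labelled training clips.
train = [("blues_01.wav", "blues"), ("metal_01.wav", "metal")]
X = np.array([mfcc_features(p) for p, _ in train])
y = [g for _, g in train]

clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(clf.predict([mfcc_features("unknown_clip.wav")]))
```

A deep-learning variant would replace the hand-crafted MFCC summary with a network trained directly on spectrograms, but the overall pipeline stays the same.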
Computer vision has been studied for more than 40 years. Because the topics in vision and related fields (e.g., machine learning, signal processing, cognitive science) are increasingly diverse and rapidly developing, coming up with new research ideas is often daunting for junior graduate students in the field. In this talk, I will present five methods for coming up with new research ideas. For each method, I will give several examples (i.e., existing works in the literature) to illustrate how it works in practice.
This is a common-sense talk and will not involve complicated mathematical equations or theories.
Note: The content of this talk is inspired by "Raskar Idea Hexagon" - Prof. Ramesh Raskar's talk on "How to come up with new Ideas".
To download the presentation slides with videos, please visit
http://jbhuang0604.blogspot.com/2010/05/how-to-come-up-with-new-research-ideas.html
For the video lecture (in Chinese), please visit
http://jbhuang0604.blogspot.com/2010/06/blog-post_14.html
Presentation for tutorial session 'Measuring scholarly impact: Methods and practice' at ISSI2015
Explains how to use linkpred: https://github.com/rafguns/linkpred
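As a companion to the tutorial, here is a generic link-prediction sketch (my illustration, deliberately not linkpred's own API): it scores non-adjacent node pairs in a co-authorship network by their number of common neighbours, one of the simplest predictors that tools such as linkpred also provide. The toy graph and node names are assumptions.

```python
# Common-neighbours link prediction on a toy co-authorship network.
import networkx as nx
from itertools import combinations

G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])

# Score every currently unconnected pair by its number of shared neighbours.
scores = {
    (u, v): len(list(nx.common_neighbors(G, u, v)))
    for u, v in combinations(G.nodes(), 2)
    if not G.has_edge(u, v)
}

# Highest-scoring pairs are the predicted future links.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, score)
```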
Introduction to Game Development and the Game Industry - Nataly Eliyahu
Talk about games and the game industry at She Codes meeting at the Weizmann Institute of Science.
Basic introduction to the game industry and what to learn to get into game programming.
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019 - Unity Technologies
Unity's High Definition Render Pipeline (HDRP) makes it possible for developers to unleash their application's full potential using a custom renderer. With this great power comes great responsibility: more than ever, you need to ensure that your application maintains optimal performance so your users can have the best experience possible. These slides look at using NVIDIA Nsight Graphics to profile and optimize your application to achieve peak performance. See how GPU Trace can help you visualize performance metrics to fully utilize your GPU, and how the brand new Shader Profile can help you optimize your new HDRP shaders.
This session also showed how to utilize advanced features like the Acceleration Structure Viewer to debug your real-time ray tracing (DXR) application.
Speaker: Aurelio Reis – NVIDIA
Watch the session on YouTube: https://youtu.be/l_LiE1vAFhM
Topics include:
What is world building in games?
Why is world building important for games?
Why is everyone a worldbuilder?
Principles of worldbuilding
Ways to improve world building
How to write world building, with some tips
Exercise on world building
Speaker: Taesung Park (Ph.D. student, UC Berkeley)
Date: June 2017
Taesung Park is a Ph.D. student in AI and computer vision at UC Berkeley, advised by Prof. Alexei Efros.
His research interests lie at the intersection of computer vision and computational photography, such as generating realistic images and enhancing photo quality. He received a B.S. in mathematics and an M.S. in computer science from Stanford University.
Overview:
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs.
However, for many tasks, paired training data will not be available.
We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples.
Our goal is to learn a mapping G: X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss.
Because this mapping is highly under-constrained, we couple it with an inverse mapping F: Y → X and introduce a cycle consistency loss to push F(G(X)) ≈ X (and vice versa).
Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc.
Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
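A minimal sketch (toy networks and random tensors, not the paper's architecture) of the two loss terms described above: a least-squares adversarial loss that makes G(X) look like samples from Y, and a cycle-consistency loss pushing F(G(x)) back towards x. The network shapes and batch data are assumptions; only the loss structure follows the abstract.

```python
import torch
import torch.nn as nn

def tiny_net():
    # Stand-in for the real generators/discriminators used in the paper.
    return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(8, 3, 3, padding=1))

G, F = tiny_net(), tiny_net()                         # G: X -> Y, F: Y -> X
D_Y = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1))    # discriminator on domain Y

mse, l1 = nn.MSELoss(), nn.L1Loss()
x = torch.randn(4, 3, 32, 32)                         # unpaired batch from domain X
y = torch.randn(4, 3, 32, 32)                         # unpaired batch from domain Y

fake_y = G(x)
# Adversarial term: generator tries to make D_Y score its output as "real" (1).
adv_loss = mse(D_Y(fake_y), torch.ones_like(D_Y(fake_y)))
# Cycle-consistency term: F(G(x)) should reconstruct x, and G(F(y)) reconstruct y.
cycle_loss = l1(F(fake_y), x) + l1(G(F(y)), y)

total_generator_loss = adv_loss + 10.0 * cycle_loss   # cycle weight of 10, as in the paper
total_generator_loss.backward()
```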
Game development is the art of creating games: it covers the design, development and release of a game, and may involve concept generation, design, building, testing and release. While creating a game, it is important to think about game mechanics, rewards, player engagement and level design.
There is rising demand for professionals in the field; game development jobs beat typical 9-to-5 work, and there are plenty of exciting roles available. You will not only create games but also be immersed in the world of gaming – all in a day's work.
Dive in and learn all about game development!
SIGGRAPH 2016 - The Devil is in the Details: idTech 666 - Tiago Sousa
A behind-the-scenes look at the latest renderer technology powering the critically acclaimed DOOM. The lecture covers how the technology was designed to balance visual quality against performance. Numerous topics will be covered, among them details of the lighting solution, techniques for decoupling costs frequency-wise, and GCN-specific approaches.
This talk presents a novel technique for rendering surfaces covered with fallen, deformable snow, as featured in Batman: Arkham Origins. Scalable from current-generation consoles to high-end PCs, as well as next-generation consoles, the technique allows for visually convincing and organically interactive deformable snow surfaces everywhere characters can stand, walk, fight or fall; it is extremely fast, has a low memory footprint, and can be used extensively in an open-world game. We explain how the technique is novel in its approach to acquiring arbitrary deformation, and present all the details required for implementation. We also share the results of our collaboration with NVIDIA, and how it allowed us to bring the technique to the next level on PC using DirectX 11 tessellation.
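A rough CPU-side sketch of the general idea (my illustration, not the Arkham Origins implementation): dynamic objects are stamped into a deformation height map that only ever lowers, and a tessellated snow mesh would then be displaced by that map on the GPU. Map size, footprint shape and depths are assumptions.

```python
# Accumulate deformation by stamping circular footprints into a snow height map,
# keeping the lowest height seen at every texel.
import numpy as np

SNOW_REST_HEIGHT = 1.0
snow_height = np.full((256, 256), SNOW_REST_HEIGHT, dtype=np.float32)

def stamp(height_map, center_xy, radius, depth):
    """Lower the snow inside a circular footprint, never raising it back."""
    ys, xs = np.ogrid[:height_map.shape[0], :height_map.shape[1]]
    dist2 = (xs - center_xy[0]) ** 2 + (ys - center_xy[1]) ** 2
    footprint = dist2 <= radius ** 2
    height_map[footprint] = np.minimum(height_map[footprint],
                                       SNOW_REST_HEIGHT - depth)

# One frame: stamp a couple of foot positions; a renderer would displace the
# snow mesh by this map (e.g. via DirectX 11 tessellation, as in the talk).
for pos in [(100, 120), (108, 126)]:
    stamp(snow_height, pos, radius=4, depth=0.3)
print(snow_height.min(), snow_height.max())
```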
Optimizing the Graphics Pipeline with Compute, GDC 2016 - Graham Wihlidal
As the current console cycle advances, new tricks are being learned to squeeze maximum performance out of the hardware. This talk presents how the compute power of console and PC GPUs can be used to push triangle throughput beyond the limits of the fixed-function hardware. The method shows a way to perform efficient "just-in-time" optimization of geometry, and opens the way for per-primitive filtering kernels and procedural geometry processing.
Takeaway:
Attendees will learn how to preprocess geometry on-the-fly per frame to improve rendering performance and efficiency.
Intended Audience:
This presentation targets seasoned graphics developers. Experience with DirectX 12 and GCN is recommended, but not required.
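To illustrate the kind of per-triangle filtering the talk describes, here is a toy CPU sketch (my illustration; the real technique runs such tests in a GPU compute shader and compacts the surviving indices before rasterisation). The vertex positions and index buffer are assumptions.

```python
# Back-face culling as a per-triangle filter, evaluated in screen space.
import numpy as np

def signed_area_2d(p0, p1, p2):
    """Twice the signed area of a screen-space triangle; <= 0 means back-facing
    when front faces are wound counter-clockwise."""
    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])

# Hypothetical screen-space positions and an index buffer of two triangles.
positions = np.array([[10, 10], [100, 10], [10, 100], [100, 100]], dtype=np.float32)
indices = np.array([[0, 1, 2], [1, 0, 3]])

visible = [tri for tri in indices
           if signed_area_2d(*positions[tri]) > 0.0]
print("surviving triangles:", visible)   # back-facing/degenerate triangles are dropped
```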
If learning maths requires a teacher, where did the first teachers come from?
or
Why (and how) did biological evolution produce mathematicians?
Presentation at Symposium on Mathematical Cognition AISB2010
Part of the Meta-Morphogenesis Project. See also this discussion of toddler theorems:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/toddler-theorems.html
Evolution of human mathematics from earlier abilities to perceive, use and reason about affordances, spatial possibilities and constraints.
The necessity of mathematical truth does not imply the infallibility of mathematical reasoning (Lakatos).
Toddlers discover theorems without knowing it. Later they may learn to reflect on and talk about what they have learnt. Compare Annette Karmiloff-Smith on "Representational re-description".
Why is it still so hard to give robots and AI systems the ability to reason spatially as mathematicians do (except in simple special cases, e.g. where space is discretised)?
Reorganised several times since first uploaded: most recently 25 Jan 2016
-------------------------------------------------------------------------------------------------------
Slides include link to video of lecture (158MB) http://www.cs.bham.ac.uk/research/projects/cogaff/movies/#ailect2-2015
-------------------------------------------------------------------------------------------------------------
Two questions are shown to have deep connections: What are the functions of vision in animals? and How did human languages evolve? The answer given here is that the functions of vision need to be supported by richly structured internal languages (forms of representation used for acquiring, storing, manipulating, deriving and using information), from which it follows that internal languages must have evolved before languages for communication.
---------------------------------------------------------------------------------------------------------------
The account of the functions of vision mentions early AI vision, the impact of Marr and the even greater impact of Gibson, but argues that they did not recognize all the functions of vision, e.g. the uses of vision in making the mathematical discoveries leading to Euclid's Elements.
---------------------------------------------------------------------------------------------------------------
Many questions are left unanswered by this research, which is part of the Meta-Morphogenesis project, introduced here:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-morphogenesis.html
---------------------------------------------------------------------------------------------------------------
A slideshare presentation on "origins of language" by Jasmine Wong adds some useful additional evidence, but presents a simpler theory:
http://www.slideshare.net/JasmineWong6/origins-of-language
---------------------------------------------------------------------------------------------------------------
Minor corrections + additions: 30-Mar-2015, 1-Apr-2015, 15-Apr-2015, 12-Nov-2015
Cognitive Models - Part 3 of Piero Scaruffi's class "Thinking about Thought" at UC Berkeley - Piero Scaruffi
Cognitive Models - Part 3 of Piero Scaruffi's class "Thinking about Thought" at UC Berkeley (2014), excerpted from a chapter of http://www.scaruffi.com/nature I keep updating these slides at www.scaruffi.com/ucb.html
Slides prepared for a broadcast presentation to members of Computing at School http://www.computingatschool.org.uk/, about why computing education should be about more than the science and technology required for useful or entertaining applications. Instead, learning about forms of information processing systems can give us new, deeper ways of thinking about many old phenomena, e.g. the nature of mind and the evolution of minds of various kinds. This supports the claim that the study of computation is as much a science as physics or psychology, rather than just a branch of engineering -- as famously suggested by Fred Brooks.
Virtuality, causation and the mind-body relationship - Aaron Sloman
Extends my previous introductions to virtual machines and their role both in artefacts and products of biological evolution. This attempts to correct various erroneous assumptions about computation, functionalism, supervenience, life, information, and causation. See also http://www.cs.bham.ac.uk/research/projects/cogaff/misc/vm-functionalism.html
How to Build a Research Roadmap (avoiding tempting dead-ends) - Aaron Sloman
What's a Research Roadmap For?
Why do we need one?
How can we avoid the usual trap of making bold promises to do X, Y and Z, and then hoping that our previous promises will not be remembered the next time we apply for funds to do X, Y and Z?
How can we produce a sensible, well informed roadmap?
Originally presented at the euCognition Research Roadmap discussion in Munich on 12 Jan 2007
This suggests a way to avoid tempting dead ends (repeating old promises that proved unrealistic): examine many long-term goals, including descriptions of existing human and animal competences not yet achieved by robots, then work backwards systematically by investigating requirements for those competences, requirements for meeting those requirements, and so on. Instead of generating a single linear roadmap, this should produce a partially ordered network of intermediate targets, leading back to short-term goals that may be achievable starting from where we are.
Such a roadmap will inevitably have mistakes: over-optimistic goals, missing preconditions, unrecognised opportunities. But if the work is done in many teams in a fully open manner with as much collaboration as possible, it should be possible to make faster, deeper, progress than can be achieved by brain-storming discussions of where we can get in a few years.
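A toy sketch (my illustration, not from the talk) of the roadmap structure described above: working backwards from long-term goals through their prerequisites yields a partially ordered network rather than a single line. The goal names are invented for illustration.

```python
# Build a dependency DAG of roadmap targets and read off feasible orderings.
import networkx as nx

roadmap = nx.DiGraph()   # edge A -> B means "A is required before B"
requirements = {
    "human-like 3-D manipulation": ["grasp affordance perception", "dexterous control"],
    "grasp affordance perception": ["3-D scene interpretation"],
    "dexterous control": ["compliant actuation", "3-D scene interpretation"],
}
for goal, prereqs in requirements.items():
    for p in prereqs:
        roadmap.add_edge(p, goal)

# Any topological order is one feasible sequencing of intermediate targets;
# the graph itself records the partial order (the roadmap network).
print(list(nx.topological_sort(roadmap)))
```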
Enhancing the Design Pattern Framework of Robots' Object Selection Mechanism -... - INFOGAIN PUBLICATION
In order to enable a computer to construct and display a three-dimensional array of solid objects from a single two-dimensional photograph, the rules and assumptions of depth perception have been carefully analyzed and mechanized. It is assumed that a photograph is a perspective projection of a set of objects which can be constructed from transformations of known three-dimensional models, and that the objects are supported by other visible objects or by a ground plane. These assumptions enable a computer to obtain a reasonable three-dimensional description from the edge information in a photograph by means of a topological, mathematical process. A computer program has been written which can process a photograph into a line drawing, transform the line drawing into a three-dimensional representation and, finally, display the three-dimensional structure with all hidden lines removed, from any point of view. The 2-D to 3-D construction and 3-D to 2-D display processes are sufficiently general to handle most collections of planar-surfaced objects and provide a valuable starting point for future investigation of computer-aided three-dimensional systems.
PHOTOSYNTH: A 3D Photo Experience! by swami_worldtraveler. This was presented to the Central Florida Computer Society on July 18, 2010. This current version includes additional web links, plus explanatory slide notes. The latter was done since the original slides simply provide an outline, key points, visuals, and things to elaborate on. Feel free to read these, or skip over them as you desire.
A multi-picture challenge for theories of vision - Aaron Sloman
(Modified 7th June 2013 to include some droodles.)
Some informal experiments are presented whose results help to challenge most theories of vision and proposed mechanisms of vision.
A possible explanatory information-processing architecture is proposed, based on multiple dynamical systems, grown during an individual's lifetime, most of which are dormant most of the time, but which can be very rapidly activated and instantiated so as to build a multi-ontology interpretation of the currently, and recently, available visual information -- e.g. on turning a corner into a busy street in an unfamiliar city. As far as I know, there is no working implementation of such a system, though a very early prototype called Popeye (implemented in Pop2) around 1976 is summarised. Many hard unsolved problems remain, though most of them are ignored by research on vision that makes narrow assumptions about the functions of biological vision.
Is Abstraction the Key to Artificial Intelligence? - Lorenza Saitta - WithTheBest
With this comprehensive breakdown of abstraction's multiple layers and components, we can understand and answer the question of whether abstraction is essential to artificial intelligence.
Lorenza Saitta, Università del Piemonte Orientale
MCN 2013 (ffrench) From Documentation to Discovery: Preservation Photographic... - John ffrench
During this panel presentation, information was shared on a collaborative project between the Yale University Art Gallery and the Yale Computer Science department. Staff established that significant imaging data, potentially crucial to the work of restoring damaged paintings, could be improved by leveraging the combined strengths of multiple modalities. We therefore aimed to undertake a collaborative exploratory project, with the assistance of post-doc students in the Computing and the Arts department of Computer Science at Yale, to design new software that would allow these modalities to be used together.
For further information contact the presenter.
11. Define a simple deformable model to detect a half-circular shape.pdf - feetshoemart
11. Define a simple deformable model to detect a half-circular shape (may be rotated). What will be the energy function?
Solution
Shape is a recurring theme in computer vision. For example, shape is one of the main sources of information that can be used for object recognition. In medical image analysis, geometrical models of anatomical structures play an important role in automatic tissue segmentation. The shape of an organ can also be used to diagnose diseases. In a completely different setting, shape plays an important role in the perception of optical illusions (we tend to see particular shapes), and this can be used to explain how our visual system interprets the ambiguous and incomplete information available in an image. Our main goal is to develop techniques that can be used to represent and detect relatively generic objects in images. The techniques presented here revolve around a particular shape representation, based on the description of objects using triangulated polygons. Triangulated polygons allow us to describe complex shapes using simple building blocks. As we show in the next section, the triangles that decompose a polygon without holes are connected together in a tree structure, and this has important algorithmic consequences. By picking a particular triangulation for the polygons we obtain decompositions of objects into meaningful parts. This yields a discrete representation closely related to Blum's medial axis transform [6].
In this paper we concentrate on the task of finding the location of a deformable shape in an image. This problem is important for the recognition of non-rigid objects. Moreover, objects in many generic classes can be described as deformed versions of an ideal template. In this setting, the location of an object is given by a continuous map from a template to an image. Figure 1 illustrates how we use a deformable template to detect a particular anatomical structure in an MR image. We will show how triangulated polygons provide rich models for deformable shapes. These models can capture both boundary and interior information of an object and can be deformed in an intuitive way. Equally important, we present an efficient algorithm for finding the optimal location of a deformable shape in an image. In contrast, previous methods that take into account the interior of deformable objects...
The geometric properties of rigid objects are well understood. We know how three-dimensional features such as corners or edges project into images, and there are a number of methods that can be used to represent rigid shapes and locate their projections. Some techniques, such as the alignment method [23], use explicit three-dimensional representations. Other techniques, such as linear combination of views [36], capture the appearance of three-dimensional shapes using a small number of two-dimensional pictures. These and similar techniques assume that all shape variation comes from the viewpoint dependency of two-dimensional...
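The excerpt above sets up the background but never states an energy function. One simple formulation (a sketch of mine, one of many possibilities, not the textbook's official answer) places n control points on a rotatable semicircular template and balances image evidence against deformation; the weights α, β, the gradient-based data term and the control-point parametrisation are all my assumptions.

```latex
% Ideal (undeformed) half-circle of centre c, radius r, orientation \theta:
%   \hat{v}_i = c + r\,(\cos(\theta + \phi_i),\, \sin(\theta + \phi_i)),
%   \phi_i = \pi (i-1)/(n-1) \in [0, \pi].
% Energy to minimise over (c, r, \theta, v_1, \dots, v_n):
E(c, r, \theta, v_1, \dots, v_n) =
    -\,\alpha \sum_{i=1}^{n} \lVert \nabla I(v_i) \rVert        % data term: control points should lie on strong image edges
    \;+\; \beta \sum_{i=1}^{n} \lVert v_i - \hat{v}_i \rVert^2  % deformation penalty: stay near the ideal rotated half-circle
```

Minimising E over the pose parameters and control points finds the best-fitting, possibly rotated half-circular contour in the image.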
Visual Perception, Quantity of Information Function and the Concept of the Qu... - Rushan Ziatdinov
The geometric shapes of objects in the outside world hide an undisclosed emotional, psychological, artistic, aesthetic and shape-generating potential; they may attract or cause fear, as well as a variety of other emotions. This suggests that living beings with vision perceive geometric objects within an information-handling process. However, not many studies have been performed toward a better and deeper understanding of visual perception from the viewpoint of information theory and mathematical modelling, although the evidence first found by Attneave (1954) suggests that the concepts and techniques of information theory may shed light on visual perception. The quantity of information function can theoretically explain the concentration of information on visual contours and, based on this, we first propose the concept of quantity-of-information continuous splines for visualizing shapes from a given set of discrete data without adding any in-between curvature extrema. Additionally, we first identify a planar curve with a constant quantity of information function and demonstrate one of the conditions under which a monotonic-curvature curve has a constant quantity of information function.
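For reference, the curvature whose extrema Attneave associated with information-rich points on a contour is given, for a planar parametric curve (x(t), y(t)), by the standard textbook formula below (a known result, not taken from the paper above):

```latex
\kappa(t) \;=\; \frac{x'(t)\,y''(t) - y'(t)\,x''(t)}
                     {\bigl(x'(t)^2 + y'(t)^2\bigr)^{3/2}}
```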
Visualization designers usually start with good common-sense ideas about human perception, attention and cognition. Often these are now formalized into predictive theories and even computational models. As technology advances speed up, it is easy to assume that we know all about how end users work, forgetting everyday variations in age, expertise level, real-world knowledge and task goals. This talk will consider the adaptability of human visual processing to such factors, and some practical implications for work with geospatial visualizations.
ROBUST STATISTICAL APPROACH FOR EXTRACTION OF MOVING HUMAN SILHOUETTES FROM V... - ijitjournal
Human pose estimation is one of the key problems in computer vision that has been studied in recent years. The significance of human pose estimation lies in higher-level tasks of understanding human actions, such as recognition of anomalous actions in videos and many other related applications. Human poses can be estimated by extracting silhouettes of humans, since silhouettes are robust to variations and give the shape information of the human body. Some common challenges include illumination changes, variation in environments, and variation in human appearance; thus there is a need for a robust method for human pose estimation. This paper presents a study and analysis of existing approaches for silhouette extraction and proposes a robust technique for extracting human silhouettes in video sequences. A statistical approach, the Gaussian Mixture Model (GMM), is combined with the HSV (Hue, Saturation and Value) colour space to build a robust background model that is used for background subtraction to produce foreground blobs, called human silhouettes. Morphological operations are then performed on the foreground blobs from background subtraction. The silhouettes obtained from this work can be used in further tasks associated with human action interpretation and activity processing, such as human action classification, human pose estimation and action recognition or interpretation.
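A compact sketch of that pipeline using OpenCV's built-in GMM background subtractor (a stand-in for the paper's own model, not its exact method); the video file name, thresholds and kernel size are assumptions.

```python
# GMM background subtraction on HSV frames followed by morphological clean-up.
import cv2

cap = cv2.VideoCapture("walking.mp4")          # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)                # work in HSV colour space
    mask = subtractor.apply(hsv)                                # GMM foreground mask
    mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)[1]  # drop shadow pixels
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)       # remove speckle noise
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)      # fill small holes
    # "mask" now contains the foreground blobs (candidate human silhouettes).
cap.release()
```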
Similar to "What's vision for, and how does it work? From Marr (and earlier) to Gibson and Beyond"
Construction kits for evolving life -- Including evolving minds and mathemati... - Aaron Sloman
Darwin's theory of evolution by natural selection does not adequately explain the generative power of biological evolution. For that we need to understand the mechanisms involved in producing new options for natural selection, without which there would always be the same set of possibilities available. This applies also to the construction kits: evolution can produce new construction kits, "Derived" construction kits, based on the Fundamental construction kit provided by the physical universe and its originally lifeless physical and chemical mechanisms. It turns out that life needs both concrete and abstract construction kits, of ever increasing complexity. This paper introduces some basic ideas, though far more empirical and theoretical research is required, combining multiple disciplines. Slideshare no longer allows presentations to be updated, so I no longer use it. For a later version search for: Sloman "Construction kits for evolving life" cogaff. Most of my slideshare presentations have newer versions in the CogAff web site at the University of Birmingham, UK. (Not Alabama)
The Turing Inspired Meta-Morphogenesis Project -- The self-informing universe... - Aaron Sloman
This replaces an earlier version. The latest version, with clickable links, is available at http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-morphogenesis.html
Meta-Morphogenesis, Evolution, Cognitive Robotics and Developmental Cognitive... - Aaron Sloman
How could a planet, condensed from a cloud of dust, produce minds -- and products of minds, along with microbes, mice, monkeys, mathematics, music, marmite, murder, megalomania, and all other forms and products of life on earth (and possibly elsewhere)?
This presentation introduces the ambitious, multi-disciplinary Meta-Morphogenesis project, partly inspired by Turing's 1952 paper on morphogenesis. It may lead to an answer, by identifying the many transitions between different types and mechanisms of biological information processing, including transitions that changed the mechanisms of change, altering forms of evolution, development, learning, culture and ecosystem dynamics. One of the questions raised is whether chemical information-processing is capable of supporting processes that would be infeasible or impossible on a Turing machine or conventional computer.
A 2-hour 30-minute recording of this tutorial was made by Adam Ford, available here: http://www.youtube.com/watch?v=BNul52kFI74 (new version installed on 14 Jun 2013 with titles and the audio problem fixed). Also available here:
http://www.cs.bham.ac.uk/research/projects/cogaff/movies/#m-m-tut
"Information" here is used in Jane Austen's sense, not Claude Shannon's sense. See http://www.cs.bham.ac.uk/research/projects/cogaff/misc/austen-info.html
More information about the project is available here: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-morphogenesis.html
Adam Ford interviewed the author about some of these topics at the AGI conference in December 2012 in this video: http://www.youtube.com/watch?v=iuH8dC7Snno
Related PDF presentations can be found here http://www.cs.bham.ac.uk/research/projects/cogaff/talks
What is computational thinking? Who needs it? Why? How can it be learnt? ... - Aaron Sloman
What is computational thinking?
Who needs it? Why? How can it be learnt?
Can it be taught? How?
Slides for an invited presentation at the Conference of ALT (Association for Learning Technology), 11th Sept 2012, University of Manchester.
PDF available (easier for printing, selecting text, etc.):
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#talk105
A video of the actual presentation (using no slides because of a projector problem) is now available here
http://www.youtube.com/watch?v=QXAFz3L2Qpo
It has also been made available as "slide 47" after the PDF presentation on this page.
I attempt to generalise Jeannette Wing's notion of "computational thinking" (ACM, 2006) to include attempts to understand much biological information processing, and try to show the necessity for educators to do deep computational thinking if they wish to facilitate processes of learning.
Helping Darwin: How to think about evolution of consciousness (Biosciences ta... - Aaron Sloman
ABSTRACT
Many of Darwin's opponents, and some of those who accepted the theory of evolution as regards physical forms, objected to the claim that human mental functions, and consciousness in particular, could be products of evolution. There were several reasons for this opposition, including unanswered questions as to how physical mechanisms could produce mental states and processes: an old, and still surviving, philosophical problem.
A new answer is now available. Evolution could have produced the "mysterious" aspects of consciousness if, like engineers developing computing systems over the last six or seven decades, it encountered and "solved" increasingly complex problems of representation and control (including self-monitoring and self-control) by using systems with increasingly abstract mechanisms based on virtual machines, including, most recently, self-monitoring virtual machines.
These capabilities are, like many capabilities of computer-based systems, implemented in non-physical virtual machinery which is, in turn, implemented in lower-level physical mechanisms.
This would require far more complex virtual machines than human engineers have so far created. No one knows whether the biological virtual machines could have been implemented in the discrete-switch technology used in current computers.
These ideas were not available to Darwin and his contemporaries: most of the concepts, and the technology, involved in creation and use of sophisticated virtual machines were developed only in the last half century, as a by-product of a large number of design decisions by hardware and software engineers solving different problems.
Ontologies for baby animals and robots: From "baby stuff" to the world of adul... - Aaron Sloman
In contrast with ontology developers concerned with a symbolic or digital environment (e.g. the internet), I draw attention to some features of our 3-D spatio-temporal environment that challenge young humans and other intelligent animals and will also challenge future robots. Evolution provides most animals with an ontology that suffices for life, whereas some animals, including humans, also have mechanisms for substantive ontology extension based on results of interacting with the environment. Future human-like robots will also need this. Since pre-verbal human children and many intelligent non-human animals, including hunting mammals, nest-building birds and primates can interact, often creatively, with complex structures and processes in a 3-D environment, that suggests (a) that they use ontologies that include kinds of material (stuff), kinds of structure, kinds of relationship, kinds of process (some of which are process-fragments composed of bits of stuff changing their properties, structures or relationships), and kinds of causal interaction and (b) since they don't use a human communicative language they must use information encoded in some form that existed prior to human communicative languages both in our evolutionary history and in individual development. Since evolution could not have anticipated the ontologies required for all human cultures, including advanced scientific cultures, individuals must have ways of achieving substantive ontology extension. The research reported here aims mainly to develop requirements for explanatory designs. The attempt to develop forms of representation, mechanisms and architectures that meet those requirements will be a long term research project.
Possibilities between form and function (Or between shape and affordances) - Aaron Sloman
I discuss the need for an intelligent system, whether it is a robot, or some sort of digital companion equipped with a vision system, to include in its ontology a range of concepts that appear not to have been noticed by most researchers in robotics, vision, and human psychology. These are concepts that lie between (a) concepts of "form", concerned with spatially located objects, object parts, features, and relationships and (b) concepts of affordances and functions, concerned with how things in the environment make possible or constrain actions that are possible for a perceiver and which can support or hinder the goals of the perceiver.
Those intermediate concepts are concerned with processes that *are* occurring and processes that *can* occur, and the causal relationships between physical structures/forms/configurations and the possibilities for and constraints on such processes, independently of whether they are processes involving anyone's actions or goals.
These intermediate concepts relate motions and constraints on motion to both geometric and topological structures in the environment and the kinds of 'stuff' of which things are composed, since, for example, rigid, flexible, and fluid stuffs support and constrain different sorts of motions.
They underlie affordance concepts. Attempts to study affordances without taking account of the intermediate concepts are bound to prove shallow and inadequate.
Notes for invited talk at Dagstuhl Seminar: ``From Form to Function'' Oct 18-23, 2009 http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=09431
Virtual Machines and the Metaphysics of Science - Aaron Sloman
Philosophers regularly use complex (running) virtual machines (not virtual realities) composed of enduring interacting non-physical subsystems (e.g. operating systems, word-processors, email systems, web browsers, and many more). These VMs can be subdivided into different kinds with different types of functions, e.g. "specific-function VMs" and "platform VMs" (including language VMs, and operating system VMs) that provide support for a variety of different (possibly concurrent) "higher level" VMs, with different functions.
Yet, almost all ignore (or misdescribe) these VMs when discussing functionalism, supervenience, multiple realisation, reductionism, emergence, and causation.
Such VMs depend on many hardware and software designs that interact in very complex ways to maintain a network of causal relationships between physical and virtual entities and processes.
I'll try to explain this, and show how VMs are important for philosophy, in part because evolution long ago developed far more sophisticated systems of virtual machinery (e.g. running on brains and their surroundings) than human engineers have so far. Most are still not understood.
This partly accounts for the apparent intractability of several philosophical problems.
E.g. running VM subsystems can be disconnected from input-output interactions for extended periods, and some can have more complexity than the available input/output bandwidth can reveal.
Moreover, despite the advantages of VMs for self-monitoring and self control, they can also lead to self-deception.
For a lot of related material see Steve Burbeck's web site http://evolutionofcomputing.org/Multicellular/Emergence.html
(A related presentation debunking the "hard problem" of consciousness is also in this collection.)
Why the "hard" problem of consciousness is easy and the "easy" problem hard....Aaron Sloman
The "hard" problem of concsiousness can be shown to be a non-problem because it is formulated using a seriously defective concept (the concept of "phenomenal consciousness" defined so as to rule out cognitive functionality and causal powers).
So the hard problem is an example of a well known type of philosophical problem that needs to be dissolved (fairly easily) rather than solved. For other examples, and a brief introduction to conceptual analysis, see http://www.cs.bham.ac.uk/research/projects/cogaff/misc/varieties-of-atheism.html
In contrast, the so-called "easy" problem requires detailed analysis of very complex and subtle features of perceptual processes, introspective processes and other mental processes, sometimes labelled "access consciousness": these have cognitive functions, but their complexity (especially the way details change as the environment changes or the perceiver moves) is considerable and very hard to characterise.
"Access consciousness" is complex also because it takes many different forms, since what individuals are conscious of and what uses being conscious of things can be put to, can vary hugely, from simple life forms, through many other animals and human infants, to sophisticated adult humans,
Finding ways of modelling these aspects of consciousness, and explaining how they arise out of physical mechanisms, requires major advances in the science of information processing systems -- including computer science and neuroscience.
There are empirical facts about introspection that have generated theories of consciousness but some of the empirical facts go unnoticed by philosophers.
The notion of a virtual machine is introduced briefly and illustrated using Conway's "Game of life" and other examples of virtual machinery that explain how contents of consciousness can have causal powers and can have intentionality (be able to refer to other things).
The beginnings of a research program are presented, showing how more examples can be collected and how notions of virtual machinery may need to be developed to cope with all the phenomena.
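A minimal sketch of the Game of Life example mentioned above (my illustration, not the author's code): gliders and other patterns are entities of a virtual machine whose entire "physics" is one update rule, which nowhere mentions gliders.

```python
# One synchronous Game of Life step on a small toroidal grid.
import numpy as np

def life_step(grid):
    """Apply Conway's rules: a cell lives if it has 3 neighbours, or 2 and is alive."""
    neighbours = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if (dy, dx) != (0, 0))
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(int)

# A glider: a persistent higher-level "object" not mentioned anywhere in the rule.
grid = np.zeros((8, 8), dtype=int)
for y, x in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:
    grid[y, x] = 1
for _ in range(4):        # after 4 steps the glider has moved one cell diagonally
    grid = life_step(grid)
print(grid)
```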
Why symbol-grounding is both impossible and unnecessary, and why theory-tethe... - Aaron Sloman
Introduction to key ideas of semantic models, implicit definitions and symbol tethering, using ideas from philosophy of science and model-theoretic semantics to explain why symbol grounding theory is misguided: there is no need for all symbols used by an intelligent agent to be 'grounded' in terms of experience or sensory-motor patterns. Rather, most of the meaning of a symbol may come from its role in a powerful explanatory theory, though the theory should have some connection with experiments and observations in order to be applicable to the world. That is not the same as requiring every symbol to be linked to experiences, experiments or measurements.
Symbol grounding theory is a modern version of the philosophical theory of 'concept empiricism', which was refuted by the philosopher Immanuel Kant in the 18th century.
Do Intelligent Machines, Natural or Artificial, Really Need Emotions? - Aaron Sloman
(Updated on 14 Jan 2014 -- with substantial revisions.)
Many people believe that emotions are required for intelligence. I argue that this is mostly based on (a) wishful thinking and (b) a failure adequately to analyse the variety of types of affective states and processes that can arise in different sorts of architectures produced by biological evolution or required for artificial systems. This work is a development of ideas presented by Herbert Simon in the 1960s in his 'Motivational and emotional controls of cognition'.
What is science? (Can There Be a Science of Mind?) (Updated August 2010) - Aaron Sloman
This presentation gives an introduction to philosophy of science, though a rather idiosyncratic one, stressing science as the search for powerful new ontologies rather than merely laws. You can't express a law unless you have an ontology including the items referred to in the law (e.g. pressure, volume, temperature). The talk raises a number of questions about the aims and methods of science, about the differences between the physical sciences and the science of information-processing systems (e.g. organisms, minds, computers), whether there is a unique truth or final answers to be found by science, whether scientists ever prove anything (no -- at most they show that some theory is better than any currently available rival theory), and why science does not require faith (though obstinacy can be useful). The slides end with a section on whether a science of mind is possible, answering yes, and explaining how.
Distinguishes Humean (statistics-based) notions of causation and Kantian (deterministic, structure-based) notions of causation, arguing that intelligent robots and animals need both, but each requires a combination of competences, and various kinds of partial competence of both kinds are possible.
What designers of artificial companions need to understand about biological ones - Aaron Sloman
Discusses some of the requirements for artificial companions (ACs), contrasting relatively easy (Type 1) goals and much harder, but more important, (Type 2) goals for designers of ACs. Current techniques will not lead to achieving Type 2 goals, but progress can be made towards those goals, by looking more closely at how the relevant competences develop in humans, and their architectural and representational requirements. A human child develops a lot of knowledge about space, time, causation, and many varieties of 3-D structures and processes. This knowledge provides the background and foundation for many kinds of learning, and will be required for development of ACs that act sensibly and give sensible advice. No AI systems are anywhere near the competences of young children. Trying to go directly to systems with adult competences, especially statistics based systems, will produce systems that are either very restricted, or very brittle and unreliable (or both).
Kantian Philosophy of Mathematics and Young Robots: Could a baby robot grow u... - Aaron Sloman
There is a sequel to this, with more emphasis on 'toddler theorems' and kinds of child science here:
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#toddler
It is not yet stable enough to be uploaded to slideshare.
Evolution of minds and languages: What evolved first and develops first in ch... - Aaron Sloman
SLIDESHARE NOW STUPIDLY DOES NOT ALLOW SLIDES TO BE UPDATED. To find the latest version of these slides go to http://www.cs.bham.ac.uk/research/projects/cogaff//talks/#talk111
The version posted here was last updated on 16 March 2015. There have been several changes since then on the alternative site. Why did Slideshare take such a stupid decision (after being bought by LinkedIn)?
A theory is presented according to which "languages" with structural variability and compositional semantics evolved in several species for *internal* use (e.g. in perception, planning, learning, forming goals, deciding, etc.) before *external* languages evolved for communication. The theory implies that such internal languages develop in young humans before a language for communication.
It is also noted that the standard notion of 'compositional semantics' has to allow the propagation of semantic content from parts to wholes to be potentially context-sensitive at every stage: i.e. current context, speaker intentions, user knowledge and shared goals can all affect how the semantics of larger parts is derived from the semantics of smaller parts plus syntactic structure. This applies as much to non-verbal languages as to verbal ones.
This theory of how human languages evolved from earlier 'internal languages' (GLs) is inconsistent with the best known published theories of evolution or development of language.
But that does not make it wrong. Moreover, this theory is supported by empirical evidence including the example of deaf children in Nicaragua: http://en.wikipedia.org/wiki/Nicaraguan_Sign_Language
BREEDING METHODS FOR DISEASE RESISTANCE.pptx - RASHMI M G
Plant breeding for disease resistance is a strategy to reduce crop losses caused by disease. Plants have an innate immune system that allows them to recognize pathogens and provide resistance. However, breeding for long-lasting resistance often involves combining multiple resistance genes.
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V... - Wasswaderrick3
In this book, we use conservation-of-energy techniques on a fluid element to derive the modified Bernoulli equation of flow with viscous or friction effects. We derive the general equation of flow/velocity, and from this we derive the Poiseuille flow equation, the transition flow equation and the turbulent flow equation. In situations where there are no viscous effects, the equation reduces to the Bernoulli equation. From experimental results, we are able to include other terms in the Bernoulli equation. We also look at cases where pressure gradients exist. We use the modified Bernoulli equation to derive equations of flow rate for pipes of different cross-sectional areas connected together. We also extend our energy-conservation techniques to a sphere falling in a viscous medium under the effect of gravity. We demonstrate Stokes' equation for terminal velocity and the turbulent flow equation. We look at a way of calculating the time taken for a body to fall in a viscous medium, and at the general equation of terminal velocity.
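For orientation, the standard textbook forms that this kind of derivation builds on (reference formulas, not the author's own results) are the Bernoulli energy balance with a viscous head-loss term h_f between two sections of a pipe, and Stokes' terminal velocity for a small sphere of radius r and density rho_s falling in a fluid of density rho_f and viscosity mu:

```latex
\frac{p_1}{\rho g} + \frac{v_1^2}{2g} + z_1
  \;=\; \frac{p_2}{\rho g} + \frac{v_2^2}{2g} + z_2 + h_f,
\qquad
v_t \;=\; \frac{2\, r^2 g\,(\rho_s - \rho_f)}{9\,\mu}
```

Setting h_f = 0 recovers the classical Bernoulli equation, which matches the book's statement that the modified equation reduces to Bernoulli when viscous effects vanish.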
Phenomics assisted breeding in crop improvement - IshaGoswami9
The population is increasing and will reach about 9 billion by 2050; together with climate change, this makes it difficult to meet the food requirements of such a large population. Facing the challenges presented by resource shortages, climate change, and an increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding the complex characteristics of multiple genes, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data linkable to genomics information for crop improvement at all growth stages have become as important as genotyping. Thus, high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology, and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes
on Io’s surface have been monitored from both spacecraft and ground-based telescopes.
Here, we present the highest spatial resolution images of Io ever obtained from a groundbased telescope. These images, acquired by the SHARK-VIS instrument on the Large
Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images
show that a plume deposit from a powerful eruption at Pillan Patera has covered part
of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive
optics at visible wavelengths.
Nucleophilic Addition of carbonyl compounds.pptxSSR02
Nucleophilic addition is the most important reaction of carbonyls. Not just aldehydes and ketones, but also carboxylic acid derivatives in general.
Carbonyls undergo addition reactions with a large range of nucleophiles.
Comparing the relative basicity of the nucleophile and the product is extremely helpful in determining how reversible the addition reaction is. Reactions with Grignards and hydrides are irreversible. Reactions with weak bases like halides and carboxylates generally don’t happen.
Electronic effects (inductive effects, electron donation) have a large impact on reactivity.
Large groups adjacent to the carbonyl will slow the rate of reaction.
Neutral nucleophiles can also add to carbonyls, although their additions are generally slower and more reversible. Acid catalysis is sometimes employed to increase the rate of addition.
ESR spectroscopy in liquid food and beverages.pptxPRIYANKA PATEL
With increasing population, people need to rely on packaged food stuffs. Packaging of food materials requires the preservation of food. There are various methods for the treatment of food to preserve them and irradiation treatment of food is one of them. It is the most common and the most harmless method for the food preservation as it does not alter the necessary micronutrients of food materials. Although irradiated food doesn’t cause any harm to the human health but still the quality assessment of food is required to provide consumers with necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during the processing of the food. ESR spin trapping technique is useful for the detection of highly unstable radicals in the food. The antioxidant capability of liquid food and beverages in mainly performed by spin trapping technique.
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Exposé invité Journées Nationales du GDR GPL 2024
What is greenhouse gasses and how many gasses are there to affect the Earth.moosaasad1975
What are greenhouse gasses how they affect the earth and its environment what is the future of the environment and earth how the weather and the climate effects.
What's vision for, and how does it work? From Marr (and earlier) to Gibson and Beyond
1. Beyond Gibson (Birmingham Vision club: 17 Jun, Sheffield 23 Jun) 2011
Last Changed (June 10, 2015) – liable to be updated.
What’s vision for, and how does it work?
From Marr (and earlier)
to Gibson and Beyond
(With some potted, rearranged, history)
Aaron Sloman
School of Computer Science, University of Birmingham
http://www.cs.bham.ac.uk/~axs/
These PDF slides are in my ‘talks’ directory:
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#gibson
And also in slideshare.net http://www.slideshare.net/asloman
A partial list of requirements for visual systems in intelligent animals and machines can be found here (work in
progress): http://www.cs.bham.ac.uk/research/projects/cogaff/misc/vision
An online discussion of Gibson by philosophers may (or may not) be useful to some:
http://theboundsofcognition.blogspot.com/2011/01/noe-on-gibson-on-affordances.html
2. The problem
• Very many researchers assume that it is obvious what vision (e.g. in
humans) is for, i.e. what functions it has, leaving only the problem of
explaining how those functions are fulfilled.
• So they postulate mechanisms and try to show how those mechanisms
can produce the required effects, and also, in some cases, try to show
that those postulated mechanisms exist in humans and other animals
and perform the postulated functions.
• The main point of this presentation is that it is far from obvious what
vision is for – and J.J. Gibson’s main achievement is drawing attention to
some of the functions that other researchers had ignored.
• I’ll present some of the other work, show how Gibson extends and
improves it, and then point out how much more there is to the functions of
vision and other forms of perception than even Gibson had noticed.
• In particular, much vision research, unlike Gibson, ignores vision’s
function in on-line control and perception of continuous processes;
and nearly all, including Gibson’s work, ignores meta-cognitive
perception, and perception of possibilities and constraints on
possibilities and the associated role of vision in reasoning.
3. High level outline 1
A short, incomplete, illustrative, history of theories of functions and
mechanisms of vision
• Previous millennia, e.g. Aristotle
• Previous centuries: lots of philosophers, including Hume, Berkeley, Kant, Russell, and
non-philosophers, e.g. (non-philosopher? polymath?) von Helmholtz.
Helmholtz proposes a “sign” theory, according to which sensations symbolize their stimuli, but are not
direct copies of those stimuli. While Müller explains the correspondence between sensation and object
by means of an innate configuration of sense nerves, Helmholtz argues that we construct that
correspondence by means of a series of learned, “unconscious inferences.”
http://plato.stanford.edu/entries/hermann-helmholtz
• AI Vision work – 1960s on
Lots of work on finding and representing structure in images (e.g. lines, junctions, and also higher
order, possibly non-continuous structures, e.g. views of occluded objects). (E.g. A. Guzman)
Also work on trying to find 3-D interpretations, using 3-D models (e.g. Roberts, Grape), or using
constraint propagation from image details, e.g. Huffman, Clowes, Winston, Waltz, ...)
Then Barrow and Tenenbaum, on “intrinsic images” (Barrow & Tenenbaum, 1978), etc. (an idea
that seems to have become popular recently)
Use of soft constraints/relaxation, G. Hinton, use of statistics and neural net inspired mechanisms.
Many ideological battles between researchers whose systems were all very limited, compared with
animal vision.
• Nearly all researchers assumed that the functions of vision were obvious and only
mechanisms/explanations were in dispute. Gibson shattered that.
4. High level outline 2: pre-Marr – 1960s onwards
Lots of image analysis routines
e.g. Azriel Rosenfeld: Many of the algorithms just transformed images
Ideas about pictures having structure, often inspired by Chomsky’s work on language
E.g. S. Kaneff (ed) Picture language machines (1970)
Analysis by synthesis/Hierarchical synthesis
Ulric Neisser Cognitive Psychology, 1967 (parallel top-down and bottom up processing using models)
Oliver Selfridge, PANDEMONIUM (partly neurally inspired?)
Model-based vision research on polyhedra
Roberts, Grape, (particular polyhedral models);
Clowes, Huffman, (model fragments: edges, faces, vertices);
Later work - Marr/Nishihara generalised cylinders; Biederman geons
More recent, more systematic work on polyhedra by Ralph Martin’s group (Cardiff).
http://ralph.cs.cf.ac.uk/Data/Sketch.html
Use of “expert systems” techniques to analyse pictures Hanson and Riseman (UMASS)
Structure from:
Stereo (many people) – badly influenced by results from random dot stereograms (Julesz)
Motion (Longuet-Higgins, Clocksin, Ullman, Spacek ....)
Intensity/shading (Horn, ...)
Texture and optical flow (Gibson)
More general relations between image fragments and scene fragments
Barrow and Tenenbaum “Recovering intrinsic scene characteristics from images” (1978)
Parallel work on pattern recognition, from earliest times (often disparaged by AI people)
5. High level outline 3: Marr 1
David Marr: papers at MIT in late 1970s, died 1980, posthumous book 1982
Died tragically young.
Enormously influential, partly for bad reasons, though very clever.
Claimed previous AI vision research was all wrong because it used artificial images (e.g. line drawings full
of ambiguities) and vision doesn’t require sophisticated information processing architectures, just a
one-way pipeline of processing stages ending with model construction (using generalised cylinders).
Convinced many people that vision is (merely?):
a process of producing 3-D descriptions of shape, size, distance, orientation, etc, from 2-D data
“reversing” the production of an image by projection.
Went against his own anti-top-down approach by trying to fit scene structures to generalised cylinders.
Stressed three levels of theory (causing much confusion):
(1) Computational, (2) Algorithmic, (3) Implementational.
In fact, this is mainly a confused way of introducing the old engineering distinctions:
(1) Requirements analysis (What’s X for?)
(2) Design (What are the high level design features for systems that meet the requirements?)
(3) Implementation (How can the design be implemented in lower-level mechanisms, physical, electronic,
physiological, or computational.)
All of those can have different levels – e.g. implementation in virtual machines, implemented in lower level
virtual machines, .... implemented in physical machines.
Marr’s Levels were badly named and far too widely accepted as important, by people without engineering
experience.
For a critique by McClamrock (1991) see http://www.albany.edu/~ron/papers/marrlevl.html
6. Marr 2
Some of his main points:
• Reject artificial images – e.g. line drawings used as test images.
– use natural images (actual photographs of real 3-D objects)
– rich in data, making tasks easier (??) [Illustrated using teddy bear photo]
He discounted the possibility of informed selection of artificial images to study well-defined problems.
• Processing pipeline: primal sketch ⇒ 2.5D sketch ⇒ 3-D interpretations
• Use of generalised cylinders (Compare Biederman’s geons)
Generalised cylinders proved unsuitable as models for many objects.
• The function of vision is to produce descriptions/representations of what’s out there:
3-D geometry, distance, surface orientations, curvature, textures, spatial relationships, colours(?).
• He shared the common assumption that metrical (Euclidean) coordinate frames are
required.
But he allowed that different frames of reference can be used for scene descriptions
– Scene centred (Use a global coordinate system for everything visible in the scene)
– Object centred (Attach coordinate systems to objects, or object parts)
– Viewer centred
o egocentric (Represent scene objects in terms of relationships with perceiver).
o allocentric (Represent scene objects in terms of relationships with another viewer).
(I am not sure Marr made this distinction. Others have.)
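To make the contrast between these frames concrete, here is a minimal sketch (not Marr's formulation; the point, frame origins, axes and function name are all invented for illustration) showing how a single scene point can be re-expressed in an object-centred or viewer-centred frame by a rigid transform:

```python
import numpy as np

def to_frame(point_world, frame_origin, frame_rotation):
    """Express a scene-centred (world) point in another frame.

    frame_rotation: 3x3 matrix whose columns are the frame's axes in world coordinates.
    frame_origin:   the frame's origin in world coordinates.
    """
    return frame_rotation.T @ (np.asarray(point_world) - np.asarray(frame_origin))

corner_scene = [2.0, 1.0, 0.5]              # a cube corner, in scene-centred coordinates

cube_origin = [2.0, 1.0, 0.0]               # object-centred frame attached to the cube
cube_axes = np.eye(3)

viewer_origin = [0.0, 0.0, 1.6]             # egocentric frame attached to the perceiver
viewer_axes = np.eye(3)

print(to_frame(corner_scene, cube_origin, cube_axes))      # object-centred description
print(to_frame(corner_scene, viewer_origin, viewer_axes))  # viewer-centred description
```

The only point of the sketch is that "the same scene" admits several coordinate descriptions; it does not commit us to the view that visual systems actually use metrical frames, an assumption questioned later in these slides.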
7. Marr 3: functions of vision
“... the quintessential fact of human vision – that it tells about shape and
space and spatial arrangement”.
Comments:
(a) This ignores the functions of vision in on-line control of behaviours:
e.g. visual servoing while grasping, moving, avoiding obstacles –
or assumes (wrongly) that these control functions can all be subsumed under the descriptive functions.
Compare: (Gibson, 1966) (Sloman, 1983)
(b) Marr’s view fits the common idea of vision as “reversing” the projection process – using
information in the optic array (or image) to construct a 3-D model of what’s in the scene.
But that common idea is mistaken: visual systems do not represent information about 3-D
structure in a 3-D model (information structure isomorphic with things represented)
but in a collection of information fragments, all giving partial information.
A model cannot be inconsistent.
A collection of partial descriptions can.
See pictures of impossible 3-D objects by Reutersvard (in later slides), Penrose and Escher.
The way we see such pictures would be impossible if certain views of 3-D vision were correct:
if percepts were 3-D models isomorphic with the objects perceived, then percepts of impossible
objects could not occur, since the models, and therefore the percepts, would themselves be impossible.
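A toy way to see the contrast (the labels and the cyclic example are invented purely for illustration): each "fragment" below records only a local nearer-than relation of the kind an image region can support. The collection as a whole can be inconsistent, as in the Reutersvard and Penrose figures, even though every individual fragment is unproblematic; a single 3-D model could not encode such a set:

```python
from collections import defaultdict

# Each fragment gives only partial, local information: "X occludes Y", i.e. X is
# nearer than Y along the line of sight in some image region.
fragments = [("A", "B"), ("B", "C"), ("C", "A")]   # cyclic, as in an impossible figure

def consistent(fragments):
    """A set of nearer-than fragments fits into one 3-D model only if it is acyclic."""
    graph = defaultdict(list)
    for near, far in fragments:
        graph[near].append(far)

    visited, on_stack = set(), set()

    def has_cycle(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt in on_stack or (nxt not in visited and has_cycle(nxt)):
                return True
        on_stack.discard(node)
        return False

    return not any(has_cycle(n) for n in list(graph) if n not in visited)

print(consistent(fragments))   # False: locally fine fragments, globally impossible scene
```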
NOTE:
Marr’s emphasis on spatial structure made it awkward for him to talk about colour vision.
8. Marr 4: A hint of a move towards Gibson’s ideas.
On p.31 of Vision, he wrote “Vision is a process that produces from images of the external
world a description that is useful to the viewer
(That’s Gibsonian)
and not cluttered with irrelevant information.”
Not cluttered? My visual contents often are!
Also:
p. 32 “Vision is used in such a bewildering variety of ways that the visual systems of
animals must differ significantly from one another”.
“For a fly the information obtained is mainly subjective...”
(Mainly concerned with image contents?? Or something like affordances for the fly??)
9. Moving beyond Marr and predecessors
Marr:
“... the quintessential fact of human vision – that it tells about shape and space and
spatial arrangement”.
What else could there be, besides shape and space and spatial arrangement?
Lots!
Gibson noted some of it.
(Starting in the 1960s (Gibson, 1966).)
There is a useful but brief discussion of Gibson’s ideas and how they relate to AI in section 7.v of
(Boden, 2006), pp. 465–472.
Philosophers who are ignorant of that work by Gibson think the ideas came up much later,
when philosophers and others started discussing embodied cognition, around 20 years
after Gibson’s first book.
Some of the ideas were also in (Simon, 1969).
Unfortunately, some philosophers teach their students to think of philosophy as a subject that can be
pursued in ignorance of science because philosophy studies “purely conceptual” problems, or something
like that.
10. Moving Beyond Marr and predecessors 2
Are visual contents spatial?
Yes – but only some of them.
Contrasting two sorts of ambiguous image makes this clear:
Ask what changes in what you see:
In some cases it is shape, and space and spatial arrangement, as Marr claimed.
But not in all cases (Sloman, 2001):
Everything that changes in the “Necker flip” is spatial: orderings, directions, relative distances, orientations.
In the duck-rabbit, however, there is no geometric flip: the change is much more abstract and involves both
changes in how parts are identified (e.g. ears vs bill) and more abstract notions like “facing this way”, “facing
that way” which presupposes that other organisms can be seen as perceivers and potential movers.
That’s the basis of vicarious affordances and social affordances.
We can see some things as more than physical objects.
It’s perception, not cognition, because of how the perceptual details are in registration with the optic array.
Gibson noticed other abstract features of visual contents: features related to possible actions.
11. Visual contents can be very abstract, non-physical
Do the eyes look different? Why? How?
Compare illusory contours – Kanizsa
http://en.wikipedia.org/wiki/Illusory_contours
Problems
• What sorts of non-shape (including non-physical) information can a visual system get
from the environment? (intention, mood, emotion, effort... compare Johansson movies)
• Why are these kinds of information useful – to what sorts of agents?
• How is the information represented in the perceiver?
• What mechanisms derive it from what information?
• What are the roles of evolution and development/learning in the creation/modification of
such mechanisms?
• How do these mechanisms and forms of representation interact with others?
• How do the interactions help the perceiver, or the group?
The eyes are geometrically identical.
12. Beyond sensory-motor contents
There are many philosophers and scientists who assume, like the old
empiricist philosophers, that all semantic contents must be derived from
experience, i.e. that all concepts must be defined in terms of processes of
abstraction from sensory contents, or sensory-motor contents.
This “symbol–grounding theory” is a reinvention of “concept empiricism”.
• However Kant argued in (Kant, 1781) that it is impossible to have experiences at all
without having some concepts initially, e.g. of spatial structure and relations of
adjacency, containment, etc.
• Then 20th Century Philosophers (Carnap, Nagel, Pap, and others) showed that many of
the concepts required for scientific theories (e.g. atom, charge on an electron, gene,
valence) cannot be derived from experience of instances.
• Alternative to symbol-grounding theory: some concepts are implicitly defined by their
roles in deep explanatory theories, where the structure of the theory determines the
possible models of the theory.
• In order to rule out unwanted models, “bridging rules” can be used, e.g. specifying how
to measure values corresponding to certain concepts (e.g. “charge on an electron”) and
how to make predictions. (Carnap, 1947)
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#models
13. Beyond sensory-motor contents
The existence of ambiguous figures shows clearly that the sensory (e.g. retinal) contents
do not uniquely determine what concepts are used in a percept.
Moreover, in cases like the Necker cube, what changes when the percept flips is 3-D
structure.
The concepts of 3-D structure cannot be defined in terms of concepts of patterns in 2-D
structures or 2-D structures plus motor signals.
Going from retinal contents to 3-D percepts involves creative interpretation that goes
beyond the data.
Likewise going from retinal contents to percepts of animal parts and the perception of an
animal facing one way rather than another is not just a matter of grouping or abstracting
sensory or sensory-motor contents, or storing 2-D or 3-D structural information.
See also (Sloman, 2009c) and
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#models
The point is also illustrated by affordances: perceived possibilities and constraints, as shown below.
14. 3-D structures – and possible actions
Here (on the right) is part of a picture by Swedish artist,
Oscar Reutersvärd (1934) which you probably see as a
configuration of coloured cubes.
As with the Necker cube you have experiences of both 2-D
lines, regions, colours, relationships and also 3-D surfaces,
edges, corners, and spatial relationships.
You probably also experience various affordances: places you could touch the surfaces, ways you could grasp
and move the various cubes (perhaps some are held floating in place by magnetic fields).
E.g. you can probably imagine swapping two of them, thinking about how you would have to grasp them in the
process – e.g. swapping the white one with the cube to the left of it, or the cube on the opposite side.
15. Experienced Possibilities, and representations
All of this suggests that much of what is experienced visually is collections
of possibilities for change and constraints on change –
where these possibilities are attached to various fragments of the scene (via fragments of
the image).
I.e. the possibilities and the constraints (e.g. obstacles) are located in 3-D space, in the
environment, just as the objects and object fragments are.
Question: how are all these information fragments represented?
• Anyone who thinks it is possible for organisms or robots to do without representations
must think it is possible for them to do without information.
• For information must somehow be encoded and whatever encodes the information is a
representation – which could include a physical or virtual structure and much of its
context.
• The representations used may be physical structures or structures in virtual machines –
like the information a chess program has about the current state of the board, and the
information about possible moves and their consequences.
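For instance, here is a deliberately toy sketch of such a virtual-machine representation (not a claim about how chess programs or brains actually encode things; the board, piece and blocked squares are invented): the "board" exists only as a data structure, yet it supports enumerating possible moves and their consequences:

```python
# A toy virtual-machine state: where a lone king stands on a small board, plus which
# squares are blocked. Nothing here is a physical picture of a board; it is information
# encoded in data structures.
state = {"king": (2, 2), "blocked": {(2, 3), (1, 1)}, "size": 4}

def possible_moves(state):
    """Enumerate possible one-square king moves and the state each would produce."""
    x, y = state["king"]
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            if (dx, dy) == (0, 0):
                continue
            if 0 <= nx < state["size"] and 0 <= ny < state["size"] and (nx, ny) not in state["blocked"]:
                yield (nx, ny), {**state, "king": (nx, ny)}

for move, consequence in possible_moves(state):
    print("move to", move, "-> king now at", consequence["king"])
```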
16. 2-D and 3-D Qualia: back to Reutersvard
Here (on the right) is part of a picture by Swedish artist,
Oscar Reutersvärd (1934) which you probably see as a
configuration of coloured cubes.
As with the Necker cube you have experiences of both 2-D
lines, regions, colours, relationships and also 3-D surfaces,
edges, corners, and spatial relationships.
You probably also experience various affordances: places you could touch the surfaces, ways you could grasp
and move the various cubes (perhaps some are held floating in place by magnetic fields).
E.g. you can probably imagine swapping two of them, thinking about how you would have to grasp them in the
process – e.g. swapping the white one with the cube to the left of it, or the cube on the opposite side.
The second picture on the right (from which the first one was
extracted) has a richer set of 2-D and 3-D contents.
Again there is a collection of 2-D contents (e.g. a star in the middle),
plus experience of 3-D structures, relationships and affordances: with
new possibilities for touching surfaces, grasping cubes, moving cubes.
The picture is outside you, as would the cubes be if it were not a picture.
But the contents of your experience are in you: a multi-layered set of
qualia: 2-D, 3-D and process possibilities.
But the scene depicted in the lower picture is geometrically impossible,
even though the 2-D configuration is possible and exists, on the screen
or on paper, if printed: the cubes, however, could not exist as shown.
So your qualia can be inconsistent!
That’s impossible according to some theories of consciousness.
They can also go unnoticed: See http://tinyurl.com/uncsee
17. There are many sorts of things humans can see
besides geometrical properties:
• that one object is supported by another,
• that one object constrains motion of another (e.g. a window catch),
• that something is flexible or fragile,
• which parts of an organism are ears, eyes, mouth, bill, etc.,
• which way something is facing,
• what action some person or animal is about to perform (throw, jump, run, etc.),
• whether an action is dangerous,
• whether someone is happy, sad, angry, etc.,
• whether a painting is in the style of Picasso...
• what information is available about X
• which changes in the scene or changes of viewpoint will alter available information
about X.
and other epistemic affordances.
Reading music or text fluently are important special cases, illustrating how learning/training can extend a
visual architecture. (Sloman, 1978, Ch.9)
18. Attaching affordance information to scene fragments
I have argued that a lot of the content of visual perception (at least in adult
humans) takes the form of information about what is and is not possible
(possibilities and constraints) in various locations in the perceived scene –
along with realised possibilities when processes are perceived
(e.g. objects or surfaces moving, rotating, deforming, interacting).
• The Reutersvard image shows how quite rich collections of possibilities can be
“attached” to various scene fragments,
• Where some of the information is about possibilities and conditionals:
– X is possible, Y is possible, Z is possible
– if X occurs then, ...
– if Y occurs then, ...
– if Z occurs then, ...
• including “epistemic affordances”
(if motion M occurs, then information I will become available/or become inaccessible).
• This can be seen as (yet another) generalisation of aspect graphs which encode only
information about how changes of viewpoint (or rotations of a perceived object) will
change what is and is not visible.
http://people.csail.mit.edu/bmcutler/6.838/project/aspect_graph.html#1
• An important special case of this idea concerns the effects of different kinds of material
on what would happen if various things were to happen:
e.g. material being: rigid, plastic, elastic, liquid, viscous, sticky, etc.
See http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#babystuff
The idea of generalising the concept of “aspect graph” as suggested here arose in a discussion with Jeremy
Wyatt several years ago.
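One crude way to picture the suggestion (a hypothetical data structure invented for illustration, not a proposal about neural encoding): each scene fragment carries its own bundle of possibilities, constraints and conditional consequences, including epistemic ones:

```python
# Hypothetical scene description: possibilities, constraints and conditionals are
# attached to fragments of the scene, not stored in one global model.
scene = {
    "cube_white": {
        "possible":   ["grasp from above", "slide left"],
        "impossible": ["slide right"],                     # blocked by a neighbouring cube
        "conditionals": {
            "if slid left": "a gap opens behind it",
            "if viewer steps right": "its hidden face becomes visible",  # epistemic affordance
        },
    },
    "cube_red": {
        "possible":   ["push forward"],
        "impossible": ["lift"],                            # e.g. too heavy
        "conditionals": {"if pushed forward": "it will occlude the white cube"},
    },
}

def affordances_at(fragment):
    """Return the possibility/constraint/conditional bundle attached to one fragment."""
    info = scene[fragment]
    return info["possible"], info["impossible"], info["conditionals"]

print(affordances_at("cube_white"))
```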
19. J. J. Gibson started a revolution
The Senses Considered as Perceptual Systems, 1966,
The Ecological Approach to Visual Perception, (Gibson, 1979).
For organisms the function of vision (more generally perception)
– is not to describe some objective external reality
– but to serve biological needs
in importantly different ways for different organisms.
The different functions arose at different stages in biological evolution, as both the physical
environment became more complex and more structured, and the organisms and their
predators, prey, mates, and offspring became more complex.
Some of the consequences were physical – development of independently movable
manipulators: hands.
That had implications for information processing requirements.
E.g. use of an exosomatic rather than a somatic ontology.
Somatic = Marr’s ’subjective’ ??
20. Gibson’s Revolution
The Ecological Approach to Visual Perception, 1979
For organisms the function of vision (more generally perception) is not to
describe some objective external reality but to serve biological needs
• Providing information about positive and negative affordances (what the animal can and
cannot do in a situation, given its body, motor capabilities, and possible goals)
• Using invariants in static and changing optic arrays
texture gradients, optical flow patterns, contrast edges, “common fate”.
• Using actions to probe the environment, e.g. so as to change contents of the optic array
The sensors and effectors work together to form “perceptual systems” (Gibson, 1966)
(compare “active vision”)
Vision is highly viewer-centred and action-centred.
The crazy bits in Gibson’s theories:
• There are no internal representations, no reasoning.
• Perception is immediate and direct (“pickup”, not “interpretation”)
Also a brilliant idea:
The idea that the input to vision processing is not retinal images but an optic array, whose
changes are systematically coupled to various kinds of actions, was a brilliant move:
The retina is mainly a device for sampling optic arrays: cones of information converging
on viewpoints.
Perhaps in combination with brain area V1?
21. The problem with James Gibson
Although he continually emphasised information, e.g. the information available in static
and changing optic arrays, he denied the need for representations or information
processing (computation) using a mysterious concept of “direct pickup” instead.
He provided many important insights regarding interactions between vision and action, and
the episodic information in vision, but ignored other roles for vision, e.g.
• multi-step planning,
If I move the pile of bricks to the right, and push that chair against the bookcase, and stand on it, I’ll be
able to reach the top shelf.
• seeking explanations
Can marks on the road and features of the impact suggest why the car crashed into the lamp post?
• understanding causation
(apart from immediate perception of causation, as in Michotte’s experiments)
If this string is pulled down, that pulley will rotate clockwise, causing that gear to turn, and ....
• geometric reasoning
The line from any vertex of any triangle to the middle of the opposite side produces two smaller
triangles of the same area, even when the shapes are different. (A quick numeric check follows this list.)
• Design of new machines, tools, functional artifacts (e.g. door-handles).
• Perceiving intentional actions. Fred is looking for something.
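As a sanity check on the geometric claim above (a numeric check over randomly generated triangles, not a proof, and certainly not a model of how a human or animal visual system reasons):

```python
import random

def area(p, q, r):
    """Area of triangle pqr via the shoelace formula."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2

for _ in range(5):
    a, b, c = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(3)]
    m = ((b[0] + c[0]) / 2, (b[1] + c[1]) / 2)      # midpoint of the side opposite a
    print(round(area(a, b, m), 6), round(area(a, m, c), 6))   # the two halves always match
```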
Contrast Eleanor J. Gibson and Anne D. Pick, An Ecological Approach to Perceptual Learning and
Development, OUP, 2000.
They allow for the development of a wider range of cognitive competences using vision.
But even they don’t allow for learning by working things out.
22. Example: Perceiving causation
Often our ability to perceive causal connections is used in humour:
From a book of “French Cartoons”.
Some cartoons depend on our ability to see processes of various sorts.
In the above it’s a past process reaching into the present.
Some cartoons present a future process extending from the present –
e.g. a pompous pontificator heading unwittingly for a fall, collision, come-uppance, etc.
23. All organisms are information-processors
but the information to be processed has changed
and so have the means
From microbes to hook-making crows:
How many transitions in information-processing powers were required?
Contrast these transitions:
• transitions in physical shape, size and sensory-motor subsystems
• transitions in information processing capabilities.
Fossil records don’t necessarily provide clues.
http://users.ox.ac.uk/~kgroup/tools/crow_photos.shtml
http://users.ox.ac.uk/~kgroup/tools/movies.shtml
24. Environments have agent-relative structure
The environments in which animals evolve, develop, compete, and
reproduce, vary widely in their information-processing requirements and
opportunities.
If we ignore that environmental richness and diversity, our theories will be
shallow and of limited use.
In simple environments everything can in principle be represented numerically, e.g. using numbers for
location coordinates, orientations, velocity, size, distances, etc.
In practice the optic cone may lack the required clarity and detail; but for many purposes it may suffice to
perceive partial orderings and topological relationships, as illustrated in
http://www.cs.bham.ac.uk/research/projects/cosy/papers/changing-affordances.html
In more complex environments, things to be represented include:
• Structures and structural relationships, e.g. what is inside, adjacent to, connected with,
flush with, in line with, obstructing, supporting...
• Different sorts of processes, e.g. bending, twisting, flowing, pouring, scratching,
rubbing, stretching, being compressed.
• Plans for future actions in which locations and arrangements and combinations of
things are altered (e.g. while building a shelter).
• Intentions and actions of others.
• Past and future events and generalisations.
How can all those be represented? How can the information be used?
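A minimal sketch of one alternative to global metrical coordinates (objects and relations invented purely for illustration): qualitative spatial knowledge held as a partial ordering and queried transitively, with no numbers anywhere:

```python
# Qualitative spatial knowledge as a partial ordering: no coordinates, no distances.
nearer_than = {("cup", "plate"), ("plate", "door")}    # cup nearer than plate, plate nearer than door

def nearer(a, b, relation):
    """Is a known to be nearer than b, using only the partial ordering (transitively)?"""
    seen, frontier = set(), {a}
    while frontier:
        x = frontier.pop()
        if (x, b) in relation:
            return True
        seen.add(x)
        frontier |= {far for (near, far) in relation if near == x and far not in seen}
    return False

print(nearer("cup", "door", nearer_than))   # True, without ever measuring a distance
print(nearer("door", "cup", nearer_than))   # False: not derivable from what is certain
```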
25. Varied environments produce varied demands
Types of environment with different information-processing requirements
• Chemical soup
• Soup with detectable gradients
• Soup plus some stable structures (places with good stuff, bad stuff, obstacles, supports,
shelters – requiring enduring location maps.)
• Environments that record who moved where (permitting stigmergy).
• Things that have to be manipulated to be eaten (disassembled)
• Controllable manipulators
• Things that try to eat you
• Food that tries to escape
• Mates with preferences
• Competitors for food and mates
• Collaborators that need, or can supply, information.
• and so on .....
How do the information-processing requirements
change across these cases?
For more on evolutionary transitions see
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#talk89
Genomes for self-constructing, self-modifying information-processing architectures
26. Beyond affordances and invariants
• Vision does not just have one function, but many, and the functions are extendable
through learning and development – building extensions to the architecture.
E.g. in humans: reading text, music, logic, computer programs, seeing functional relations,
understanding other minds, ....
• Vision deals with multiple ontologies
• Vision is not just about what’s there but (as Gibson says) also about what can happen
• But what can happen need not be caused by or relevant to the viewer’s goals or actions
Trees waving in the breeze, clouds moving in the sky, shadows moving on the ground, leaves blown by
the wind.
• Besides action affordances there are also epistemic affordances concerned with
availability of information. (Compare (Natsoulas, 1984))
• Besides affordances for the viewer some animals can see vicarious affordances, i.e.
affordances for others
including predators, prey, potential mates, infants who need protection, etc.
• Seeing structures, relationships, processes, and causal interactions (or fragments
thereof) not relevant to the goals, needs, actions, etc. of the viewer can make it possible to
do novel things in future, by combining old competences.
Great economies and power are introduced by using an ontology that is exosomatic, amodal,
viewer-neutral. (Still missing from current robots?)
• Functions of vision for other organisms may not be obvious:
e.g. Portia Spider. (Tarsitano, 2006) http://dx.doi.org/10.1016/j.anbehav.2006.05.007
27. Beyond JJG: Generalised Gibsonianism (GG)
Less constrained analysis of the many functions of vision,
including roles in mathematical reasoning (geometrical, topological, logical, algebraic),
and various roles in a robot capable of seeing and manipulating 3-D structures
leads to an extension of Gibson’s theories,
while accepting his rejection of the naive view (e.g. Marr) that the function of vision is only
to provide information about what objectively exists in the environment.
In particular, don’t expect one set of functions to be common to all animals that use vision.
Many species use vision only for the online control of behaviour,
using many changes in features of optic array, and correlations of those changes with actions, to provide
information about what can be or should be done immediately (e.g. the need to decelerate to avoid hard
impact, the need to swerve to avoid an obstacle, the possibility of reaching forward to grasp something).
In contrast, humans (though not necessarily newborn infants) and possibly some other
species, use vision for other functions that go beyond Gibson’s functions.
Moreover, in order to cope with novel structures, processes, goals and actions, some
animals need vision to provide lower level information than affordances, information that is
potentially shareable between different affordances: “proto-affordances”.
Information about what changes are possible in a situation, and what constraints there are
relating changes – independently of particular actions. (Sloman, 1996)
E.g. these two surface fragments could move closer; this surface fragment will obstruct that motion, that
rolling ball cannot fit through that gap....
28. Beyond behavioural expertise
Karmiloff-Smith has drawn attention to the fact that once behavioural
expertise in some domain has been achieved (e.g. ability to move at varying
speeds, keeping upright on rough ground, avoiding obstacles, going through
gaps, keeping up with others, etc.) there are sometimes new competences
to be acquired, which she describes as involving “Representational
Redescription”.
Example: after learning how to run with the herd some animals may develop the ability to
remember what they did and did not do, and why they failed to do certain things. They
might also be able not only to behave but also to reason about possible behaviours and
their consequences prior to producing those behaviours (reasoning that is often a necessary
prerequisite for producing sensible behaviour).
See (Karmiloff-Smith, 1992) for more on varieties of representational redescription and their
consequences.
I suspect that in some cases it’s a change in architecture rather than a change in
representation that’s important.
I have tried to bring out some of the implications of the book here
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/beyond-modularity.html
(Work in progress.)
Many researchers in robotics and cognition, especially embodied cognition, ignore
differences between behavioural expertise and additional requirements such as knowing
why you did not do something, and what would have happened if you had done it.
29. The ability to see possible changes
Seeing simple proto-affordances involves seeing what processes are and
are not possible in a situation.
Seeing compound proto-affordances involves seeing what (serial or parallel)
combinations of processes are possible and what constraints exist at
different stages in those combinations.
In each of these four configurations,
consider the question:
Is it possible to slide the rod with a blue end
from the “start” position to the “finish”
position within the square, given that the
grey portion is rigid and impenetrable?
Other questions you could ask include:
If it is possible, how many significantly
different ways of doing it are there?
(Based on a similar idea by Jeremy Wyatt)
This task uses the ability to detect the
possibility of movements that are not
happening,
and the constraints limiting such movements,
and to visualise combinations of such movements, while inspecting them for consequences:
Using what brain mechanisms?
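One way to make the task computationally concrete (a deliberately crude discretisation invented for illustration; it says nothing about the brain mechanisms asked about above): treat each placement of the rod as a configuration and search the space of configurations for a path from start to finish:

```python
from collections import deque

# Toy discretisation: the rod occupies two adjacent cells of an N x N grid, the grey
# portion is a set of impenetrable cells, and we ask whether the finish configuration
# is reachable from the start configuration.
N = 5
obstacles = {(2, 1), (2, 2), (2, 3)}          # a rigid wall across the middle
start  = ((0, 0), (0, 1))                     # the two cells the rod occupies initially
finish = ((4, 3), (4, 4))

def free(cell):
    x, y = cell
    return 0 <= x < N and 0 <= y < N and cell not in obstacles

def neighbours(config):
    (x1, y1), (x2, y2) = config
    # Slide the whole rod one cell in any of the four directions.
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        moved = ((x1 + dx, y1 + dy), (x2 + dx, y2 + dy))
        if all(free(c) for c in moved):
            yield moved
    # Pivot the far end about the first cell (simplification: the swept area is ignored).
    for other in ((x1 + 1, y1), (x1 - 1, y1), (x1, y1 + 1), (x1, y1 - 1)):
        if free(other):
            yield ((x1, y1), other)

def reachable(start, finish):
    seen, queue = {start}, deque([start])
    while queue:
        config = queue.popleft()
        if config in (finish, (finish[1], finish[0])):
            return True
        for nxt in neighbours(config):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

print(reachable(start, finish))
```

Counting the "significantly different" ways is much harder: it requires grouping paths into qualitatively distinct classes, which seems closer to what the perceived proto-affordances deliver directly.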
30. Connections with evolution of mathematical competences
Long before there were mathematics teachers, our ancestors must have begun to notice
facts about geometrical shapes that were later codified in Euclidean geometry.
Long before forms of logical reasoning powerful enough to serve the purposes of
mathematicians were discovered/created by Frege, Russell and others in the 19th century,
mathematicians were making discoveries and proving them, using their ability to notice
possibilities for change, and invariants across change, in geometrical configurations.
I’ve discussed examples in (Sloman, 1968/9, 1971, 1978, 1996, 2010a)
If my generalisations of Gibson’s notion of “perception of affordance” to cover a very much
wider variety of perceptions of possibilities for change and constraints on change than he
mentioned, are correct, then we can see how some of the roots of mathematical cognition
as demonstrated in the discovery and proof of theorems in Euclidean geometry and
topology may have developed from ancient animal abilities to perceive and reason about
affordances required for selecting complex goals and plans.
Some fragments of Euclidean geometry concerned with possibilities are discussed here:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/triangle-theorem.html
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/triangle-sum.html
Such animal capabilities were discussed by Kenneth Craik in (Craik, 1943).
I have tried to illustrate these capabilities in the processes of discovery of “toddler theorems” in young
children, in (Sloman, 2008d).
For more information about toddler theorems and their relationships to Annette Karmiloff-Smith’s ideas
about “Representational Redescription” (Karmiloff-Smith, 1992) see
http://tinyurl.com/BhamCog/misc/toddler-theorems.html
31. Multi-strand relationships and processes
I discussed some of the representational problems arising from complexities
of perception of processes at a conference in 1982 (Sloman, 1983).
In particular:
• Perceived objects whose parts move continuously can change their shapes, increasing
or decreasing the numbers of features – e.g. a line growing a blob, a blob growing an
appendage, two shapes merging, a moving string acquiring new changes of curvature,
new inflection points, new contact points, or new junctions or knots.
• When two complex objects with parts are perceived together (e.g. two hands), not only
are the wholes related but also the parts: they are related both within and across
objects, e.g. a part of one object aligning with a part of another, and the relations can be
of different sorts – metrical, topological, causal, etc.: “Multi-strand relationships”.
• When objects exhibiting multi-strand relationships move or change shape, several of the
pre-existing relationships can change in parallel: “Multi-strand processes”.
• Perceiving and understanding multi-strand relationships and processes is required for
many physical actions, e.g. washing up cups, saucers, bowls and cutlery, in a bowl of
water using a dishmop, dressing or undressing a child, reversing a car into a narrow
parking space, and many more.
• Description logics can be used to express static multi-strand relationships, but it is not
clear whether they are useful for perception of complex multi-strand processes in which
collections of relationships alter continuously or go in and out of existence (e.g. in a
ballet performance, or rapids in a river).
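A toy sketch of the idea (object names, parts and coordinates all invented): compute the qualitative relations between the parts of two objects, move one object, and notice how many of the part-part relations change in parallel:

```python
# Parts of two objects, each given a toy 2-D position.
cup   = {"rim": (0.0, 1.0), "handle": (0.6, 0.5), "base": (0.0, 0.0)}
brush = {"tip": (0.5, 1.2), "shaft": (0.5, 0.6)}

def relations(a, b):
    """Qualitative relations between every part of a and every part of b."""
    rels = {}
    for pa, (xa, ya) in a.items():
        for pb, (xb, yb) in b.items():
            rels[(pa, pb)] = ("above" if ya > yb else "below",
                              "left-of" if xa < xb else "right-of")
    return rels

before = relations(cup, brush)
brush_moved = {part: (x - 0.6, y - 0.8) for part, (x, y) in brush.items()}   # one movement
after = relations(cup, brush_moved)

changed = {pair for pair in before if before[pair] != after[pair]}
print(len(changed), "of", len(before), "part-part relations changed in a single movement")
```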
32. Constraints on mechanisms
The problem of speed
Some experiments demonstrated here
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#talk88
show the speed with which processing of an entirely new image can
propagate to high level perceptual mechanisms – in humans.
The speed is hard to explain.
A possible explanation might refer to interactions between various sorts of dynamical
system, processing interpretations at different levels of abstraction in parallel – as
illustrated briefly in the following slides.
I conjecture that much visual learning (in humans and other species) involves growing new
dynamical systems that lie dormant much of the time but can be very rapidly activated by
information propagated upward, sideways or downward in the architecture –
in some cases triggering rapid actions (e.g. innate or learnt reflexes).
Competition and cooperation within and between such systems can cause rapid
convergence to a particular combination of active sub-systems (multiple cooperating
dynamical systems), whose continued behaviour is then driven partly by current changing
perceptual inputs – when looking at processes in complex, changing scenes.
Sometimes there is much slower convergence, or non-convergence, in difficult situations:
e.g. poor light, things moving too fast, blurred vision.
This extends the ideas demonstrated in the POPEYE program reported in (Sloman, 1978, Chapter 9).
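A highly simplified sketch of the kind of competition and cooperation gestured at here (the numbers and update rule are invented; this is a toy relaxation dynamic, not a model of cortical circuits): several candidate interpretations receive bottom-up support and inhibit one another until one rapidly dominates:

```python
import numpy as np

support = np.array([0.2, 0.9, 0.4])    # bottom-up evidence for interpretations A, B, C
activation = np.zeros(3)
inhibition, decay, dt = 1.5, 0.5, 0.1

for step in range(200):
    competitors = activation.sum() - activation            # total activity of the others
    change = support - decay * activation - inhibition * competitors
    activation = np.clip(activation + dt * change, 0.0, None)

print(activation.round(3))   # converges with B dominant and A, C suppressed
```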
33. Dynamical systems I
Many researchers who emphasise the importance of embodiment also
emphasise dynamical systems —
Especially dynamical systems closely coupled with the environment –
Where the nature of the coupling depends on the agent’s morphology and
sensory motor systems.
34. Dynamical systems II
A more complex dynamical system may involve large numbers of sub-systems many of
which are dormant most of the time but can be re-activated as needed.
E.g. Perceiving, adaptation, learning, acting, self-monitoring can all involve information processed at multiple
levels of abstraction.
Hypothetical reasoning: Science, mathematics, philosophy...
Some of the more abstract processes may be totally decoupled from the environment – but may produce
results that are stored in case they are useful...
Which species can do what? – What are intermediate stages:
– in evolution?
– in individual development?
Do not believe symbol-grounding theory: use theory-tethering instead.
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#models
(Label proposed by Jackie Chappell).
35. Further Issues That Need To Be Addressed
There’s lots more to be said about the following issues, some discussed in online papers:
• Partial list of requirements for visual systems in intelligent animals and machines (many
still not met): http://www.cs.bham.ac.uk/research/projects/cogaff/misc/vision
• Alternatives to use of global metrical coordinate systems for spatial structures and
relationships: e.g. use networks of partial orderings (Contrast (Johansson, 2008))
• Alternatives to the use of probability distributions to cope with noise and uncertainty,
e.g. use more abstract representations that capture only what is certain about structures, relationships
and processes, e.g. X is getting closer to Y.
http://www.cs.bham.ac.uk/research/projects/cosy/papers/#dp0702
Predicting affordance changes.
• Perception of processes of many kinds including not just spatial change, but also causal
and epistemic changes, and intentional actions.
• The importance of perception of multi-strand relationships (not only objects have
perceivable relationships, but also parts and fragments of those objects).
• Perception of multi-strand processes – processes in which there are multiple concurrent
sub-processes, e.g. changing relationships.
• Why the idea that symbols need “grounding” is totally misconceived
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#models
• using a perceptual ontology that includes kinds of stuff.
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#babystuff
• How changes in ontologies, forms of representation, architectures, and uses of visual
information, occur during evolution, development and learning.
36. Philosophy is one of the relevant disciplines
For example
• Conceptual analysis
• Metaphysics
• Kinds of meaning – semantics, ontologies
• What sorts of things can biological evolution do (in principle)?
• What is a form of representation?
Donald Peterson, Ed., Forms of representation: an interdisciplinary theme for cognitive science, Intellect
Books, 1996.
• What forms of representation can a genome use?
• Kinds of machinery: e.g. physical, virtual
• Kinds of causation: e.g. in virtual machinery
NOTE: The ideas presented here are partly summarised and combined with a discussion of
evolution of language in
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#talk111
Talk 111: Two Related Themes (intertwined)
What are the functions of vision?
How did human language evolve?
(Languages are needed for internal information processing, including visual processing)
37. Summary of Main Points
Much vision research, unlike Gibson’s, ignores functions of vision in
on-line control and perception of continuous processes.
Work in robotics has begun to address small subsets of this.
Much vision research, including Gibson’s, ignores
• meta-cognitive perception (requiring meta-semantic competences)
(e.g. perceiving intentions, mood, how others feel, what they are trying to do, what they are
trying to communicate, etc., where meta-semantic concepts are part of the ontology required)
• perception of possibilities and constraints on possibilities, including many kinds
of affordance besides affordances for immediate action by the perceiver,
including proto-affordances, vicarious affordances, auto-vicarious affordances, epistemic
affordances, multi-step branching affordances, geometric and topological constraints, ...
• the associated role of vision in reasoning and doing mathematics,
e.g. using actual or imagined diagrams, or visualising possible changes and their
consequences, in a perceived machine or other complex structure.
• The visual mechanisms that allow cognitive contents to be kept in registration
with the optic array ((Trehub, 1991) attempts to address this)
Not all organisms with visual systems have all these capabilities: identifying subsets and
how they evolved is a major research problem – part of the Meta-Morphogenesis project:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-morphogenesis.html
Even young humans develop them piecemeal in ways that depend in part on their physical
and social environment. See (Karmiloff-Smith, 1992; Sloman, 2008d; Chappell & Sloman, 2007)
NOTE: Proto-affordances were also recognized in (Siegel, 2014), and the same label proposed.
38. Related work (a small subset)
References
Barrow, H., & Tenenbaum, J. (1978). Recovering intrinsic scene characteristics from images. In A. Hanson & E. Riseman (Eds.), Computer vision systems
(pp. 3–26). New York: Academic Press.
Boden, M. A. (2006). Mind As Machine: A history of Cognitive Science (Vols 1–2). Oxford: Oxford University Press.
Carnap, R. (1947). Meaning and necessity: a study in semantics and modal logic. Chicago: Chicago University Press.
Chappell, J., & Sloman, A. (2007). Natural and artificial meta-configured altricial information-processing systems. International Journal of Unconventional
Computing, 3(3), 211–239. Available from http://www.cs.bham.ac.uk/research/projects/cogaff/07.html#717
Craik, K. (1943). The nature of explanation. London, New York: Cambridge University Press.
Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. Boston: Houghton Mifflin.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston, MA: Houghton Mifflin.
Johansson, I. (2008). Functions and Shapes in the Light of the International System of Units. Int Ontology Metaphysics, 9, 93–117. Available from
DOI: 10.1007/s12133-008-0025-z
Kant, I. (1781). Critique of pure reason. London: Macmillan. (Translated (1929) by Norman Kemp Smith)
Karmiloff-Smith, A. (1992). Beyond Modularity: A Developmental Perspective on Cognitive Science. Cambridge, MA: MIT Press.
Natsoulas, T. (1984, July). Towards the Improvement of Gibsonian Perception Theory. J. for the Theory of Social Behaviour, 14(2), 231–258. Available from
DOI:10.1111/j.1468-5914.1984.tb00496.x
Siegel, S. (2014). Affordances and the Contents of Perception. In B. Brogaard (Ed.), Does Perception Have Content? (pp. 39–76). USA: OUP. Available from
http://philpapers.org/rec/SIEAAT
Simon, H. A. (1969). The sciences of the artificial. Cambridge, MA: MIT Press. (Second edition 1981)
Sloman, A. (1968/9). Explaining Logical Necessity. Proceedings of the Aristotelian Society, 69, 33–50. Available from
http://www.cs.bham.ac.uk/research/projects/cogaff/62-80.html#1968-01
Sloman, A. (1971). Interactions between philosophy and AI: The role of intuition and non-logical reasoning in intelligence. In Proc. 2nd IJCAI (pp. 209–226).
London: William Kaufmann. Available from http://www.cs.bham.ac.uk/research/cogaff/62-80.html#1971-02 (Reprinted in Artificial
Intelligence, vol 2, 3-4, pp 209-225, 1971)
Sloman, A. (1978). The computer revolution in philosophy. Hassocks, Sussex: Harvester Press (and Humanities Press). Available from
http://www.cs.bham.ac.uk/research/cogaff/62-80.html#crp
Sloman, A. (1983). Image interpretation: The way ahead? In O. Braddick & A. Sleigh. (Eds.), Physical and Biological Processing of Images (Proceedings of
an international symposium organised by The Rank Prize Funds, London, 1982.) (pp. 380–401). Berlin: Springer-Verlag. Available from
http://www.cs.bham.ac.uk/research/projects/cogaff/06.html#0604
Sloman, A. (1989). On designing a visual system (towards a gibsonian computational model of vision). Journal of Experimental and Theoretical AI, 1(4),
289–337. Available from http://www.cs.bham.ac.uk/research/projects/cogaff/81-95.html#7
Sloman, A. (1993). The mind as a control system. In C. Hookway & D. Peterson (Eds.), Philosophy and the cognitive sciences (pp. 69–110). Cambridge, UK:
Cambridge University Press. Available from http://www.cs.bham.ac.uk/research/projects/cogaff/81-95.html#18
Sloman, A. (1994). How to design a visual system – Gibson remembered. In D.Vernon (Ed.), Computer vision: Craft, engineering and science (pp. 80–99).
Berlin: Springer Verlag.
Sloman, A. (1996). Actual possibilities. In L. Aiello & S. Shapiro (Eds.), Principles of knowledge representation and reasoning: Proc. 5th int. conf. (KR ‘96)
(pp. 627–638). Boston, MA: Morgan Kaufmann Publishers. Available from http://www.cs.bham.ac.uk/research/cogaff/96-99.html#15
Sloman, A. (2001). Evolvable biologically plausible visual architectures. In T. Cootes & C. Taylor (Eds.), Proceedings of British Machine Vision Conference
(pp. 313–322). Manchester: BMVA. Available from http://www.cs.bham.ac.uk/research/projects/cogaff/00-02.html#76
Sloman, A. (2002). Diagrams in the mind. In M. Anderson, B. Meyer, & P. Olivier (Eds.), Diagrammatic representation and reasoning (pp. 7–28). Berlin:
Springer-Verlag. Available from http://www.cs.bham.ac.uk/research/projects/cogaff/00-02.html#58
Sloman, A. (2005). Perception of structure: Anyone Interested? (Research Note No. COSY-PR-0507). Birmingham, UK. Available from
http://www.cs.bham.ac.uk/research/projects/cosy/papers/#pr0507
Sloman, A. (2007). What evolved first and develops first in children: Languages for communicating? or Languages for thinking? (Generalised Languages:
GLs) (Research Note No. COSY-PR-0702). Birmingham, UK. (Presentation given to the Birmingham Psychology department:
http://www.cs.bham.ac.uk/research/projects/cosy/papers/#pr0702)
Sloman, A. (2008a). Architectural and representational requirements for seeing processes, proto-affordances and affordances. In A. G. Cohn, D. C. Hogg,
R. Möller, & B. Neumann (Eds.), Logic and probability for scene interpretation. Dagstuhl, Germany: Schloss Dagstuhl – Leibniz-Zentrum für
Informatik. Available from http://drops.dagstuhl.de/opus/volltexte/2008/1656
Sloman, A. (2008b). Evolution of minds and languages. What evolved first and develops first in children: Languages for communicating, or languages for
thinking (Generalised Languages: GLs)? (Research Note No. COSY-PR-0702). Birmingham, UK. Available from
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#glang
Sloman, A. (2008c). A Multi-picture Challenge for Theories of Vision (Research Note No. COSY-PR-0801). Available from
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#talk88
Sloman, A. (2008d). A New Approach to Philosophy of Mathematics: Design a young explorer, able to discover “toddler theorems”. Available from
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#toddler
Sloman, A. (2009a). Architectural and representational requirements for seeing processes and affordances. In D. Heinke & E. Mavritsaki (Eds.),
Computational Modelling in Behavioural Neuroscience: Closing the gap between neurophysiology and behaviour. (pp. 303–331). London:
Psychology Press. (http://www.cs.bham.ac.uk/research/projects/cosy/papers#tr0801)
Sloman, A. (2009b). Ontologies for baby animals and robots. From “baby stuff” to the world of adult science: Developmental AI from a Kantian viewpoint.
University of Birmingham, School of Computer Science. Available from
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#brown (Online tutorial presentation)
Sloman, A. (2009c). Some Requirements for Human-like Robots: Why the recent over-emphasis on embodiment has held up progress. In B. Sendhoff,
E. Koerner, O. Sporns, H. Ritter, & K. Doya (Eds.), Creating Brain-like Intelligence (pp. 248–277). Berlin: Springer-Verlag. Available from
http://www.cs.bham.ac.uk/research/projects/cosy/papers/#tr0804
Sloman, A. (2010a). If Learning Maths Requires a Teacher, Where did the First Teachers Come From? In A. Pease, M. Guhe, & A. Smaill (Eds.), Proc. Int.
Symp. on Mathematical Practice and Cognition, AISB 2010 Convention (pp. 30–39). De Montfort University, Leicester. Available from
http://www.cs.bham.ac.uk/research/projects/cogaff/10.html#1001
Sloman, A. (2010b). Phenomenal and Access Consciousness and the “Hard” Problem: A View from the Designer Stance. Int. J. Of Machine Consciousness,
2(1), 117–169. Available from http://www.cs.bham.ac.uk/research/projects/cogaff/09.html#906
Sloman, A. (2011). What’s information, for an organism or intelligent machine? How can a machine or organism mean? In G. Dodig-Crnkovic & M. Burgin
(Eds.), Information and Computation (pp. 393–438). New Jersey: World Scientific. Available from
http://www.cs.bham.ac.uk/research/projects/cogaff/09.html#905
Sloman, A., & members of the CoSy project. (2006, April). Aiming for More Realistic Vision Systems (Research Note No. COSY-TR-0603; comments and
criticisms welcome). School of Computer Science, University of Birmingham. (http://www.cs.bham.ac.uk/research/projects/cosy/papers/#tr0603)
Tarsitano, M. (2006, December). Route selection by a jumping spider (Portia labiata) during the locomotory phase of a detour. Animal Behaviour, 72(6),
1437–1442. Available from http://dx.doi.org/10.1016/j.anbehav.2006.05.007
Trehub, A. (1991). The Cognitive Brain. Cambridge, MA: MIT Press. Available from http://people.umass.edu/trehub/