
GPU Computing for Cognitive Robotics


This presentation shows the impact of GPU computing on cognitive robotics through a series of novel experiments in action and language acquisition in humanoid robots and in computer vision. Cognitive robotics is concerned with endowing robots with high-level cognitive capabilities that enable them to achieve complex goals in complex environments. Reaching the ultimate goal of developing cognitive robots will require a tremendous amount of computational power, which until recently was provided mostly by standard CPUs. However, CPU cores are optimised for serial code execution at the expense of parallel execution, which makes them relatively inefficient for high-performance computing applications. The ever-increasing market demand for high-performance, real-time 3D graphics has evolved the GPU into a highly parallel, multithreaded, many-core processor with extraordinary computational power and very high memory bandwidth. These vast computational resources can now be exploited by most cognitive robotics models, as they tend to be inherently parallel. Many interesting and insightful cognitive models have been developed to address important scientific questions concerning action-language acquisition and computer vision. While they have provided important scientific insights, their complexity and scope have not improved much in recent years. The experimental tasks, as well as the scale of these models, are often minimised to avoid excessive training times, which grow exponentially with the number of neurons and the amount of training data. This impedes the development of the complex neurocontrollers needed to take cognitive robotics research a step closer to the ultimate goal of creating intelligent machines.
This presentation shows several cases where applying GPU computing to cognitive robotics algorithms enabled the development of large-scale neurocontrollers of previously unseen complexity, which made the novel experiments described herein possible.



  1. Martin Peniak, Davide Marocco, Angelo Cangelosi: GPU Computing for Cognitive Robotics. GPU Technology Conference, San Jose, California, 25 March 2014
  2. Acknowledgements. This study was financed by the EU Integrating Projects ITALK and Poeticon++ within the FP7 ICT programme Cognitive Systems and Robotics, and by the ARIADNA scheme of the European Space Agency. Thanks to my supervisors, Prof. Angelo Cangelosi, Dr. Davide Marocco and Prof. Tony Belpaeme, for their support, and to Calisa Cole and Chandra Cheij from NVIDIA for their help.
  3. New position at Cortexica (Imperial College London) • Leading provider of visual search and image recognition technology for mobile devices • Creators of a bio-inspired vision system enabling intelligent image recognition using principles derived from human sight
  4. Overview: action and language acquisition in humanoid robots • biologically-inspired active vision system • software development
  5. Action and Language Acquisition in Humanoid Robots
  6. Learning Actions. Humans are good at learning complex actions: constant repetition of movements leads to certain components being segmented as reusable elements, and these motor primitives are flexibly combined into novel sequences of actions. The human motor control system is known to implement motor primitives as low as the spinal cord, while high-level planning and execution take place in the primary motor cortex.
  7. Explicit hierarchical structure vs. multiple timescales
  8. [Image slide]
  9. Initial testing of two actions. Experimental setup: SOM and MTRNN trained on 2 sequences, each repeated 5x with different positions; extended version with up to 9 action sequences; left and right hand used individually; MTRNN input: head, torso and arms (41 DOF); update rate: 50 ms.
  10. Multiple Time-scales Recurrent Neural Network. Experiment on action-language grounding, step 1: the MTRNN receives proprioceptive, visual and linguistic input. [Training matrix: all combinations of Actions 1-3 and Objects 1-3 trained]
  11. Results: 20 trials conducted, and each reached the threshold error of 0.000005. [Plot of training error]
  12. Multiple Time-scales Recurrent Neural Network. Scaling up the experiment on action-language grounding. [Training matrix of Actions 1-N by Objects 1-N: all combinations trained except Action 5/Object 5, Action 6/Object 6 and Action N/Object N, which were left untrained]
  13. Multiple Time-scales Recurrent Neural Network: generalisation testing. Experimental setup: for each of the 9 objects, the SOM and MTRNN were trained on 9 sequences, each repeated 6x with different positions; a total of 478 sequences, each with 100 41-wide vectors; left and right hand used individually; MTRNN input: head, torso and arms (41 DOF); update rate: 50 ms.
  14. Self-organising maps: CPU vs. GPU performance
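The SOM training step being benchmarked can be sketched as follows. This is a minimal illustrative version, not the Aquila implementation: the 2-D grid, the Gaussian neighbourhood, and all names and parameter values are assumptions.

```python
import numpy as np

def som_step(weights, grid, x, lr=0.1, sigma=1.0):
    """One self-organising map update: find the best-matching unit (BMU),
    then pull every node towards the input, weighted by grid distance."""
    # weights: (n_nodes, dim) codebook vectors; grid: (n_nodes, 2) node coords
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
    d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)      # squared grid distance to BMU
    h = np.exp(-d2 / (2 * sigma ** 2))[:, None]       # Gaussian neighbourhood
    return weights + lr * h * (x - weights)

# toy usage: a 4x4 map trained on random 3-D inputs
rng = np.random.default_rng(0)
grid = np.array([[i, j] for i in range(4) for j in range(4)], dtype=float)
w = rng.random((16, 3))
for _ in range(100):
    w = som_step(w, grid, rng.random(3))
```

Every node's update is independent given the BMU, which is why the algorithm parallelises so well on a GPU: one thread per node.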
  15. Multiple Time-scales Recurrent Neural Network: CPU vs. GPU performance
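The MTRNN update being benchmarked can be sketched as a leaky-integrator network in which each neuron has its own time constant, following the Yamashita and Tani formulation commonly used for MTRNNs. The weights, sizes and time constants below are illustrative, not the values used in the experiments.

```python
import numpy as np

def mtrnn_step(u, W, tau):
    """One multiple-timescales update: each neuron i is a leaky integrator
    with its own time constant tau_i, so small-tau (fast) units track
    moment-to-moment detail while large-tau (slow) units integrate over
    longer horizons, yielding an implicit functional hierarchy."""
    y = np.tanh(u)                                    # firing rates
    return (1.0 - 1.0 / tau) * u + (1.0 / tau) * (W @ y)

# toy network: two fast, two medium and two slow units
rng = np.random.default_rng(1)
n = 6
W = 0.5 * rng.standard_normal((n, n))
tau = np.array([2.0, 2.0, 5.0, 5.0, 70.0, 70.0])
u = rng.standard_normal(n)
for _ in range(50):
    u = mtrnn_step(u, W, tau)
```

The matrix-vector product dominates the cost, so per-neuron activity calculation is the natural target for GPU parallelisation, as slide 29 notes.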
  16. Biologically-inspired Active Vision System
  17. Traditional Computer Vision. A specific template or computational representation is required to allow object recognition, and it must be flexible enough to account for all kinds of variations. "Teaching a computer to classify objects has proved much harder than was originally anticipated" ~ Thomas Serre, Center for Biological and Computational Learning at MIT
  18. Biological Vision. "Researchers have been interested for years in trying to copy biological vision systems, simply because they are so good" ~ David Hogg, computer vision expert at Leeds University, UK. Highly optimised over millions of years of evolution, biological systems have developed complex neural structures to represent and process stimuli. The superiority of biological vision systems is only partially understood; the hardware architecture and style of computation in nervous systems are fundamentally different.
  19. Biological Vision
  20. Seeing is a way of acting
  21. Active Vision. Inspired by the vision systems of natural organisms, which have been evolving for millions of years. In contrast to standard computer vision systems, biological organisms actively interact with the world in order to make sense of it. Humans, like other animals, do not look at a scene in fixed steadiness; instead, they actively explore interesting parts of the scene through rapid saccadic movements.
  22. Evolutionary Robotics Approach: Creating Active Vision Systems
  23. Evolutionary Robotics. A technique for the automatic creation of autonomous robots, inspired by the Darwinian principle of selective reproduction of the fittest. It views robots as autonomous artificial organisms that develop their own skills in close interaction with the environment and without human intervention. Drawing heavily on biology and ethology, it uses the tools of neural networks, genetic algorithms, dynamic systems and biomorphic engineering.
  24. Artificial neural networks (ANNs) are powerful brain-inspired computational models that have been used in many areas, including engineering, medicine and finance. Genetic algorithms (GAs) are adaptive heuristic search algorithms premised on the evolutionary ideas of natural selection and genetics; their basic concept is to simulate the processes necessary for evolution in natural systems. [Diagram: population (chromosomes) → evaluation (fitness) → selection (mating pool) → genetic operators]
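A minimal GA loop matching the selection scheme described later in the slides (best 20% reproduce, elitism, 10% mutation) might look like this. The real-valued genomes, the absence of crossover, and the toy fitness function are illustrative assumptions, not the experiment's actual encoding.

```python
import numpy as np

rng = np.random.default_rng(2)

def evolve(fitness_fn, genome_len=10, pop_size=100, generations=50,
           elite_frac=0.2, mut_prob=0.1):
    """Minimal GA: the best 20% of individuals create the new population,
    the single best genome is preserved unchanged (elitism), and each gene
    mutates with 10% probability."""
    pop = rng.random((pop_size, genome_len))
    for _ in range(generations):
        fit = np.array([fitness_fn(g) for g in pop])
        order = np.argsort(fit)[::-1]                 # best first
        elite = pop[order[: int(pop_size * elite_frac)]]
        # children: copies of random elite parents with per-gene mutation
        parents = elite[rng.integers(0, len(elite), pop_size - 1)]
        mask = rng.random(parents.shape) < mut_prob
        children = np.where(mask, rng.random(parents.shape), parents)
        pop = np.vstack([pop[order[0]], children])    # elitism: keep the best
    fit = np.array([fitness_fn(g) for g in pop])
    return pop[np.argmax(fit)]

# toy usage: maximise the sum of genes (optimum is all-ones)
best = evolve(lambda g: g.sum())
```

The fitness evaluations inside each generation are mutually independent, which is exactly where slide 29's "parallel execution of trials" applies.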
  25. Related Research: Mars rover obstacle avoidance (Peniak et al.)
  26. Method. Evolution of an active vision system for real-world object recognition: the system is trained in parallel on multiple objects viewed from many different angles and under different lighting conditions. The Amsterdam Library of Object Images (ALOI) provides a colour image collection of one thousand small objects recorded for scientific purposes, with systematically varied viewing angle, illumination angle and illumination colour. Active vision training: trained on a set of objects from the ALOI library; each genotype is evaluated over multiple trials with differently rotated objects under varying lighting conditions; evolutionary pressure is provided by a fitness function that evaluates the overall success or failure of object classification; trained on an increasingly large number of objects. Active vision testing: robustness and resiliency of recognition on the dataset; generalisation to previously unseen instances of the learned objects.
  27. Experimental Setup. Recurrent neural network: inputs of 8x8 neurons for the retina and 2 neurons for proprioception (x, y position); no hidden neurons; outputs of 5 object recognition neurons and 2 neurons to move the retina (16 px max). Genetic algorithm: 10,000 generations; 100 individuals; 36+16 trials (object rotations + varying lighting conditions); 10% mutation probability; reproduction by the best 20% of individuals creating the new population; elitism used (the best individual is preserved).
  28. Experimental Setup. Each individual (neural network) could freely move the retina and read input from the 128x128 source image for 20 steps. At each step, the neural network controlled the behaviour of the system (retina position) and provided a recognition output. The recognition output neuron with the highest activation was taken as the network's guess about the object's identity. Fitness = number of correct answers / number of total steps.
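The trial procedure just described can be sketched as follows. The `toy_policy` stands in for the evolved recurrent network and is purely illustrative; function and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def run_trial(policy, image, true_label, steps=20, patch=8):
    """One evaluation trial: for 20 steps the controller reads an 8x8
    retina patch from a 128x128 image, outputs a retina displacement and a
    class guess; fitness is the fraction of steps with a correct guess."""
    h, w = image.shape
    x = y = (h - patch) // 2                      # start at the centre
    correct = 0
    for _ in range(steps):
        retina = image[y:y + patch, x:x + patch]  # 8x8 input window
        dx, dy, guess = policy(retina, x, y)
        x = int(np.clip(x + dx, 0, w - patch))    # keep retina inside image
        y = int(np.clip(y + dy, 0, h - patch))
        correct += (guess == true_label)
    return correct / steps                        # fitness in [0, 1]

# toy policy: random saccades of at most 16 px, guess from patch brightness
def toy_policy(retina, x, y):
    guess = int(retina.mean() * 5) % 5
    return rng.integers(-16, 17), rng.integers(-16, 17), guess

fitness = run_trial(toy_policy, rng.random((128, 128)), true_label=2)
```

This also makes concrete why fitness cannot reach 1.0: guesses made in the first few steps, before the retina has sampled informative regions, count against the score.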
  29. GPU-accelerating the GA and ANN. GPUs were used to accelerate the evolutionary process (parallel execution of trials) and the neural network (parallel calculation of neural activities).
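The "parallel calculation of neural activities" across a whole population can be sketched on the CPU with batched matrix operations; conceptually, this is what a GPU kernel does with one thread per (individual, neuron) pair. The sizes follow the slide-27 network (64 retina + 2 proprioception inputs, 5 + 2 outputs); everything else is illustrative.

```python
import numpy as np

def batched_forward(W, X):
    """Evaluate every individual's network on its input in one batched
    matrix multiply -- the CPU analogue of a GPU launch over the whole
    population.
    W: (pop, out, in) per-individual weight matrices
    X: (pop, in)      per-individual inputs
    returns (pop, out) activations."""
    return np.tanh(np.einsum('poi,pi->po', W, X))

rng = np.random.default_rng(4)
pop, n_in, n_out = 100, 66, 7         # 64 retina + 2 proprioception -> 5 + 2
W = 0.1 * rng.standard_normal((pop, n_out, n_in))
X = rng.random((pop, n_in))
A = batched_forward(W, X)             # all 100 networks evaluated at once
```

Because the trials are independent, batching them removes the per-individual loop entirely, which is where most of the reported speed-up comes from.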
  30. Results. Fitness cannot reach 1.0, since it takes a few time-steps to recognise an object. All objects are correctly classified at the end of each test. [Plot: best and average fitness over 10,000 generations]
  31. Evolved Behavior
  32. Software development
  33. Heterogeneous computing. The application code spans CPU (host) and GPU (device): the GPU is used to parallelise compute-intensive functions, while the rest of the sequential code runs on the CPU.
  34. What is Aquila? A heterogeneous software architecture for the development of modules loosely coupled to their graphical user interfaces. It provides a simple and user-friendly GUI client to distribute, control and visualise existing modules, generate new modules, and monitor connected servers. Modules run the heterogeneous CPU-GPU code doing the actual work.
  35. What is Aquila? Developed in C++ and CUDA; cross-platform (Linux, OS X, Windows); dependencies: Qt, YARP, CUDA.
  36. [Architecture diagram: an Aquila module consists of a main thread (moduleName.cpp) running CPU code and GPU kernels, plus a YARP interface (moduleNameInterface.cpp); the Aquila GUI hosts each module's GUI tab (moduleName.ui / modulename.cpp) and settings dialog (moduleNameSettings.ui / moduleNameSettings.cpp) with its own YARP interface; modules, the GUI and other modules communicate via YARP messages.]
  37. Existing Aquila Ecosystem. Modules: SOM (self-organising map), ERA (Epigenetic Robotics Architecture), Tracker (object tracking), MTRNN (Multiple Time-scales Recurrent Neural Network), ESN (echo state networks). [MTRNN benchmark example: speed-up of 2x GTX 580 (P2P) vs. an 8-core Intel Xeon, plotted for 264, 1032, 2056 and 4104 neurons]
  38. Thank you! "Imagination is the highest form of research" ~ Albert Einstein