5. Main scientific fields
Environment perception, Explainability, Hybrid reasoning, Distributed reasoning
Data/graph mining, knowledge engineering, machine learning, computer vision, robotics, multi-agent systems, logic programming, meta-heuristic or bio-inspired optimization, UX, virtual reality, natural language processing (NLP), software engineering, simulation, data visualization
6. Cross-cutting research axes
01 Veracity - Value: Perception and qualification of the veracity and value of knowledge in massive intelligent environments
02 Prescriptive simulation: Recommendation and prescriptive simulation for distributed or complex systems
03 Distributed reasoning: Distributed behaviors and reasoning in complex systems (e.g. cyber-physical systems)
10. An autonomous driving system consists of three main parts:
• Perception
• Planning
• Control
and each part includes different tasks that the system is expected to fully understand.
AUTONOMOUS VEHICLES
11. Sensors and Perception Models
Perception relies heavily on an extensive infrastructure of active and passive sensors.
PERCEPTION
13. PLANNING
Future trajectories are generated based on the results coming from the perception part.
An appropriate ego trajectory is chosen, and the driving behavior is created and planned.
14. CONTROL
• The control part is deeply coupled with the perception and planning parts.
• It guarantees that the vehicle follows the course set by the planning part and
controls the vehicle’s hardware (acceleration, braking, and steering using drivers
and actuators) for safe driving.
16. ENVIRONMENT PERCEPTION
The principal goal is:
• Designing computer systems that possess the ability to capture, understand, and interpret important visual information contained within images, videos, and other visual data.
• Then translate this data, using contextual
knowledge provided by human beings, into
insights used to drive decision making.
Computer Vision
Technology
17. ENVIRONMENT PERCEPTION
Training computer vision systems involves a learning process that goes all the way down to the smallest granular unit of visual data – the PIXEL.
• The system records and evaluates digital images on the basis of their raw data alone, where minute differences in pixel density, color saturation, and levels of lightness and darkness determine the structure, and therefore the identity, of the larger object.
Computer Vision
Technology
18. ENVIRONMENT PERCEPTION
Computer vision is one of the most remarkable technologies to come out of the deep learning and artificial intelligence world. The advancements that deep learning has contributed to computer vision have truly set the field apart.
Computer Vision
Technology
19. ENVIRONMENT PERCEPTION
Combining sensor & vision technologies leads to
Environment Perception → Scene Understanding
Scene Understanding: Analysis of a scene, considering the semantic and geometric
context of its contents and the internal relations between them.
Humans can classify, locate, segment, and identify objects and features at a glance.
20. ENVIRONMENT PERCEPTION
Humans can classify (object type and moving/static status), specify (spatial position), identify (motion, position, direction, and velocity), and track these objects in the driving scene.
In addition, humans focus their visual attention on important or purposeful elements and ignore unnecessary ones in their field of view.
Conferring these phenomenal abilities on machine-learning systems has been a long-standing goal in the field of computer vision.
21. ENVIRONMENT PERCEPTION
Numerous approaches and methods (classic algorithms and deep learning) have been
proposed to improve scene understanding and extract semantic information about the driving
environment from images and videos.
22. ENVIRONMENT PERCEPTION – Visual Attention
The ability to sense and perceive the driving environment is a key technology for ADAS and autonomous driving.
Humans' visual attention helps in:
• Predicting or locating potential risks
• Understanding the driving environment
• Quickly locating objects of interest
23. • “Visual attention” is the selection of relevant information and the filtering out of irrelevant information from cluttered visual scenes.
• Attention can operate on regions of space, on particular features of an object, or on entire objects.
• Attention can also be directed either overtly or covertly.
ENVIRONMENT PERCEPTION – Visual Attention
24. How should a machine learning system or an autonomous vehicle acquire this ability to direct attention for safe driving?
• To tackle this:
– Incorporate a saliency mechanism (salient object detection model) as the visual attention model.
– Saliency refers to unique features (pixels, resolution, etc.) of an image in the context of visual processing. These unique features depict the visually alluring locations in an image; a saliency map is a topographical representation of them (see the sketch below).
– Salient object detection models mimic the behaviour of human beings and capture the most salient region/object from images or scenes.
ENVIRONMENT PERCEPTION – Visual Attention
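As a quick illustration of what a saliency map is (not the model proposed later in this deck), the sketch below computes a bottom-up saliency map with OpenCV's spectral-residual detector. It assumes the opencv-contrib-python package is installed, and the image file name is a placeholder.

```python
import cv2

# Minimal sketch: spectral-residual saliency map of a driving scene.
# Requires opencv-contrib-python (the cv2.saliency module).
image = cv2.imread("driving_scene.jpg")                 # placeholder input image
saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
ok, saliency_map = saliency.computeSaliency(image)      # float map in [0, 1]
if ok:
    gray = (saliency_map * 255).astype("uint8")
    # Otsu threshold keeps only the most salient regions
    _, salient_regions = cv2.threshold(gray, 0, 255,
                                       cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    cv2.imwrite("saliency_map.png", gray)
    cv2.imwrite("salient_regions.png", salient_regions)
```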
25. The concept of the saliency map was first proposed by Christof Koch and Shimon Ullman in 1985.
Feature Integration Theory (Treisman and Gelade, 1980) defines human visual search strategies:
“The salient areas in the visual scene are identified by combining or relating visual feature information such as color, orientation, spatial frequency, brightness, and direction of movement, which direct human attention.”
The visual attention methods using saliency are divided into two categories:
– Bottom-up: biologically inspired methods; image color and intensity are common examples (see the sketch below).
– Top-down: true computational methods; prior knowledge, memories, and goals are common factors.
ENVIRONMENT PERCEPTION – Visual Attention
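To make the bottom-up category concrete, here is a minimal Feature-Integration-style sketch that combines intensity and color-contrast maps into a single saliency map. The channel choices and Gaussian scales are illustrative assumptions, not any of the algorithms reviewed on the next slide.

```python
import cv2
import numpy as np

def bottom_up_saliency(bgr_image):
    """Very simplified bottom-up saliency: combine intensity and
    color-opponency contrast maps (loosely inspired by Feature Integration Theory)."""
    img = bgr_image.astype(np.float32) / 255.0
    b, g, r = cv2.split(img)
    intensity = (r + g + b) / 3.0
    rg = np.abs(r - g)                       # red-green opponency
    by = np.abs(b - (r + g) / 2.0)           # blue-yellow opponency

    def contrast(channel):
        # Center-surround contrast approximated by a difference of Gaussians
        center = cv2.GaussianBlur(channel, (0, 0), sigmaX=2)
        surround = cv2.GaussianBlur(channel, (0, 0), sigmaX=16)
        return np.abs(center - surround)

    maps = [contrast(intensity), contrast(rg), contrast(by)]
    # Normalize each conspicuity map to [0, 1] and average them
    maps = [(m - m.min()) / (m.max() - m.min() + 1e-8) for m in maps]
    return np.mean(maps, axis=0)
```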
26. • Saliency Algorithms
We review and test some well-known saliency algorithms, both classic and deep-learning based.
Fig: Different saliency algorithms tested on driving scene images
ENVIRONMENT PERCEPTION – Visual Attention
27. Existing works
– Human eye tracking in the process; adding object detection
– In-lab simulation; real driving
– Berkeley DeepDrive Attention, Dr(eye)VE, SAGE
ENVIRONMENT PERCEPTION – Visual Attention
28. The research shows that these models have contributed a lot to the deployment of attention and have made significant progress.
Some limitations and drawbacks:
– Complexity of capturing the true driver attention
– Fixations are subject to various characteristics of the driver: driving experience and habits, preferences and intentions, capabilities, culture, age, gender, etc.
– An eye tracker records a single gaze location at each moment, while the driver may be looking at multiple important objects in the scene.
– The computational cost of developing a dataset with saliency maps has been relatively high.
ENVIRONMENT PERCEPTION – Visual Attention
29. Our Aim: Develop a visual attention framework capable of predicting the important
objects (road context) simultaneously in a driving scene.
ENVIRONMENT PERCEPTION – Visual Attention
We came up with a new idea by shifting the problem from PREDICTION (what/where the driver is looking, or what most drivers would look at) to SELECTION (what the driver should/must look at while driving).
30. ENVIRONMENT PERCEPTION – Visual Attention
Saliency Heat-Map as Visual Attention for Autonomous Driving
using Generative Adversarial Network (GAN)
31. Our Approach
Generative Adversarial Network (GAN), introduced by Ian Goodfellow in 2014
– A GAN is a type of neural network architecture for generative modeling. It involves using a model to generate new examples that plausibly come from an existing distribution of samples, such as generating new photographs that are similar to, but specifically different from, a dataset of existing photographs.
Applications, to name a few:
Image-to-image translation, face frontal view generation, photos to emojis, photo inpainting
ENVIRONMENT PERCEPTION – Visual Attention
32. Framework
• We borrow the pix2pix GAN architecture, which is well suited to the image-to-image translation task and can be conditioned on the input image to generate the corresponding output image (its objective is sketched below).
ENVIRONMENT PERCEPTION – Visual Attention
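For readers unfamiliar with pix2pix, the sketch below shows the standard conditional-GAN objective it optimizes (an adversarial loss plus a weighted L1 term). This is a generic PyTorch illustration, not the authors' training code; G, D, and lambda_l1 are assumed placeholders for a U-Net generator, a PatchGAN discriminator, and the L1 weight.

```python
import torch
import torch.nn.functional as F

def pix2pix_losses(G, D, scene, target_heatmap, lambda_l1=100.0):
    """One training step's losses for a pix2pix-style conditional GAN.
    `scene` is the input driving image and `target_heatmap` the ground-truth
    saliency heat-map; both are (N, C, H, W) tensors."""
    fake = G(scene)

    # Discriminator: real (scene, target) pairs vs. fake (scene, G(scene)) pairs
    d_real = D(torch.cat([scene, target_heatmap], dim=1))
    d_fake = D(torch.cat([scene, fake.detach()], dim=1))
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))) / 2

    # Generator: fool the discriminator and stay close to the target in L1
    d_fake_for_g = D(torch.cat([scene, fake], dim=1))
    g_adv = F.binary_cross_entropy_with_logits(d_fake_for_g,
                                               torch.ones_like(d_fake_for_g))
    g_l1 = F.l1_loss(fake, target_heatmap)
    g_loss = g_adv + lambda_l1 * g_l1
    return d_loss, g_loss
```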
33. Data Collection
– The data used nowadays for developing saliency models are collected from human eye fixations or gaze. These data are turned into saliency maps (gray-scale or heat-map images) using a Gaussian probability function, which expresses the probability that each image pixel captures human attention (see the sketch below).
Examples of fixation selection – prediction points
(MIT dataset)
ENVIRONMENT PERCEPTION – Visual Attention
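The slide describes how fixation points are turned into a gray-scale saliency map with a Gaussian; a minimal sketch of that step is shown below. The smoothing scale sigma is an assumed parameter, not a value from the datasets mentioned.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixations_to_saliency_map(fixations, height, width, sigma=25.0):
    """Turn recorded eye-fixation points into a gray-scale saliency map by
    placing a Gaussian at each fixation. `fixations` is a list of (row, col)
    pixel coordinates; sigma (in pixels) is an assumed smoothing scale."""
    fixation_map = np.zeros((height, width), dtype=np.float32)
    for r, c in fixations:
        fixation_map[int(r), int(c)] += 1.0
    saliency = gaussian_filter(fixation_map, sigma=sigma)
    if saliency.max() > 0:
        saliency /= saliency.max()          # normalize to [0, 1]
    return saliency
```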
34. Data Collection
– We propose a different approach for data collection by taking advantage of
semantic label information from driving scenes datasets.
ENVIRONMENT PERCEPTION – Visual Attention
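The exact procedure is not detailed on this slide; as a hypothetical illustration of how semantic labels could yield saliency-style ground truth, the sketch below keeps only road-user classes from a per-pixel label map and blurs the binary mask into a heat-map. The class IDs are placeholders, not the actual dataset's IDs.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Hypothetical road-user class IDs in a semantic label map (placeholder values)
ROAD_USER_IDS = {11, 12, 13, 14}   # e.g. pedestrian, rider, car, truck

def labels_to_saliency_map(label_map, sigma=15.0):
    """Build a saliency-style heat-map from a per-pixel semantic label image
    by selecting road-user classes and smoothing the resulting binary mask."""
    mask = np.isin(label_map, list(ROAD_USER_IDS)).astype(np.float32)
    heatmap = gaussian_filter(mask, sigma=sigma)
    if heatmap.max() > 0:
        heatmap /= heatmap.max()            # normalize to [0, 1]
    return heatmap
```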
43. Limitations and Future Work:
• Not all of these objects demand attention all the time
• False regions are detected as salient
• False detections are triggered by direct sunlight or reflections of light
Incorporating depth, location, and motion information could help assign priorities among the detected objects.
ENVIRONMENT PERCEPTION – Visual Attention
44. ENVIRONMENT PERCEPTION
• Based on our research into visual attention, we have concluded that in addition to visual
attention, we require the incorporation of depth, location, and motion information, which
could aid in the assignment of priorities among the detected objects.
• It is critical to have a thorough understanding of each of the surrounding elements
(including the motion and geometry information).
• Road-users are critical for perception, planning and decision-making for both self-driving
cars and driver assistance systems.
• Some road-users, however, are more important for decision making than others because of their respective intentions, the ego-vehicle's intention, and their effects on each other.
46. Aim: To propose a framework that can extract motion and geometry related information (i.e.,
object class, status, position, movement, speed and distance information) to identify object
(road-user) characteristics in the urban driving scenario.
Using these and other semantic cues, we may be able to determine the most important (prior)
objects in a given scene while driving.
ENVIRONMENT PERCEPTION – Object Identification
47. Motion and Geometry-related Information Fusion through
a Framework for Object Identification from a moving
camera in Urban Driving Scenarios
ENVIRONMENT PERCEPTION – Object Identification
48. We have reviewed the contributions of works most related to ours,
– i.e., scene understanding for driving by combining motion and geometry-related information.
SMS-Net
ENVIRONMENT PERCEPTION – Object Identification
50. • The framework's components include:
– Disparity Estimation through the Semi-Global Matching
– Motion Estimation by Image Registration and Optical Flow
– Moving Object Detection (MOD)
– Information Extraction and Fusion
DISPARITY ESTIMATION
– We adopt the well-known Semi-Global Matching (SGM) algorithm (a sketch with one common implementation follows).
ENVIRONMENT PERCEPTION – Object Identification
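One widely available SGM-style implementation is OpenCV's StereoSGBM; the sketch below shows how a disparity map could be computed with it. The parameter values and file names are illustrative assumptions, not the framework's actual settings.

```python
import cv2

# Minimal sketch of semi-global (block) matching with OpenCV's StereoSGBM.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)     # placeholder stereo pair
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,          # must be a multiple of 16
    blockSize=5,
    P1=8 * 5 * 5,                # smoothness penalty for small disparity changes
    P2=32 * 5 * 5,               # smoothness penalty for large disparity changes
    uniquenessRatio=10,
    speckleWindowSize=100,
    speckleRange=2,
)
# compute() returns fixed-point disparities scaled by 16
disparity = sgbm.compute(left, right).astype("float32") / 16.0
```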
51. ENVIRONMENT PERCEPTION – Object Identification
MOTION ESTIMATION
– One of the most widely used methods is optical flow (OF) estimation.
– OF provides satisfactory results when the camera is fixed or carefully displaced.
– However, the optical flow from image sequences acquired by a moving camera encodes two
pieces of information.
• The motion of the surrounding objects
• The ego vehicle's motion
This results in significant motion vectors associated with static objects, leading to static objects being incorrectly perceived as moving objects.
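As a concrete example of dense optical flow estimation (here with OpenCV's Farnebäck method, one common choice), the sketch below computes per-pixel motion between two frames; with a moving camera, the resulting vectors mix object motion and ego-motion, as described above. File names and parameter values are placeholders.

```python
import cv2

# Minimal sketch of dense optical flow between two consecutive frames.
prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)   # placeholder file names
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

flow = cv2.calcOpticalFlowFarneback(
    prev, curr, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# flow[..., 0] and flow[..., 1] hold per-pixel horizontal and vertical motion.
# With a moving camera, static background also shows large vectors (ego-motion).
magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
```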
52. MOTION ESTIMATION
Approach to motion compensation:
– Inspired by recent trends in aerial and medical imaging, we suggest using a procedure called image registration together with the optical flow method to overcome ego-motion and obtain a true estimate of the motion information (sketched below).
IMAGE REGISTRATION
Fig: Optical flow with and without image registration
ENVIRONMENT PERCEPTION – Object Identification
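A minimal sketch of this idea, assuming OpenCV's ECC registration as one possible registration method (the slide does not specify which algorithm is used): the previous frame is warped onto the current one before optical flow is computed, so most of the background motion induced by the ego-vehicle cancels out.

```python
import cv2
import numpy as np

prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)   # placeholder file names
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

# Estimate a homography that registers the previous frame to the current one
warp = np.eye(3, 3, dtype=np.float32)
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 200, 1e-6)
_, warp = cv2.findTransformECC(curr, prev, warp,
                               cv2.MOTION_HOMOGRAPHY, criteria, None, 5)
prev_registered = cv2.warpPerspective(
    prev, warp, (curr.shape[1], curr.shape[0]),
    flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)

# Residual flow now mostly reflects independently moving objects
flow = cv2.calcOpticalFlowFarneback(prev_registered, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
```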
54. MOVING OBJECT DETECTION (MOD)
We followed a straightforward approach for moving object detection:
• Detect the objects of interest first
• Then identify the moving ones among the objects detected in two consecutive frames
OBJECT DETECTION: classification, localization, and segmentation of all the objects in the scene (current and previous frames).
IDENTIFY MOVING OBJECTS: recognition of the moving objects among the detected objects in the scene.
ENVIRONMENT PERCEPTION – Object Identification
55. MOVING OBJECT DETECTION (MOD)
Fig: Proposed MOD architecture – a segmentation network produces the segmentation mask, bounding boxes, and class information; an encoder-decoder network produces the moving-objects mask; the image frame is then superimposed with the corresponding moving object mask, and the Bbox and class information are integrated.
The proposed architecture incorporates object segmentation and binary pixel classification based on temporal information.
The object segmentation part is a pre-trained CenterMask-Lite segmentation network inference, which gives the bounding boxes, category probabilities, and segmentation mask for each object of interest.
The temporal processing part is an encoder-decoder network (EDNet) that identifies the moving objects using the segmented masks of consecutive frames (a minimal sketch follows).
ENVIRONMENT PERCEPTION – Object Identification
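The exact EDNet architecture is not given here; the PyTorch sketch below is a minimal encoder-decoder of the same flavor, taking the segmentation masks of two consecutive frames as a 2-channel input and predicting per-pixel moving/static logits. Layer counts and widths are assumptions.

```python
import torch
import torch.nn as nn

class EDNetSketch(nn.Module):
    """Minimal encoder-decoder sketch (not the authors' exact EDNet):
    input is the segmentation masks of two consecutive frames stacked as a
    2-channel tensor; output is a 1-channel logit map for moving vs. static."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),   # moving/static logits
        )

    def forward(self, prev_mask, curr_mask):
        x = torch.cat([prev_mask, curr_mask], dim=1)   # (N, 2, H, W)
        return self.decoder(self.encoder(x))

# Usage: per-pixel moving-object probability from two consecutive masks
# prob = torch.sigmoid(EDNetSketch()(prev_mask, curr_mask))
```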
56. MOVING OBJECT DETECTION (MOD)
The proposed model labels the moving objects (in white) and the static objects/background (in black) from pairs of sequential images.
ENVIRONMENT PERCEPTION – Object Identification
57. PROPOSED MOD DATASET
We developed a large dataset for moving object detection (from the KITTI and EU Long-term datasets) covering all the dynamic objects, such as all types of vehicles (bus, train, truck), pedestrians, cyclists, and motorcyclists.
ENVIRONMENT PERCEPTION – Object Identification
61. FUSION OF MOD, FCOF (FULLY COMPENSATED OPTICAL FLOW), AND DISPARITY
The results of each stage of the proposed framework, such as disparity, moving object
detection, and motion estimation, are fused to extract information such as object ID, static
or moving, distance, direction, position, and velocity.
– Labelling and Scaling
ENVIRONMENT PERCEPTION – Object Identification
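To illustrate the kind of information that can be extracted at this fusion stage, the sketch below derives an object's distance from disparity via the standard stereo relation Z = f·B / d, and a rough speed estimate from its optical-flow displacement. The focal length, baseline, and frame rate are assumed calibration values; this is not the authors' exact fusion procedure.

```python
import numpy as np

def object_distance_m(disparity_map, object_mask, focal_px, baseline_m):
    """Median distance (meters) of one detected object from the ego camera,
    using the standard stereo relation Z = f * B / d."""
    d = disparity_map[object_mask > 0]
    d = d[d > 0]                                   # keep valid disparities only
    return focal_px * baseline_m / np.median(d)

def object_speed_mps(flow, object_mask, depth_m, focal_px, fps):
    """Rough speed estimate: convert the object's median pixel displacement
    per frame into meters per second at its estimated depth (pinhole model)."""
    u = np.median(flow[..., 0][object_mask > 0])   # horizontal flow (px/frame)
    v = np.median(flow[..., 1][object_mask > 0])   # vertical flow (px/frame)
    px_per_frame = np.hypot(u, v)
    meters_per_px = depth_m / focal_px
    return px_per_frame * meters_per_px * fps
```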
70. CONCLUSION
• Proposal of a new framework for object identification (FOI) from a moving camera in complex urban
driving environment.
• The framework relies only upon the images captured from the stereo camera.
• It extracts the information related to the object, including class, status (moving/static), direction, velocity,
position, and distance from the ego vehicle.
• Other contributions relate to Moving Object Detection (MOD), as it is considered the critical task:
– A new dataset for moving object detection is built from the existing driving datasets KITTI and EU Long-term, covering dynamic objects like all types of vehicles, pedestrians, bicyclists, and motorcyclists.
– Propose to use image registration as a tool for ego-motion compensation for urban driving scenarios.
– A new model for moving object detection is developed by integrating an encoder-decoder network
with a segmentation model.
ENVIRONMENT PERCEPTION – Object Identification
71. LIMITATIONS: Through the experiments, failure cases were found when:
– Objects overlap or are very close to each other
– Object reflections occur
– An object's motion speed is the same as the ego-vehicle's speed
ENVIRONMENT PERCEPTION – Object Identification
72. FUTURE WORK
– More attention should be paid to improving the overall speed of the proposed framework.
– Build a new dataset of Fully Compensated Optical Flow color maps.
– Future studies will be devoted to developing a framework that can prioritize objects in a driving scene according to the situation.
– We will use the perception data (object identification) to plan secure and smooth trajectories for the objects of interest, using their dynamic limits, navigation comfort and safety, and the traffic rules.
– Lane detection, traffic sign detection, and live traffic light detection could be integrated into the framework, which would help the system in various tasks such as lane changes, obstacle avoidance, and combined maneuvers in critical driving situations.
ENVIRONMENT PERCEPTION – Object Identification
73. Road-user importance estimation during a left turn maneuver
• In real-world driving, there can be a variety of items in the immediate neighborhood of the
ego-vehicle at any given time.
• Some items have a direct impact on the behavior of the ego-vehicle (e.g., brake, steer),
while others have the potential to be a danger, and still others do not pose a danger
currently or soon.
74. • The ability to determine how important or relevant a given object is to the ego-vehicle's decision making is critical for both driver assistance systems and self-driving vehicles.
Establishing trust with human drivers or passengers
Demonstrating transparency to law enforcement
Promoting a human-centric thought process etc
Road-user importance estimation during a left turn maneuver