2. Computer Vision
Image understanding. This is the science of acquiring, processing, analyzing,
and understanding images and videos from the real world, using computational
methods to produce numerical or symbolic information in the form of decisions.
The ultimate goal is to model, replicate, and exceed human vision using computer
software and hardware at different levels.
4. Neural Networks - used in machine learning
Inspired by biological neural networks such as the central nervous system.
A machine learning method: systems learn from data.
Artificial neural networks are simplified mathematical models of the way biological
brains work. This allows the machine to learn to “think” in a way loosely similar to
humans, making it capable of recognizing speech, objects, and people's emotions or
moods, and so of making decisions more like humans do.
5. Emotional Intelligence
Emotional intelligence is the ability to recognize, express,
and have emotions, harness them for constructive
purposes, and skillfully handle the emotions of others.
Emotions play a critical role in rational and intelligent
behavior. Emotions are difficult to encode in a computer
program. It is important for computers to recognize emotions
in order to provide better services.
7. Emotion and Mood Detection
Teaching the computer to identify and adjust to human emotions.
The approach for teaching the computer to detect human emotion is through
the use of egocentric vision.
Use first person video cameras to get a first person view/perspective of a
situation.
Research needs more authentic, unscripted, and candid data to train the
computers.
8. Step 1 - Gather Data
The first thing to do is to gather data using
Ion first-person mini cameras. You will
be our data throughout the year!
9. Step 2 - Face detection
Feed the data (video) into a computer algorithm for face
detection.
The computer distinguishes faces from non-faces in an image or
a video using high-dimensional vector patches.
10. Vector Patches
The computer scans an image: the image is broken
into a grid, and each grid cell is written as a
high-dimensional vector patch.
The computer decides whether each patch vector is a
facial feature or a non-facial feature.
A linear model (linear regression) is used to
separate the two classes.
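The grid-and-patch idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4x4 toy image, the function names `image_to_patches` and `linear_score`, and the weights and bias are all hypothetical. In a real system the weights would be fitted on labeled face and non-face patches, and the patches would be far larger.

```python
def image_to_patches(image, patch_size):
    """Split a 2-D grayscale image (a list of rows) into non-overlapping
    patch_size x patch_size grid cells, each flattened into one vector."""
    rows, cols = len(image), len(image[0])
    patches = []
    for top in range(0, rows - patch_size + 1, patch_size):
        for left in range(0, cols - patch_size + 1, patch_size):
            vec = [image[r][c]
                   for r in range(top, top + patch_size)
                   for c in range(left, left + patch_size)]
            patches.append(vec)
    return patches


def linear_score(patch, weights, bias):
    """Linear decision function: a positive score means 'facial feature'."""
    return sum(w * x for w, x in zip(weights, patch)) + bias


# Toy 4x4 image with one bright 2x2 region standing in for a facial feature.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
patches = image_to_patches(image, patch_size=2)

# Hypothetical weights and bias, chosen by hand for this toy example only.
weights, bias = [1.0, 1.0, 1.0, 1.0], -10.0
labels = ["face" if linear_score(p, weights, bias) > 0 else "non-face"
          for p in patches]
print(labels)
```

Each patch here is only 4-dimensional; a 16x16 patch would already give a 256-dimensional vector, which is where "high-dimensional" comes from.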
11. Step 3 - Face recognition
Once the image is cropped, a feature extraction method is used to form feature
vectors. Intensity is the simplest extraction method.
The feature vector is then passed through Principal Component Analysis
(PCA) to reduce it to a two-dimensional vector, which is more tractable.
The covariance matrix used in the PCA is built from over a thousand faces from
a training database. One possible data set is Labeled Faces in
the Wild (LFW), which contains over 13,000 images collected from the web.
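As a rough sketch of the PCA step, the following pure-Python example builds a covariance matrix from a handful of toy intensity vectors and projects each one onto the top two principal components. Everything here is an assumption made for illustration: the four 3-pixel "images", the function names, and the use of power iteration with deflation to find the eigenvectors. A real pipeline would use thousands of face images and a numerical library.

```python
import math


def mean_vector(data):
    n = len(data)
    return [sum(row[j] for row in data) / n for j in range(len(data[0]))]


def covariance_matrix(data, mean):
    """Sample covariance matrix of the mean-centered data."""
    n, d = len(data), len(mean)
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def top_eigenpair(matrix, iterations=500):
    """Power iteration: repeatedly apply the matrix and renormalize.
    Assumes the start vector is not orthogonal to the top eigenvector."""
    d = len(matrix)
    vec = [float(i + 1) for i in range(d)]
    start_norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / start_norm for x in vec]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    value = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return value, vec


def principal_components(cov, k=2):
    """Top-k eigenvectors of a covariance matrix, found one at a time."""
    work = [row[:] for row in cov]
    components = []
    for _ in range(k):
        value, vec = top_eigenpair(work)
        components.append(vec)
        # Deflation: remove the component just found before the next pass.
        d = len(vec)
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(d)]
                for i in range(d)]
    return components


def project(x, mean, components):
    """Coordinates of x in the reduced (here two-dimensional) space."""
    centered = [a - m for a, m in zip(x, mean)]
    return [sum(c * v for c, v in zip(comp, centered)) for comp in components]


# Toy "training database": four 3-pixel intensity vectors (made-up values).
faces = [[2.0, 2.0, 1.0], [4.0, 4.0, 1.0], [6.0, 6.0, 2.0], [8.0, 8.0, 2.0]]
mean = mean_vector(faces)
cov = covariance_matrix(faces, mean)
components = principal_components(cov, k=2)
reduced = [project(f, mean, components) for f in faces]
print(reduced)
```

The first coordinate of each reduced vector captures most of the variation in the toy data; the second captures what is left.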
14. Cosine Similarity (one method)
Cosine Similarity Metric Learning (CSML) then transforms to
CSML is applied to each type of feature and produces a similarity score for each.
The scores are then passed to a Support Vector Machine (SVM) for
verification.
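A minimal sketch of a cosine similarity score is shown below. It assumes the commonly described form of CSML, in which both feature vectors are first multiplied by a learned transformation matrix and the cosine of the angle between the results is the score. The matrix `A` here is a hand-written placeholder (the identity), not a learned one, and the function names are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between x and y: 1 = same direction, 0 = orthogonal."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def transform(matrix, x):
    """Apply a linear transformation (list of rows) to a vector."""
    return [sum(m * a for m, a in zip(row, x)) for row in matrix]


def csml_score(matrix, x, y):
    """Cosine similarity measured after transforming both vectors."""
    return cosine_similarity(transform(matrix, x), transform(matrix, y))


# Placeholder transformation; in CSML this matrix would be learned from data.
A = [[1.0, 0.0],
     [0.0, 1.0]]

same_direction = csml_score(A, [1.0, 2.0], [2.0, 4.0])
orthogonal = csml_score(A, [1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

Scores near 1 suggest the two feature vectors describe the same face; scores near 0 suggest they do not. In the pipeline above, such scores would be the inputs to the SVM.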
15. Laplacian Embedding Process (another method)
Static (non-moving) facial expression features are taken from a photo (stored
in a database).
The geometrical relations among facial feature points
are assessed. (Basically, an emotion causes
facial deformation that can be measured in terms of the
angles or distances between specific facial feature points.)
Angles are separated into two groups, belonging to the
upper part and the lower part of the face. The
lower-part angles are involved in expressing joy, sorrow, or
fear. The upper-part angles are involved in expressing anger
or fear. These angles build a six-dimensional feature vector
expressed:
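As an illustration of an angle-based feature vector (not the formula from the original slide), the sketch below measures six angles between facial feature points. The landmark names, their coordinates, and the choice of which three points define each angle are all hypothetical; the source does not list the actual points.

```python
import math


def angle_at(vertex, p1, p2):
    """Angle in degrees at `vertex` between the rays toward p1 and p2."""
    v1 = (p1[0] - vertex[0], p1[1] - vertex[1])
    v2 = (p2[0] - vertex[0], p2[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))


# Hypothetical (x, y) pixel coordinates of facial feature points.
landmarks = {
    "left_brow": (30, 40), "right_brow": (70, 40), "nose_bridge": (50, 45),
    "left_eye": (35, 50), "right_eye": (65, 50),
    "mouth_left": (38, 80), "mouth_right": (62, 80),
    "upper_lip": (50, 76), "lower_lip": (50, 86),
}

# Hypothetical (vertex, point, point) triples: upper-face angles first,
# then lower-face angles.
triples = [
    ("nose_bridge", "left_brow", "right_brow"),
    ("left_eye", "left_brow", "nose_bridge"),
    ("right_eye", "right_brow", "nose_bridge"),
    ("mouth_left", "upper_lip", "lower_lip"),
    ("mouth_right", "upper_lip", "lower_lip"),
    ("lower_lip", "mouth_left", "mouth_right"),
]

feature_vector = [angle_at(landmarks[v], landmarks[a], landmarks[b])
                  for v, a, b in triples]
print(feature_vector)
```

When an expression deforms the face, the landmark positions move and the six angles change with them, which is what makes the vector usable as a feature.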
16. Laplacian Embedding Process (continued)
The simplest motion-dependent facial features can be defined as the
displacements (Euclidean distance) of these facial feature points between a
neutral facial expression and the “peak” of a particular emotive expression.
That is, the positions of the points on a given neutral face are compared with
their displaced positions on the “reaction” face.
Every input facial expression is quantified as a motion-dependent facial
expression feature vector as follows:
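A minimal sketch of such a displacement vector, again as an illustration rather than the slide's own formula: each entry is the Euclidean distance a feature point moves between the neutral face and the peak of the expression. The landmark names and coordinates are made up for the example.

```python
import math


def displacement_features(neutral, peak):
    """Euclidean distance moved by each feature point, in a fixed name order."""
    names = sorted(neutral)
    return [math.hypot(peak[n][0] - neutral[n][0], peak[n][1] - neutral[n][1])
            for n in names]


# Hypothetical feature point positions (x, y) in pixels.
neutral_face = {"mouth_left": (38, 80), "mouth_right": (62, 80), "nose_tip": (50, 65)}
peak_face = {"mouth_left": (35, 76), "mouth_right": (65, 76), "nose_tip": (50, 65)}

motion_vector = displacement_features(neutral_face, peak_face)
print(motion_vector)  # order: mouth_left, mouth_right, nose_tip
```

Here both mouth corners move 5 pixels (3 sideways, 4 up) while the nose tip stays put, the kind of pattern a smile might produce.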
17. Laplacian Embedding Process
Based on the information in its database, the
computer then combines these pieces of
information and “detects” the emotion that is
being shown.
19. Step 4 - Computer Training
Many datasets are available, each with its strengths and weaknesses. A few
examples are FERET (Facial Recognition Technology) and LFW (Labeled Faces
in the Wild).
The computer is given the images along with a word description such as “happy, sad,
angry, fear or disgust”. It is basically told what the human emotion is in the
image or video.
Training: an image is input and the programmer tells the computer “happy”, “sad”, …
20. Step 5 - Testing the computer
At this point the computer is fed images or video, and it uses the algorithms
and training to give an emotion label to each image or video that is entered.
Depending on the program, the computer can use a technique such as
cosine similarity or Euclidean distance. In both cases an SVM
(Support Vector Machine) is used for verification to
determine the emotion that is displayed.
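The train-then-test loop of Steps 4 and 5 can be sketched as follows. Note the substitution: to keep the example short and dependency-free, a nearest-centroid classifier with Euclidean distance stands in for the SVM named above. The two-dimensional feature vectors and their labels are invented for illustration.

```python
import math
from collections import defaultdict


def train(samples):
    """Training: average the feature vectors for each emotion label.
    `samples` is a list of (feature_vector, label) pairs."""
    groups = defaultdict(list)
    for vec, label in samples:
        groups[label].append(vec)
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in groups.items()}


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def predict(centroids, vec):
    """Testing: return the label whose centroid is closest to `vec`."""
    return min(centroids, key=lambda label: euclidean(centroids[label], vec))


# Step 4: labeled training data (made-up feature vectors).
training_data = [
    ([5.0, 1.0], "happy"), ([6.0, 2.0], "happy"),
    ([1.0, 5.0], "sad"), ([2.0, 6.0], "sad"),
]
centroids = train(training_data)

# Step 5: unlabeled inputs receive an emotion label.
prediction_a = predict(centroids, [5.5, 1.2])
prediction_b = predict(centroids, [1.0, 4.0])
print(prediction_a, prediction_b)
```

Swapping `euclidean` for a cosine-based distance would give the other variant mentioned above; an SVM would replace the centroid comparison with a learned decision boundary.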