Demq.ai
Find your art twin in the collection of the Rijksmuseum
What’s this talk all about?
This talk is mainly about sharing my experience with available face-recognition techniques, as well as some other tools that might be useful for you.
It is meant to inspire you to try it yourself.
About me (page 1)
Tiny bits about me.
I work at Goldmund, Wyldebeast & Wunderliebe.
This is a snapshot from the website.
We are busy with:
Web development
AI
Tinkering
We organise:
(Periodic) Mengvoer sessions (in Grand Theatre)
Groningen.AI conference
It’s a small company that does cool things.
About me (page 2)
A little bit more about me: a snapshot from my GitHub page.
History
Straight to the topic!
Some time ago, the Rijksmuseum (Museum of the Netherlands) released a public API for accessing their gallery.
Idea
Build an application for finding your art twin in the collection of the Rijksmuseum.
Or simply said, find your look-alike from centuries ago.
All just for fun.
Breakdown
Build a local database of Rijksmuseum images
Detect the face(s) from the webcam
Find similarities between webcam shot and Rijksmuseum images
Rijksmuseum images database
I went through a lot of images and identified 5 main categories we were interested in:
schilderij, tekening, prent, portret and ontwerp.
For better results I had to pick images carefully (to avoid violent and inappropriate paintings),
because, obviously, not everyone would be happy to be identified as a dead body, a naked person or even a slave.
No, the selection process wasn’t automated.
Available solutions
There are many libraries out there.
OpenCV
facenet
face_recognition
OpenFace
dlib
...and many others.
I'll be focusing on OpenCV and OpenFace.
Face detection
OpenCV (HAAR): Good old cascade classifiers (for eyes, eyebrows, nose, etc.) based on the “Rapid Object Detection using a Boosted Cascade of Simple Features” paper from 2001.
OpenFace: Based on (a) Dlib models (of which I used the “shape predictor 68 face landmarks” model) and (b) the famous “FaceNet: A Unified Embedding for Face Recognition and Clustering” paper from 2015.
Both are machine-learning approaches.
Why these two? I made some tryouts, first with OpenCV, then with OpenFace. I was very pleased with the results. I didn’t look further, but you may try.
Would I use something else now? No, but I plan to test FaceNet one day soon (which uses TensorFlow and is also based on the famous “FaceNet: A Unified Embedding for Face Recognition and Clustering” paper).
Let’s detect some faces from old paintings using both!
Face detection (page 2): OpenCV/HAAR
Each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle.
In the illustration, the first selected feature seems to focus on the property that the region of the eyes is often darker than the region of the nose and cheeks.
The second selected feature relies on the property that the eyes are darker than the bridge of the nose.
The final classifier is a weighted sum of weak classifiers.
Weak classifiers alone can’t classify the image, but together they form a strong classifier.
The paper says even 200 features provide detection with 95% accuracy.
Read more at https://docs.opencv.org/3.4.1/d7/d8b/tutorial_py_face_detection.html
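As a toy illustration of the idea above (not OpenCV’s actual implementation; the helper names and rectangle layout are mine), a two-rectangle Haar-like feature can be computed cheaply with numpy via an integral image:

```python
import numpy as np

def integral_image(img):
    # ii[y, x] = sum of all pixels in img[:y+1, :x+1]
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, h, w):
    # Sum of any rectangle using only four lookups on the integral image.
    total = ii[top + h - 1, left + w - 1]
    if top > 0:
        total -= ii[top - 1, left + w - 1]
    if left > 0:
        total -= ii[top + h - 1, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

def two_rect_feature(img, top, left, h, w):
    # White rectangle on top, black rectangle below; the feature value is
    # the black-rectangle sum minus the white-rectangle sum (one scalar).
    ii = integral_image(img.astype(np.int64))
    white = rect_sum(ii, top, left, h // 2, w)
    black = rect_sum(ii, top + h // 2, left, h // 2, w)
    return black - white
```

On a face, such a feature placed over the eyes and cheeks yields a large value precisely because the eye region is darker; a cascade evaluates thousands of these features quickly thanks to the integral image.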
Face detection (page 3): OpenFace/Dlib
OpenFace can use either OpenCV or Dlib HOG classifiers for face detection.
We use the Dlib HOG classifier (specifically the “shape predictor 68 face landmarks” model).
The landmarks are:
Eyes
Eyebrows
Nose
Mouth
Jawline
The HOG algorithm iterates over every pixel of a given image.
For each pixel, it looks at the surrounding pixels and figures out how dark the current pixel is compared with the pixels directly around it.
Then it draws an arrow showing in which direction the image is getting darker.
The process is repeated for every pixel; then the image is broken down into squares of, say, 16×16 pixels, and in each square it counts how many gradients point in each major direction.
Each square is then replaced with the arrow direction that was the strongest.
Using a lot of faces, a linear Support Vector Machine model (for detecting faces in an image) is trained.
Read more at https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/
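The per-pixel “arrows” and per-square direction counts described above can be sketched with numpy (the function names are mine; real HOG implementations add block normalisation and other refinements):

```python
import numpy as np

def gradient_orientations(img):
    # Per-pixel gradient: how fast the image changes, and in which direction.
    gy, gx = np.gradient(img.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees (0-180), as commonly used for HOG.
    direction = np.degrees(np.arctan2(gy, gx)) % 180
    return magnitude, direction

def cell_histogram(magnitude, direction, bins=9):
    # For one square (cell): count how many gradients point in each major
    # direction, each pixel voting with its gradient magnitude.
    hist, _ = np.histogram(direction, bins=bins, range=(0, 180),
                           weights=magnitude)
    return hist
```

The strongest bin of each cell’s histogram corresponds to the “strongest arrow” that replaces the square in the description above.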
Face detection examples
Face detection (page 2): OpenCV HAAR
As you can see, OpenCV nicely detects most of the faces… but not all.
Face detection (page 3): OpenFace/Dlib HOG
Dlib does a better job of it.
Face detection (page 4): OpenCV HAAR
Again, OpenCV detected almost everything, but some faces were detected twice.
Face detection (page 5): OpenFace/Dlib HOG
And again, Dlib detections are better.
Face detection (page 6): OpenCV HAAR
OpenCV didn’t detect all of the faces.
Face detection (page 7): OpenFace/Dlib HOG
Dlib even detected the face in the portrait on the wall!
Face detection (page 8): OpenCV HAAR
OpenCV found some artefacts. I can’t clearly explain why.
Face detection (page 9): OpenFace/Dlib HOG
Dlib again did a better job of it, although it didn’t detect the boy on the left.
From the accuracy perspective, the choice is obvious.
But in terms of performance (memory usage, CPU), OpenCV clearly wins.
Finding similarities. Breakdown.
Now that we have images and models for detecting faces - what do we do next?
We obviously need to extract features from the images and store the data.
Before we start processing the images, we resize them (to 960×960); otherwise indexing would take much longer.
The alignment step preprocesses faces for input into a neural network: faces are resized to the same size (96×96) and transformed so that landmarks (such as the eyes and nose) appear at the same location in every image.
Then we generate a representation of each image for later use
and store the representations in the search index (as numpy ndarrays).
The predicted similarity of two (or more) faces is calculated by computing the Euclidean (squared L2) distance between their representations.
A lower score indicates that two faces are more likely of the same person. Since the representations are on the unit hypersphere, the scores range from 0 (the same picture) to 4.0.
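Matching then reduces to a nearest-neighbour search over the stored representations. A minimal sketch, assuming the index is an (n, d) numpy array of unit-length embeddings (the function name is mine, not the project’s actual API):

```python
import numpy as np

def best_match(query, index):
    # Squared L2 distance from the query embedding to every stored one;
    # the smallest distance is the most likely match.
    dists = np.sum((index - query) ** 2, axis=1)
    best = int(np.argmin(dists))
    return best, float(dists[best])
```

For unit vectors the squared distance equals 2 − 2·cos(angle), which is why the scores are bounded by 4.0 (diametrically opposite embeddings).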
What the final product looks like
Native client/QT5 (runs on Raspberry PI or any Linux PC)
React client/browser (runs on Raspberry PI or any PC)
Clients communicate with the REST API, which communicates with the core.
The core is built with OpenFace (Torch, Dlib, OpenCV and other deps): all the other requirements come directly from it.
REST API
Django for administrative tasks and ORM (for syncing Rijksmuseum data).
All images are indexed in native Python code; skipping the ORM saves time.
Various scripts for face detection (both HAAR and HOG) and for searching the index for similar images.
Django REST framework for the API (submit an image, get a match).
Various clients (which can be run on a Raspberry Pi or another Linux host).
Clients
Native QT5 client for Raspberry Pi or another Linux PC.
Interactive: it starts with the overview page.
A person walks towards the camera and the countdown starts.
On zero the shot is taken and the result page is shown.
After some time, the client goes back to the overview page.
For the Raspberry Pi, OpenCV was a good choice, since it’s much lighter and runs fine on low-end hardware.
Programming on/for the Raspberry Pi is perfectly fine, but memory consumption becomes an issue quite quickly and you’re forced to optimise and think about efficiency, unless you want to get memory errors (segmentation faults). I ran into a memory-leak problem (it took me a day to find out how to solve it).
React client/browser for Raspberry Pi or any other PC.
Exotic browsers are not supported, but it works on Firefox, Chrome, Opera, Safari and Edge.
You navigate to demq.ai, look into the camera and press the button.
The shot is taken and the result is shown.
I want to talk a little bit more about the Raspberry Pi and the native QT5 client.
Native QT5 client
Has two views:
Overview.
Detail view.
QT5 client: what worked well
Styling was easy. I could quickly reproduce the look & feel of a browser app.
Works fast.
Runs on Raspberry PI or any other Linux PC.
QT5 client: Issues faced
The Raspberry Pi would crash after 9-10 cycles (out of memory).
Ubuntu would live much longer, but memory was definitely leaking.
Each cycle would add about 3 MB of memory.
It was obvious that the memory occupied by images was not being freed properly.
QT5 client: Issues faced (page 2)
This is pseudo-code.
In the real application there were also format conversions,
and a lot more variables holding intermediate resources.
class Application(object):

    def overview(self):
        """Overview view."""
        # Capture the camera image, find faces and send a request to the
        # server; get the response and show the detail view for N seconds.
        raw_image = read_raw_image_from_camera()
        show_webcam_image_in_gui(raw_image)  # Show image in GUI
        faces = find_faces(raw_image)
        if faces:
            match = get_matches_from_server(raw_image)
            if match:
                go_to_view(self.detail, match)

    def detail(self, match):
        """Detail view."""
        # Show data (from ``match``) and go back to overview after N seconds.
        show_webcam_image_in_gui(match)  # Show image in GUI
        show_match_in_gui(match)  # Obtain image, show in GUI
        go_to_view(self.overview)
QT5 client: Issues faced (page 3)
What didn’t work
I tried setting variables to None:
    raw_image = None
Or even deleting them:
    del raw_image
And calling the garbage collector manually (in various places):
    gc.collect()
QT5 client: Issues faced (page 4)
Solution
For image resources, all local variables were replaced with class properties. With that change, the garbage collector cleans up the memory properly. That’s it!
QT5 client: Issues faced (page 5)
Solved.
class Application(object):

    def __init__(self):
        self.raw_image = None
        self.match = None

    def overview(self):
        """Overview view."""
        # Capture the camera image, find faces and send a request to the
        # server; get the response and show the detail view for N seconds.
        self.raw_image = read_raw_image_from_camera()
        show_webcam_image_in_gui(self.raw_image)
        faces = find_faces(self.raw_image)
        if faces:
            self.match = get_matches_from_server(self.raw_image)
            if self.match:
                go_to_view(self.detail)

    def detail(self):
        """Detail view."""
        # Show data (from ``self.match``) and go back to overview after N seconds.
        show_webcam_image_in_gui(self.match)
        show_match_in_gui(self.match)
        go_to_view(self.overview)
Tips
Actually, just one tip.
If you plan to work on something similar, use Docker, since it can be quite a hassle to get everything working on platforms other than Linux.
Just a couple of examples
Because the show must go on...
That’s me
This is Teodor - a colleague of mine
This is Erick Martijn - another colleague of mine
This is Jacob Klaassen, director of Goldmund, Wyldebeast & Wunderliebe.
And now I want to share a short story with you.
In about 5 weeks we had a working client-server demo.
Everyone in our company was very enthusiastic about it and eager to test it (read: appear in front of the camera to see their match).
Our director, Jacob Klaassen, was among the “testers” too.
One of the matches he often got as the top result looked so much like him (in my eyes) that I said: you know, Jacob, it could be a relative of yours.
We all laughed.
Then his wife came by and saw the match too.
Women pay attention to details. She did too.
The name of the matched person was Jacobus Scheltema.
“Jacob, it’s your family name!” she said. “He could really be a distant relative of yours.”
That was on a Friday.
The next Monday, Jacob told me that the person in the portrait was indeed a distant ancestor of his.
And that’s the end of the story.