2. Background
Why Food Recognition?
According to Healthy Western Australians, food labeling is
important because it gives people information such as a
description of the food, its ingredients, and its nutritional
content.
Why Recognition through Images?
As social media has grown in popularity, most people now
grab their cellphone and snap a photo of the food they
bought rather than first reaching for a fork or spoon.
3. Purpose and Impact
Purpose
To help users get complete information about the food they
have photographed before they buy it
Impact
Users get useful information about the food recognized
through their mobile device
The recorded data will be useful for analytics or
prediction purposes
4. Limitation
Food types that can be recognized
Nasi goreng, mie goreng, pecel lele, bakso, gado-gado,
sushi, pizza, sate, soto, rendang, ayam bakar, soto bakar.
The number of sample food images available for training
affects the recognition results.
Recorded data is shown as statistics only; it is not used
for prediction.
5. Technology
This development spans two categories:
Cloud or Software as a Service, including client-server application
development
Artificial Intelligence systems, in particular Computer Vision and its Machine
Learning
Most AI systems are difficult to use, not user friendly, or not ready to be used
by ordinary users. Our development integrates all of these things.
6. AI System
Machine Learning: a system that stores knowledge as its intelligence
We collected about 6000 images across 13 food types
We use a Convolutional Neural Network (CNN), a Deep Learning method, as our
machine learning approach to classify the 13 food types we selected.
We split the data into two sets: 4234 images for training and 1411 for validation
All images are resized to 150 by 150 pixels
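The slide gives only the sizes of the two sets. The counts 4234 and 1411 correspond to roughly a 75/25 split of 5645 images, so a minimal sketch of such a split might look like the following. The random shuffle, the seed, the `split_dataset` helper, and the placeholder file names are all assumptions for illustration; the slides do not say how the split was actually made.

```python
import random


def split_dataset(items, train_fraction=0.75, seed=0):
    """Shuffle items and split them into (training, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder file names standing in for the real images (hypothetical).
image_paths = [f"img_{i:04d}.jpg" for i in range(4234 + 1411)]

train_set, validation_set = split_dataset(image_paths)
print(len(train_set), len(validation_set))
```

In practice a per-class (stratified) split may be preferable, so that each food type appears in both sets in similar proportion.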
7. Deep Convolutional Neural Network
Architecture Design
Input image
1st convolution layer (128 features), with pooling and ReLU
2nd convolution layer (64 features), with pooling and ReLU
3rd convolution layer (32 features), with pooling and ReLU
4th convolution layer (32 features), with pooling and ReLU
5th convolution layer (32 features), with pooling and ReLU
Fully connected layer
Outputs
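To make the architecture more concrete, the sketch below traces how the feature map could shrink through the five stages for a 150 by 150 input. It assumes 'same'-padded convolutions (so only pooling changes the spatial size) and 2 by 2 non-overlapping pooling; neither the kernel size, the padding, nor the pooling size is stated on the slide, and the helper names are hypothetical.

```python
def pooled_size(size, pool=2):
    """Spatial size after non-overlapping pooling (floor division)."""
    return size // pool


def trace_shapes(input_size=150, filters=(128, 64, 32, 32, 32)):
    """Return (height, width, channels) after each conv + pool stage.

    Assumes 'same'-padded convolutions, so only pooling shrinks the map.
    """
    shapes = []
    size = input_size
    for n_filters in filters:
        size = pooled_size(size)
        shapes.append((size, size, n_filters))
    return shapes


shapes = trace_shapes()
for stage, shape in enumerate(shapes, start=1):
    print(f"stage {stage}: {shape}")

height, width, channels = shapes[-1]
flattened = height * width * channels  # inputs to the fully connected layer
print("flattened features:", flattened)
```

Under these assumptions the spatial size goes 150, 75, 37, 18, 9, 4, so the fully connected layer would receive 4 x 4 x 32 = 512 values. Different padding or pooling choices would change these numbers.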
9. Training Result
- The best training accuracy is 82%
- Note: this is not the accuracy over all training samples (4234
images) but the accuracy on 200 images drawn at random from the
4234 images
- The best validation accuracy is 84%
- Note: this is not the accuracy over all validation samples (1411
images) but the accuracy on 200 images drawn at random from the
1411 images
- The model is still underfitting (so training can be continued
to improve the performance)
- The output of this training is an AI model (containing the
CNN architecture, weights, and biases)
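Because both accuracies were measured on 200 randomly drawn images, they carry some sampling uncertainty. A rough way to gauge it, assuming each image is an independent correct/incorrect trial (a simple binomial model, not something stated on the slide), is the standard error sqrt(p(1 - p) / n):

```python
import math


def standard_error(accuracy, n_samples):
    """Binomial standard error of an accuracy measured on n_samples images."""
    return math.sqrt(accuracy * (1 - accuracy) / n_samples)


for name, acc in [("training", 0.82), ("validation", 0.84)]:
    se = standard_error(acc, 200)
    # 1.96 standard errors gives an approximate 95% interval.
    print(f"{name}: {acc:.2f} +/- {1.96 * se:.3f}")
```

Under this model each figure is uncertain by roughly five percentage points, so the gap between 82% and 84% is within sampling noise.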
11. Cloud System Architecture
The AI Engine runs in a loop, checking for any unrecognized
images that the web services have received from mobile
devices
After capturing a food image, the mobile device runs in a
loop, checking the server response for a result until the
timeout passes.
Whenever a mobile device uses the app, the app sends the
device serial number as its identity
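The mobile-side loop described above can be sketched as a simple poll-until-timeout routine. This is only an illustration: `poll_for_result`, `FakeServer`, the timeout, and the polling interval are hypothetical names and values, and the real app would call the web service instead of an in-memory stand-in.

```python
import time


def poll_for_result(fetch_result, timeout_s=10.0, interval_s=0.5):
    """Call fetch_result() until it returns a value or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = fetch_result()
        if result is not None:
            return result
        time.sleep(interval_s)  # wait before asking the server again
    return None  # timed out without a recognition result


class FakeServer:
    """Stand-in for the web service: answers only after a few polls."""

    def __init__(self, label, polls_needed):
        self.label = label
        self.polls_left = polls_needed

    def fetch(self):
        self.polls_left -= 1
        return self.label if self.polls_left <= 0 else None


server = FakeServer("nasi goreng", polls_needed=3)
print(poll_for_result(server.fetch, timeout_s=1.0, interval_s=0.01))
```

A deadline based on `time.monotonic()` is used so that changes to the device clock do not shorten or extend the timeout.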