5 Practical Steps to
Successful
Deep Learning Research
Amir Alush, PhD
Co-founder & CTO
Brodmann17
Founded in 2016; a 19+ person team, mostly M.Sc. / Ph.D. machine learning researchers.
Backed by: lool Ventures, Maniv Ventures, Sony Innovation & SamsungNEXT
Brodmann17 has independently designed its deep learning technology from scratch
(patents pending), with optimal performance and accuracy by design.
Brodmann17 is developing perception software for the world’s largest Tier-1
automotive suppliers, for pre-install/aftermarket ADAS & autonomous driving.
Things I’ll Talk About
1. Requirements
2. Data Collection
3. Data Annotation
4. Research Evaluation Metric
5. Research
Step 1: Set Your Requirements
Open research can lead to great new products, but it’s also risky.
Always keep some of it alive!
Product-oriented research must have clear requirements:
● What is the task?
● What is the data?
● What is the target platform? CPU (ARM/Intel), GPU (ARM/NVIDIA)
Step 1: Set Your Requirements (example)
Smart Doorbell “requirements”:
● Task:
○ Alert when a human appears (once), with 98% recall and at most 1 false alarm per week
○ 0.3-1.5 meters distance from the camera
○ Full body / upper body only
○ Unique ID per person
● Input:
○ 720p RGB images, 30 fps
○ Camera height: 1.5 meters from the ground
● Platform & Run-Time:
○ Raspberry Pi 3, 1x A53 ARM CPU
○ 0.5 sec latency from appearance to alert
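A minimal sketch of how such requirements might be kept as a machine-readable config (Python; the field names are illustrative, the values come from the slide above, and none of this is real Brodmann17 code), so that data collection, annotation, and the evaluation metric can all refer to a single source:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProductRequirements:
    task: str = "person_alert"
    min_recall: float = 0.98                 # alert on >= 98% of appearances
    max_false_alarms_per_week: int = 1
    distance_range_m: tuple = (0.3, 1.5)     # operating distance from the camera
    input_resolution: tuple = (1280, 720)    # 720p RGB
    input_fps: int = 30
    camera_height_m: float = 1.5
    platform: str = "raspberry_pi_3_1xA53"
    max_latency_s: float = 0.5               # appearance-to-alert latency

REQUIREMENTS = ProductRequirements()
print(REQUIREMENTS.min_recall, REQUIREMENTS.max_latency_s)
```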
Things I’ll Talk About
1. Requirements
2. Data Collection
3. Data Annotation
4. Research Evaluation Metric
5. Research
Step 2: Data Collection
Data collection is a long and expensive process:
● Long: proprietary setup, requires variability, takes its time, and may
depend on another company
● Expensive: buying data, special setups, storage, management, etc.
You should:
● Start early
● Collect the right data. Plan thoughtfully; the wrong data could hold back
your product release
Step 2: Data Collection (quantity)
How much data do I need?
● Quantity is important, but it comes with a price tag and takes time
● Quality is more important
It’s a continuous process:
1. Start with a small subset for a fast POC to reduce risk
2. Increase the collection rate
3. Keep collecting data to improve the research metrics
Also worth keeping in mind:
● Academic datasets
● Synthesizing data
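As a small illustration of point 1 above, a reproducible POC subset could be sampled from whatever has been collected so far; the fraction, seed, and clip names below are assumptions for the sketch:

```python
import random

# Hypothetical sketch: carve a small, fixed POC subset out of the collected clips
# before scaling up collection. A fixed seed keeps the subset reproducible.
def poc_subset(all_clips, fraction=0.05, seed=17):
    rng = random.Random(seed)
    k = max(1, int(len(all_clips) * fraction))
    return rng.sample(all_clips, k)

clips = [f"clip_{i:04d}.mp4" for i in range(1000)]  # illustrative clip IDs
print(len(poc_subset(clips)))                       # 50
```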
Step 2: Data Collection (quality)
Meet the product requirements:
● Same modality
● Cover the expected operating-condition distribution
(scene appearance, object appearance, viewpoint, etc.)
Would you use Pascal VOC / COCO for the Smart Doorbell?
Step 2: Data Collection (quality)
Meet the product requirements:
● Same modality
● Cover the expected operating-condition distribution
(scene appearance, object appearance, viewpoint, etc.)
Doorbell camera example
[Images: doorbell-camera sample frames — four marked unsuitable (X), one marked OK]
Step 2: Data Collection (quality)
Meet the product requirements:
● Same modality
● Cover the expected operating-condition distribution (e.g. scene appearance,
object appearance, viewpoint, scene type)
Traffic monitoring application
[Images: traffic-monitoring sample frames — five marked unsuitable (X), one marked OK]
Step 2: Data Collection (quality)
Data with variability:
● Collecting a correlated data set is easy, but it is not what you want
● Collect data under different conditions: e.g. location, time of day, season,
weather
● Collect data from multiple sources (cameras, devices)
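One hedged way to act on these bullets is to audit clip metadata against a target mix of conditions; the field names and target shares below are made up for illustration:

```python
from collections import Counter

# Hypothetical sketch: compare the collected clips against the condition mix we want
# covered, assuming each clip carries a small metadata record.
TARGET_SHARE = {"day": 0.6, "night": 0.3, "dusk": 0.1}

def coverage_report(clips, field="time_of_day"):
    counts = Counter(clip[field] for clip in clips)
    total = max(1, sum(counts.values()))
    return {cond: counts.get(cond, 0) / total for cond in TARGET_SHARE}

clips = [{"time_of_day": "day"}, {"time_of_day": "day"}, {"time_of_day": "night"}]
print(coverage_report(clips))   # day ~0.67, night ~0.33, dusk 0.0 — compare with TARGET_SHARE
```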
Things I’ll Talk About
1. Requirements
2. Data Collection
3. Data Annotation
4. Research Evaluation Metric
5. Research
Step 3: Data Annotation (Quantity)
Annotation is more expensive and time-consuming than collection:
● It can cost up to several dollars per frame!
● Understand what you will need in the research phase
Step 3: Data Annotation (Quantity)
Choose what data to annotate:
● You should not annotate all your data
● Annotate quality data
It’s a continuous process:
● Start with a small subset and a fixed annotation scheme
● Increase the annotation rate
Step 3: Data Annotation (Quality)
Supervised learning:
● This is the actual data your models are trained on
● Your model will only be as good as your data!
Annotation guidelines are derived from the product requirements:
● Usually not straightforward
● Should be finely detailed
● New annotation schemes / re-annotation / cleaning to improve the
research metrics
Step 3: Data Annotation (Quality)
How would you annotate this person?
Step 3: Data Annotation (Quality)
How would you annotate this face?
Consistency and clarity are important:
● Not to confuse your learning process
● Not to confuse your annotators
● Not to fail your research evaluation metric
● Other algorithms depend on this annotation
Step 3: Data Annotation (Quality)
How would you annotate these objects?
Step 3: Data Annotation (Quality)
Quality assurance:
● Several annotators per item → costly
● Familiar annotators (known by name) are a good choice
● A tight definition of the task
● Automatic validation (sketched below)
● Simple tasks, or pre-processing to simplify them
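A minimal sketch of the “automatic validation” bullet, assuming boxes are stored as (x1, y1, x2, y2) with a label; the image size and label set follow the doorbell example, and everything else is an assumption:

```python
# Hypothetical sanity checks that catch broken annotations before they reach training.
def validate_box(ann, img_w=1280, img_h=720, min_side=8,
                 allowed_labels=("full_body", "upper_body")):
    x1, y1, x2, y2 = ann["bbox"]
    return (
        0 <= x1 < x2 <= img_w and                            # inside the image, positive width
        0 <= y1 < y2 <= img_h and                            # inside the image, positive height
        (x2 - x1) >= min_side and (y2 - y1) >= min_side and  # not degenerate
        ann["label"] in allowed_labels                       # only labels the guidelines allow
    )

print(validate_box({"bbox": (100, 200, 180, 420), "label": "full_body"}))  # True
print(validate_box({"bbox": (100, 200, 100, 420), "label": "full_body"}))  # False (zero width)
```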
Step 3: Data Annotation (Costs)
Optimize costs & throughput:
● Bootstrap to initialize/prioritize annotation (sketched below)
● Use temporal information
● Use any available information
● Pre-process to simplify tasks
● Build your own annotation infrastructure, or use a 3rd party?
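A possible sketch of the “bootstrap” bullet: use any available detector to score unlabeled frames and send the least confident ones to annotators first. The detector here is a stand-in callable, not a specific library API:

```python
# Hypothetical sketch: prioritize frames for human annotation by detector confidence.
def prioritize_for_annotation(frames, detector, budget=1000):
    scored = []
    for frame in frames:
        detections = detector(frame)                      # [(label, confidence, bbox), ...]
        confidence = min((c for _, c, _ in detections), default=0.0)
        scored.append((confidence, frame))
    scored.sort(key=lambda item: item[0])                 # least confident first
    return [frame for _, frame in scored[:budget]]

fake_detector = lambda frame: [("person", 0.4, (0, 0, 50, 120))]   # illustrative stub
print(prioritize_for_annotation(["f1.png", "f2.png"], fake_detector, budget=1))
```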
Things I’ll Talk About
1. Requirements
2. Data Collection
3. Data Annotation
4. Research Evaluation Metric
5. Research
Step 4: Research Evaluation Metric
Thus far we have:
1. Product requirements
2. An initial data collection + annotation strategy
Before you start your research experiments, set a research
evaluation metric (a single number*).
*Andrew Ng
Step 4: Research Evaluation Metric
● There are many ways to evaluate an experiment:
○ e.g. TPR, aDR, FPR, mAP, latency, etc.
○ Improving one metric can lower another
● It’s more efficient (time & resources) to advance with a clear target
[Charts: evaluation metric vs. time/resources — when optimizing for a single evaluation metric the requirements are achieved; when optimizing for several evaluation metrics at once, reaching the requirements is uncertain]
Steps 1-4 Overview by Example
Smart Doorbell example:
Step 1 - Product Requirements
Step 2 - Data Collection:
● 720p RGB videos, 1.5 m camera height
● 20% “no objects” videos, 80% “with objects” videos
Step 3 - Data Annotation:
● Annotate only objects up to 1.5 m away
● Full-body + upper-body-only bounding boxes
● Annotate 5% of the videos in full, 95% sampled
Step 4 - Evaluation Metric:
● 98% recall (with at most 1 false positive per week and <500 ms latency) for the object detection task
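One way to turn this into “single number” code is to treat recall as the optimizing metric and the false-positive and latency budgets as hard constraints; this sketch follows the doorbell numbers but is an illustration, not the talk’s exact definition:

```python
# Hypothetical single research evaluation metric for the doorbell example.
def evaluation_metric(recall, false_positives_per_week, latency_s,
                      max_fp_per_week=1.0, max_latency_s=0.5):
    if false_positives_per_week > max_fp_per_week or latency_s > max_latency_s:
        return 0.0          # a hard constraint is violated: the experiment fails
    return recall           # otherwise rank experiments by recall alone

print(evaluation_metric(0.985, 0.7, 0.42))   # 0.985 -> meets the requirements
print(evaluation_metric(0.990, 3.0, 0.42))   # 0.0   -> too many false positives
```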
Things I’ll Talk About
1. Requirements
2. Data Collection
3. Data Annotation
4. Research Evaluation Metric
5. Research
Step 5: Research
[Diagram: the research loop — Research Experiments → Error Analysis → Data (collection/annotation) → back to Research Experiments]
Step 5: Research
Applied research is an empirical process.
1. Research Experiments Phase:
● Deep Learning architectures
● Learning hyperparameters
● Data manipulations
● Other ...
Step 5: Research
Applied research is an empirical process.
2. Analysis Phase:
● Split data into train / validation / test
● Bias / variance analysis (on the validation and training data)*
● Rank the factors that impact the evaluation metric the most (on the validation data):

Feature             %Error   Priority
Wrong annotation    25%      1
Close objects       20%      2
Truncated objects   3%       3
Umbrellas           1%       4
...                 ...      ...

* https://kevinzakka.github.io/2016/09/26/applying-deep-learning/
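A sketch of how such a ranking might be produced: tag every validation error with a suspected cause and rank causes by their share of all errors (the counts below simply mirror the table above and are not real data):

```python
from collections import Counter

# Illustrative tags: one per error case found on the validation set.
error_tags = (["wrong_annotation"] * 25 + ["close_objects"] * 20 +
              ["truncated_objects"] * 3 + ["umbrellas"] * 1)

def rank_error_causes(tags, total_errors):
    counts = Counter(tags)
    return [(cause, 100.0 * n / total_errors) for cause, n in counts.most_common()]

for priority, (cause, pct) in enumerate(rank_error_causes(error_tags, total_errors=100), 1):
    print(f"{priority}. {cause}: {pct:.0f}%")   # 1. wrong_annotation: 25% ...
```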
Step 5: Research
Applied research is an empirical process.
3. Data Phase:
● Clean data / re-annotate / change the annotation scheme
● Data Collection
Step 5: Research
Applied research is an empirical process.
Next Iteration:
● What to explore / fix next in our Deep Learning models
Step 5: Research
The research phase is very resource-demanding:
● Researchers
● Compute
● Time
Optimize these in order to shorten the time to product:
● Researchers → Increase productivity
● Compute → reduce costs
Step 5: Research
Running an experiment involves:
● Planning the experiment (ok)
● Setting up a compute environment (overhead)
● Data selection, preprocessing, and fetching (overhead)
● Monitoring and periodic evaluation (overhead)
● Managing a pipeline of algorithms (overhead)
● Saving intermediate results (overhead)
Step 5: Research
Running many experiments doesn’t scale by itself:
● Managing compute-environment resources and prioritization
● Monitoring many experiments
● Analysing the results of many experiments
● Experiment traceability over time: versioning of code, data, and experiment configuration
A dedicated infrastructure and management system is needed to:
● Manage shared resources
● Orchestrate the training of the different models
● Monitor the various experiments, training configurations, and models
● Build complicated algorithm pipelines and run them effortlessly
Build your own or use a 3rd party?
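Whichever way you go, the minimum for traceability is a per-run record of code version, configuration, and metric; a rough sketch, assuming the code lives in a git repository (paths and fields are illustrative, not any specific tool’s API):

```python
import json, subprocess, time
from pathlib import Path

# Hypothetical sketch: record what was run, on which code version, with which config,
# and what score it achieved, so experiments stay comparable over time.
def log_experiment(config, metric_value, out_dir="experiments"):
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H-%M-%S"),
        "git_commit": subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip(),
        "config": config,                    # architecture, hyperparameters, data version
        "evaluation_metric": metric_value,   # the single research metric
    }
    out_path = Path(out_dir) / f"run_{record['timestamp']}.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(record, indent=2))
    return out_path
```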
Topics Covered
1. Requirements
2. Data Collection
3. Data Annotation
4. Research Evaluation Metric
5. Research
We are always looking for new talent.
Passionate about AI and want to explore more?
We invite you to join us on our journey!
For job opportunities:
https://www.linkedin.com/company/brodmann17/
THANK YOU
Amir Alush, PhD - Co-founder & CTO
amir@brodmann17.com
