This session walks you through how our interns took video from a drone and turned it into an Android app that counts cars in a parking lot. It is a practical introduction to drone SDKs and TensorFlow, and to combining the two to do object detection on your Android phone from a drone.
2. Godfrey Nolan
• President of RIIS LLC
- Drone software and hardware services company in Detroit MI
- Booth #809
• Author of 6 books on mobile development and security
• Adjunct Professor at University of Detroit Mercy, teaching mobile development
April 6, 2022
3. What are we talking about
AI = Artificial Intelligence
ML = Machine Learning
CV = Computer Vision
4. What are we talking about
[Diagram: Traditional Programming takes Rules and Data and produces Answers; Machine Learning takes Data and Answers and produces Rules]
We’re going to talk about how we put together a simple AI drone application for counting cars in a parking lot in a little over a week
AI = using computers to do things that would normally require some human intelligence
ML is a specific type of AI, used to identify patterns in large amounts of data and predict an outcome. A classic example is predicting whether a male passenger in third class, or steerage, would have survived on the Titanic, assuming we trained our ML model on data from the ship’s manifest and survival records
CV = computer vision, replicating human vision with computers, applied to inputs such as video from a drone camera
The difference between ML and traditional programming is that traditional programming is more like a cooking recipe
Instead of telling it how to find the answer, we give it a whole bunch of answers and ask it to figure out the rules
In our drone example we’re going to be looking at counting cars in a parking lot
So what we do is show the computer a ton of images of cars in the input layer and then also tell the output layer that it was just looking at a car
We provide it with the answer in the hope that it will give us the same answer ‘car’ when we show it a completely different picture of a car which it’s never seen before
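The contrast between hand-written rules and rules worked out from answers can be sketched with a toy example. This is only an illustration: a single made-up number (vehicle length) stands in for an image, the data is invented, and the function names are hypothetical. A real object detector learns from pixels with a neural network, not from one threshold.

```python
# Toy data, invented for illustration: (vehicle length in metres, answer).
EXAMPLES = [
    (1.0, "not_car"),
    (2.0, "not_car"),
    (4.0, "car"),
    (4.5, "car"),
    (5.0, "car"),
]


def rule_based(length_m):
    """Traditional programming: a person writes the rule by hand."""
    return "car" if length_m >= 3.5 else "not_car"


def learn_threshold(examples):
    """Machine learning in miniature: work out the rule from the answers.

    Takes the midpoint between the longest 'not_car' and the shortest 'car'.
    Assumes both classes are present and do not overlap.
    """
    cars = [length for length, label in examples if label == "car"]
    others = [length for length, label in examples if label == "not_car"]
    return (max(others) + min(cars)) / 2.0


def predict(length_m, threshold):
    """Apply the learned rule to an example the model has never seen."""
    return "car" if length_m >= threshold else "not_car"


threshold = learn_threshold(EXAMPLES)
print(threshold, predict(4.25, threshold))
```

In the first function the programmer supplies the rule; in the second, the rule comes out of the labeled examples, which is the same relationship the labeled car images have to the trained model.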
What do we mean by AI/ML
Obstacle avoidance
Tracking – Follow me
Navigation - Autonomous flight
Drones together with AI are good at counting, measuring, identifying things or reading text and it doesn’t matter if what you’re looking at is moving or not.
In a few years, expect to see a large number of outdoor and indoor manual tasks being automated by drones. Most of the work they replace will be mundane, labor-intensive tasks better suited to computers. Ultimately, this will free up humans to do much more interesting work
Every year we take a half dozen interns and teach them how to create mobile apps for drones
This year we took it a step further and added some machine learning to the task list
Because they’re only with us for a few months, and they’d already spent time getting up to speed on the drone SDKs, we had to do the AI part in less than a month
So we really looked at how to simplify things. We couldn’t spend weeks configuring cloud computers
We had to plug in the images, run the training and quickly deploy it to a mobile phone
There are 4 steps in creating our parking lot app
Labeling means literally putting a box around each of the things you’re trying to label. It’s an incredibly manual and boring process, but we have a great shortcut
Training is the action of presenting all those labeled images to your neural network. We use Google’s TensorFlow for our training, but you could use Facebook’s PyTorch
Deploy means getting it onto a phone. We use TensorFlow Lite for that, and this year it got incredibly easy to deploy
Testing means finding out about all those things you didn’t do right, like not taking pictures on a rainy day, or from the wrong angle, and then going back to step 1
We chose counting cars as it’s really easy to get drone video of parking lots, this is not usually the case
If you had to label 5000 images with multiple cars you would lose your mind
So we label about 100 and then use the label assist to try and guess the cars and empty spaces in the remaining 4900 images
It works surprisingly well, and labeling now becomes more about fixing the existing labels rather than creating them
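The label-assist workflow described above can be sketched as a simple split: hand-label a small seed set, then let a model propose boxes for everything else so a person only has to review them. This is a sketch under assumptions, not how any particular tool works internally. `hand_label` and `suggest_boxes` are hypothetical callables standing in for the human labeler and for a model trained on the seed set.

```python
def label_with_assist(image_names, hand_label, suggest_boxes, seed_size=100):
    """Return {image_name: (boxes, source)} for every image.

    hand_label(name)    -> boxes drawn by a person (hypothetical callable)
    suggest_boxes(name) -> boxes proposed by a model trained on the seed
                           set (hypothetical callable); these still need
                           a human to review and fix them
    """
    labels = {}
    for index, name in enumerate(image_names):
        if index < seed_size:
            # The first seed_size images are labeled entirely by hand.
            labels[name] = (hand_label(name), "manual")
        else:
            # The rest start from the model's guess instead of a blank image.
            labels[name] = (suggest_boxes(name), "suggested")
    return labels


# Example with placeholder callables that return no boxes.
names = ["frame_%04d.jpg" % i for i in range(5000)]
labels = label_with_assist(names, lambda n: [], lambda n: [])
manual = sum(1 for _, source in labels.values() if source == "manual")
suggested = sum(1 for _, source in labels.values() if source == "suggested")
print(manual, suggested)
```

With 5000 images and a seed of 100, the split matches the numbers in the talk: 100 labeled by hand and 4900 that start from a suggestion.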
My current favorite tool is Roboflow, because it simplifies a lot of the tedious tasks with features like label assist, and you can get it to do a web app or use your webcam at the push of a button.
The Hive.ai is another alternative.
Google’s Colab is an in-browser machine learning platform, so it needs no configuration
Back when I was in college, you would have to write the code to do the object detection, but now all these models are provided for you
If you pay an extra $10/month, they’ll even throw in dedicated access to TPUs for faster training
You still have to run it overnight, but it requires little or no configuration.
Otherwise it’ll take a few weeks to configure your machine to work with the cloud engines like Google Cloud Platform or Azure
You are going to need some basic understanding of Python to work with Colab
Until a year ago, it was a pretty complicated task to get the trained model into your app
But Google has a new tool called Model Maker, which generates all the code that binds your video to your TensorFlow Lite code.
What was a week-long course on Udacity is now a 20-minute blog post.
And this is the code that detects the cars. We split up the incoming drone camera video into frames or images, pass each one to our neural network, and then, if the model is 65% sure that it’s a car, we draw a box around it
This is Kotlin for Android phones
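The slide’s code is Kotlin and is not reproduced here. The thresholding step it describes can be sketched in Python as follows. The detection format (a dict with `label`, `score` and `box`) and the function name are assumptions made for illustration, not the output format of TensorFlow Lite or of the app itself; only the 65% cut-off comes from the talk.

```python
CONFIDENCE_THRESHOLD = 0.65  # from the talk: keep detections the model is 65% sure of


def count_cars(detections, threshold=CONFIDENCE_THRESHOLD):
    """Return (count, kept) for one video frame.

    detections is a list of dicts in an assumed format:
    {"label": str, "score": float, "box": (left, top, right, bottom)}.
    kept holds the detections whose boxes would be drawn on screen.
    """
    kept = [
        d for d in detections
        if d["label"] == "car" and d["score"] >= threshold
    ]
    return len(kept), kept


# Made-up detections for a single frame.
frame_detections = [
    {"label": "car", "score": 0.90, "box": (10, 10, 60, 40)},
    {"label": "car", "score": 0.65, "box": (70, 10, 120, 40)},
    {"label": "car", "score": 0.64, "box": (130, 10, 180, 40)},
    {"label": "empty_space", "score": 0.95, "box": (190, 10, 240, 40)},
]

count, boxes_to_draw = count_cars(frame_detections)
print(count)
```

Raising the threshold trades missed cars for fewer false boxes, so it is worth tuning against real flight footage.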
Take it out for a flight and see how you do
Not having enough images or video to label is a common problem. Start with 5000; that’s why we chose cars in a parking lot as our example
Be creative about where you get your video or images. I talked to someone recently who was using webcams from Times Square to get lots of different pictures of people and cars
Or you might need to create a simulated environment to train the models. Even synthetic images can work surprisingly well to train a model which is later used in the real world
Animals move, so you may end up double counting. Use a thermal camera and take the pictures at night if you can
Limit the conditions, go specific not general
Know when to go VTOL. If the object is big enough, satellite images also work; there’s a white paper by Maxar about how they counted elephants using satellite imagery
Ethical issues: we don’t do any training using pictures of people’s faces.