AI/ML in drones
Godfrey Nolan
RIIS LLC
Godfrey Nolan
2
• President of RIIS LLC
- Drone software and hardware services company in Detroit, MI
- Booth #809
• Author of 6 books on mobile development and security
• Adjunct Professor at the University of Detroit Mercy, teaching mobile development
April 6, 2022
What are we talking about
3
AI = Artificial Intelligence
ML = Machine Learning
CV = Computer Vision
What are we talking about
4
Traditional Programming: Data + Rules → Answers
Machine Learning: Data + Answers → Rules
What are we talking about
5
AI/ML in drones
6
• Obstacle avoidance
• Tracking
• Navigation
• Real-Time Object Detection
• Image Stitching
• Object counting
Putting theory into practice
7
4 steps
8
1. Labeling → 2. Training → 3. Deploy → 4. Testing
Labeling
9
Labeling
10
Training
11
Deploy
12
Deploy
13
Testing
14
Lessons Learned
15
Never enough input images
Animals move
Limit the conditions
Choose the right hammer for the job
Ethical issues

AI/ML in drones

Editor's Notes

  • #3 Going to talk about how we put together a simple AI drone application for counting cars in a parking lot in a little over a week.
  • #4 AI = using computers to do things that would normally require some human intelligence. ML is a specific type of AI, used to identify patterns and predict an outcome from large amounts of data; the classic example would be predicting whether you would survive on the Titanic if you were male and in third class or steerage, assuming we trained our ML model on data from the ship's manifest and survival records. CV = computer vision, replicating human vision but with computers, such as with video from a drone camera.
  • #5 The difference between ML and traditional programming, which is more like a cooking recipe: instead of telling the computer how to find the answer, we give it a whole bunch of answers and ask it to figure out the rules (see the first Python sketch after these notes).
  • #6 In our drone example we're going to be counting cars in a parking lot. What we do is show the computer a ton of images of cars at the input layer and tell the output layer that it was just looking at a car. We provide it with the answer in the hope that it will give us the same answer, 'car', when we show it a completely different picture of a car which it has never seen before.
  • #7 What do we mean by AI/ML? Obstacle avoidance; tracking (follow me); navigation (autonomous flight). Drones together with AI are good at counting, measuring, identifying things, or reading text, and it doesn't matter whether what you're looking at is moving or not. In a few years, expect to see a large number of outdoor and indoor manual tasks being automated by drones. Most of the work they will replace is mundane, labor-intensive tasks more suited to computers. Ultimately, this will free up humans to do much more interesting work.
  • #8 Every year we take a half dozen interns and teach them how to create mobile apps for drones. This year we took it a step further and added some machine learning to the task list. Because they're only with us for a few months, and they'd already gotten up to speed on the drone SDKs, we had to do the AI part in less than a month. So we really looked at how to simplify things; we couldn't spend weeks configuring cloud computers. We had to plug in the images, run the training, and quickly deploy it to a mobile phone.
  • #9 There are 4 steps in creating our parking lot app. Labeling means literally putting a box around each of the things you're trying to label; it's an incredibly manual and boring process, but we have a great shortcut. Training is the act of presenting all those labeled images to your neural network; we use Google's TensorFlow for our training, but you could use Facebook's PyTorch. Deploy means getting it onto a phone; we use TensorFlow Lite for that, and this year it got incredibly easy to deploy. Testing means finding out about all the things you didn't do right, like not taking pictures on a rainy day, or from the wrong angle, and then going back to step 1.
  • #10 We chose counting cars because it's really easy to get drone video of parking lots, which is not usually the case. If you had to label 5,000 images with multiple cars you would lose your mind. So we label about 100 and then use label assist to try to guess the cars and empty spaces in the remaining 4,900 images. It works surprisingly well, and labeling now becomes more about fixing the existing labels rather than creating them.
  • #11 My current favorite tool is Roboflow, because it simplifies a lot of the tedious tasks like label assist, and you can get it to produce a web app or use your webcam at the push of a button. Hive (thehive.ai) is another alternative.
  • #12 Google's Colab is an in-browser machine learning platform, so it needs no configuration. Back when I was in college you would have to write the code to do the object detection, but now all these models are provided for you. If you pay an extra $10/month, they'll even throw in dedicated access to TPUs for faster training. You still have to run it overnight, but it requires little or no configuration; otherwise it can take a few weeks to configure your machine to work with cloud engines like Google Cloud Platform or Azure. You are going to need some basic understanding of Python to work with the Colab (see the training sketch after these notes).
  • #13 Until a year ago, it was a pretty complicated task to get the trained model into your app. But Google has a new tool called Model Maker which generates all the code that binds your video to your TensorFlow Lite code. What was a weeks-long course on Udacity is now a 20-minute blog post.
  • #14 And this is the code that detects the cars. We split the incoming drone camera video into frames, pass each frame to our neural network, and if the model is at least 65% sure that it's a car, we draw a box around it. This is Kotlin for Android phones (a reconstruction of that detection code appears after these notes).
  • #15 Take it out for a flight and see how you do
  • #16 Not enough images or video to label? Start with 5,000; that's why we chose cars in a parking lot as our example. Be creative about where you get your video or images: I talked to someone recently who was using webcams from Times Square to get lots of different pictures of people and cars. Or you might need to create a simulated environment to train the models; even synthetic images can work surprisingly well to train a model which is later used in the real world. Animals move, so you may end up double counting; use a thermal camera and take the pictures at night if you can. Limit the conditions: go specific, not general. Know when to go VTOL, or, if the object is big enough, satellite images also work; there's a white paper by Maxar about how they counted elephants using satellite imagery. Ethical issues: we don't do any training using pictures of people's faces.
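
The sketch below illustrates the point from note #5 (traditional programming = data + rules in, answers out; machine learning = data + answers in, rules out). It is a minimal Python toy; the parking-lot framing and the use of scikit-learn are illustrative assumptions, not code from the talk.

```python
from sklearn.tree import DecisionTreeClassifier

# Traditional programming: a human writes the rule up front.
def lot_is_full(free_spots: int) -> bool:
    return free_spots == 0

# Machine learning: we hand over data plus answers and let the model infer the rule.
data = [[0], [1], [3], [7], [12]]             # observed number of free spots
answers = [True, False, False, False, False]  # was the lot full?

model = DecisionTreeClassifier().fit(data, answers)
print(model.predict([[2]]))  # the learned rule applied to unseen data
```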
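For the Colab training step in note #12, the cell below is a rough sketch of what training and exporting a car detector with TensorFlow Lite Model Maker can look like. The directory paths, the `car` label map, the EfficientDet-Lite0 spec, and the epoch/batch settings are assumptions for illustration, not the exact notebook used in the project.

```python
# Intended for a Google Colab notebook; assumes images and Pascal VOC XML
# annotations exported from a labeling tool such as Roboflow.
from tflite_model_maker import model_spec, object_detector

spec = model_spec.get('efficientdet_lite0')   # small detector that runs well on phones

train_data = object_detector.DataLoader.from_pascal_voc(
    images_dir='train/images',                # hypothetical paths
    annotations_dir='train/annotations',
    label_map={1: 'car'})

model = object_detector.create(
    train_data,
    model_spec=spec,
    epochs=50,
    batch_size=8,
    train_whole_model=True)

# Export a .tflite file (with metadata) that drops straight into the Android project.
model.export(export_dir='.', tflite_filename='cars.tflite')
```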
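Note #14 describes the Kotlin detection code shown on the slide, which is not reproduced in this export. The snippet below is an assumed reconstruction using the TensorFlow Lite Task Library: it loads the exported model, runs it on each video frame, and keeps only detections the model is at least 65% sure about. The asset name `cars.tflite` and the `drawBox` overlay call are placeholders.

```kotlin
import android.content.Context
import android.graphics.Bitmap
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.task.vision.detector.ObjectDetector

class CarDetector(context: Context) {

    // Load the Model Maker export; discard anything below 65% confidence.
    private val detector = ObjectDetector.createFromFileAndOptions(
        context,
        "cars.tflite",                          // assumed asset name
        ObjectDetector.ObjectDetectorOptions.builder()
            .setScoreThreshold(0.65f)
            .setMaxResults(50)
            .build())

    // Called once per frame pulled from the drone's video feed.
    fun detect(frameBitmap: Bitmap): Int {
        val results = detector.detect(TensorImage.fromBitmap(frameBitmap))
        for (detection in results) {
            val box = detection.boundingBox              // RectF in frame coordinates
            val label = detection.categories.first().label
            // drawBox(box, label)                       // hypothetical overlay call
        }
        return results.size                              // cars counted in this frame
    }
}
```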