Artificial Intelligence Applied to Your Drone
I noticed that drones have become very popular for both business and personal use. When I say drones, I mean quadcopters. I admit, they’re pretty cool and fun to play with. They can take amazing videos and photos from high altitudes that would otherwise be difficult or costly to capture. As cool as they are, the majority of the consumer market uses them for one purpose: a remote control camera. What a waste! These quadcopters are full of potential, and all you can do with them is take high-tech selfies and spy on neighbors? Screw that. I’m going to get my hands on some drones and make them do more.
I researched drones from different manufacturers and decided to get the most hacker-friendly one: the Parrot AR Drone. It isn’t the most expensive or fancy, but it packs the most punch in terms of hackability. Unlike radio-frequency drones (which do allow you to fly at greater distances), the AR Drone is one of the few that operate over wifi. Why do I prefer wifi? Because the drone acts as a floating wireless access point, and commands travel over TCP or UDP, which can be replicated with your laptop or any device capable of connecting to a wifi access point. Among the available wifi drones, I chose the Parrot AR Drone because (as far as I know) it is the only drone with a documented API and an open source SDK for those of you engineers who would like to do more than take aerial photos of your roof.
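To see why wifi control is so approachable, here is a minimal sketch of talking to the drone directly over UDP. It assumes the drone’s usual defaults (access point at 192.168.1.1, AT commands on UDP port 5556) and the AT*REF bit values described in Parrot’s developer documentation; verify the exact constants against the official guide before flying anything.

```python
import socket

DRONE_IP = '192.168.1.1'   # the drone's own access point (default)
AT_PORT = 5556             # UDP port for AT commands (per Parrot's docs)

# AT*REF bit field: bits 18, 20, 22, 24, 28 are always set;
# bit 9 = takeoff, bit 8 = emergency (values per the AR.Drone developer guide)
REF_BASE = sum(1 << b for b in (18, 20, 22, 24, 28))

def at_ref(seq, takeoff=False, emergency=False):
    '''Build an AT*REF command string (takeoff/land/emergency).
    seq is the running sequence number every AT command carries.'''
    bits = REF_BASE
    if takeoff:
        bits |= 1 << 9
    if emergency:
        bits |= 1 << 8
    return 'AT*REF=%d,%d\r' % (seq, bits)

def send(cmd, sock=None):
    '''Fire a command string at the drone over UDP (no reply expected).'''
    sock = sock or socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(cmd.encode('ascii'), (DRONE_IP, AT_PORT))

# e.g. send(at_ref(1, takeoff=True))   # take off
#      send(at_ref(2))                 # land
```

Any wifi-capable device can speak this protocol, which is exactly what makes the AR Drone such a good hacking target.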
A quick Google search returned several AR Drone SDKs supporting a handful of different programming languages. Some are just wrappers around the official Parrot C SDK, while others are complete rewrites that directly call the actual API (which is also well documented). This makes things much easier than I initially thought!
The first SDK I tried was python-ardrone, which is written completely in Python. It’s actually very easy to use and even includes a demo script that lets you manually control your drone with your computer keyboard. The only thing I disliked about it was its h264 video decoder. It pipes the h264 video stream to ffmpeg and waits for ffmpeg to send raw frame data back. It takes that data, converts it into NumPy arrays, and then converts the NumPy arrays into a PyGame surface. I had a hard time getting a video feed, and when I did get it, the feed was too slow to be of any use. I would love to play with it some more and figure out a fix for the video. Here is a video of me operating the drone using my laptop with the python-ardrone library.
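For the curious, that decoder stage boils down to something like the sketch below. The function names and the exact pipeline shape are my reconstruction of what python-ardrone does, not its actual code; the ffmpeg flags themselves are standard.

```python
import subprocess

def ffmpeg_decode_cmd(width, height):
    '''argv for an ffmpeg process that reads an h264 stream on stdin
    and writes raw packed-BGR frames on stdout.'''
    return ['ffmpeg',
            '-i', '-',                       # h264 stream in on stdin
            '-f', 'rawvideo',                # raw frames out...
            '-pix_fmt', 'bgr24',             # ...as packed BGR bytes
            '-s', '%dx%d' % (width, height), # keep the drone's frame size
            '-']                             # ...on stdout

def frame_size(width, height):
    '''Bytes per decoded frame: 3 bytes (B, G, R) per pixel.'''
    return width * height * 3

# Usage sketch: read one frame at a time from the pipe, then hand the
# bytes to NumPy / PyGame for display:
# proc = subprocess.Popen(ffmpeg_decode_cmd(640, 360),
#                         stdin=subprocess.PIPE, stdout=subprocess.PIPE)
# raw = proc.stdout.read(frame_size(640, 360))
```

Every frame crosses a process boundary and two format conversions before it hits the screen, which goes a long way toward explaining why the feed felt slow.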
The next SDK I tried was the AR Drone Autopylot. The Autopylot library is written in C and requires the official Parrot SDK, but it provides a way to implement your own add-ons in C, Python, or Matlab. It also lets you manually control your drone with a PS3 or Logitech gamepad. I’m not sure how I feel about that; I wish it included a way to navigate the drone with a keyboard. However, the h264 video decoder works really well, and that’s the most important requirement for this project. Since Autopylot gives me a working video feed, that’s what I decided to work with.
As the first step toward making an intelligent drone, I want my drone to hover in the air and follow people. While this alone does not make the drone “intelligent”, the ability to apply computer vision algorithms is a huge part of getting there. Thanks to friendly SDKs like Autopylot and python-ardrone, this is actually pretty simple.
You may or may not have read my old blog post, My Face Tracking Robot, but in it I describe how I built a face-tracking robot (or turret) based on the OpenCV library. All I have to do is apply the same haar cascade and CV logic to the Python drone SDK and I’m done!
Here is my first implementation:
```python
#!/usr/bin/python
# file: /opencv/face_tracker.py

import sys
import time
import math
import datetime
import serial
import cv

# Parameters for haar detection
# From the API:
# The default parameters (scale_factor=2, min_neighbors=3, flags=0) are tuned
# for accurate yet slow object detection. For a faster operation on real video
# images the settings are:
# scale_factor=1.2, min_neighbors=2, flags=CV_HAAR_DO_CANNY_PRUNING,
# min_size=<minimum possible face size

min_size = (20, 20)
image_scale = 2
haar_scale = 1.2
min_neighbors = 2
haar_flags = 0

# For OpenCV image display
WINDOW_NAME = 'FaceTracker'

def track(img, threshold=100):
    '''Accepts BGR image and optional object threshold between 0 and 255
    (default = 100).
    Returns: (x,y) coordinates of centroid if found
             (-1,-1) if no centroid was found
             None if user hit ESC
    '''
    cascade = cv.Load("haarcascade_frontalface_default.xml")
    gray = cv.CreateImage((img.width, img.height), 8, 1)
    small_img = cv.CreateImage((cv.Round(img.width / image_scale),
                                cv.Round(img.height / image_scale)), 8, 1)

    # convert color input image to grayscale
    cv.CvtColor(img, gray, cv.CV_BGR2GRAY)

    # scale input image for faster processing
    cv.Resize(gray, small_img, cv.CV_INTER_LINEAR)
    cv.EqualizeHist(small_img, small_img)

    center = (-1, -1)
    if cascade:
        t = cv.GetTickCount()  # HaarDetectObjects takes 0.02s
        faces = cv.HaarDetectObjects(small_img, cascade, cv.CreateMemStorage(0),
                                     haar_scale, min_neighbors, haar_flags,
                                     min_size)
        t = cv.GetTickCount() - t
        if faces:
            for ((x, y, w, h), n) in faces:
                # the input to cv.HaarDetectObjects was resized, so scale the
                # bounding box of each face and convert it to two CvPoints
                pt1 = (int(x * image_scale), int(y * image_scale))
                pt2 = (int((x + w) * image_scale), int((y + h) * image_scale))
                cv.Rectangle(img, pt1, pt2, cv.RGB(255, 0, 0), 3, 8, 0)

                # get the xy corner coords, calc the center location
                x1, y1 = pt1
                x2, y2 = pt2
                centerx = x1 + ((x2 - x1) / 2)
                centery = y1 + ((y2 - y1) / 2)
                center = (centerx, centery)

    cv.NamedWindow(WINDOW_NAME, 1)
    cv.ShowImage(WINDOW_NAME, img)

    if cv.WaitKey(5) == 27:
        center = None

    return center

if __name__ == '__main__':
    capture = cv.CaptureFromCAM(0)
    while True:
        if not track(cv.QueryFrame(capture)):
            break
```
Couple that script with this replacement autopylot_agent.py:
```python
'''
Python face-tracking agent for AR.Drone Autopylot program
by Cranklin (http://www.cranklin.com)

Based on Simon D. Levy's green ball tracking agent

Copyright (C) 2013 Simon D. Levy

This program is free software: you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or (at
your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

You should also have received a copy of the Parrot AR.Drone Development
License and Parrot AR.Drone copyright notice and disclaimer. If not, see
<https://projects.ardrone.org/attachments/277/ParrotLicense.txt> and
<https://projects.ardrone.org/attachments/278/ParrotCopyrightAndDisclaimer.txt>.
'''

# PID parameters
Kpx = 0.25
Kpy = 0.25
Kdx = 0.25
Kdy = 0.25
Kix = 0
Kiy = 0

import cv

import face_tracker

# Routine called by C program.
def action(img_bytes, img_width, img_height, is_belly, ctrl_state,
           vbat_flying_percentage, theta, phi, psi, altitude, vx, vy):

    # Set up command defaults
    zap = 0
    phi = 0
    theta = 0
    gaz = 0
    yaw = 0

    # Set up state variables first time around
    if not hasattr(action, 'count'):
        action.count = 0
        action.errx_1 = 0
        action.erry_1 = 0
        action.phi_1 = 0
        action.gaz_1 = 0

    # Create full-color image from bytes
    image = cv.CreateImageHeader((img_width, img_height), cv.IPL_DEPTH_8U, 3)
    cv.SetData(image, img_bytes, img_width * 3)

    # Grab centroid of face
    ctr = face_tracker.track(image)

    # Use centroid if it exists
    if ctr:
        # Compute proportional distance (error) of centroid from image center
        errx = _dst(ctr, 0, img_width)
        erry = -_dst(ctr, 1, img_height)

        # Compute vertical, horizontal velocity commands
        # based on PID control after first iteration
        if action.count > 0:
            phi = _pid(action.phi_1, errx, action.errx_1, Kpx, Kix, Kdx)
            gaz = _pid(action.gaz_1, erry, action.erry_1, Kpy, Kiy, Kdy)

        # Remember PID variables for next iteration
        action.errx_1 = errx
        action.erry_1 = erry
        action.phi_1 = phi
        action.gaz_1 = gaz
        action.count += 1

    # Send control parameters back to drone
    return (zap, phi, theta, gaz, yaw)

# Simple PID controller from http://www.control.com/thread/1026159301
def _pid(out_1, err, err_1, Kp, Ki, Kd):
    return Kp*err + Ki*(err+err_1) + Kd*(err-err_1)

# Returns proportional distance to image center along specified dimension.
# Above center = -; Below = +
# Right of center = +; Left of center = -
def _dst(ctr, dim, siz):
    siz = siz/2
    return (ctr[dim] - siz) / float(siz)
```
Now autopylot_agent simply looks for a “track” method that returns the center coordinates of an object (in this case, a face) and navigates the drone to follow it. As you may have noticed, I’m using the frontal face haar cascade to detect the front of a human face. You can easily swap it out for a profile face, upper body, or eye cascade, etc. You can even train a cascade to detect dogs, other animals, cars, and so on. You get the idea.
This works fine the way it is; however, I felt the need to improve the autopylot_agent module, because I want the drone to rotate rather than strafe when following horizontal movement. That is easily fixed by feeding “errx” into “yaw” rather than “phi”. Also, rather than just returning the centroid, I modified the tracker to return the height of the tracked object as well. This way, the drone can move closer to your face by adjusting “theta”.
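A minimal sketch of that remapping follows. The function name, gains, target-height setpoint, and theta sign convention here are my own illustration, not Autopylot’s actual code; tune and verify against your drone before trusting it in the air.

```python
# Assumed setpoint: desired face height as a fraction of frame height
TARGET_HEIGHT = 0.4
# Proportional gains (hypothetical values -- tune to taste)
Kp_yaw = 0.5
Kp_pitch = 0.3

def follow_commands(errx, face_height_frac):
    '''Map tracker output to (phi, theta, gaz, yaw) commands.

    errx: horizontal error in [-1, 1] (negative = face left of center)
    face_height_frac: face bounding-box height / frame height
    '''
    phi = 0.0                # no strafing: horizontal error drives yaw instead
    yaw = Kp_yaw * errx      # rotate toward the face
    # Face smaller than target -> pitch forward to close the distance
    # (negative theta = nose down / forward, sign convention assumed)
    theta = -Kp_pitch * (TARGET_HEIGHT - face_height_frac)
    gaz = 0.0                # vertical control unchanged in this sketch
    return (phi, theta, gaz, yaw)
```

With this mapping the drone turns to keep the face centered and creeps forward until the face fills the target fraction of the frame, which is exactly the “clingy” behavior described below.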
On my first test run with the “theta” feature, the drone found my face and flew right up to my throat and tried to choke me. I had to recalibrate it to chill out a bit.
Here are a couple videos of my drones following me:
Remember… this is all autonomous movement. There are no humans controlling this thing!
You may think it’s cute. Quite frankly, I think it’s a bit creepy. I’m already deathly afraid of clingy people and I just converted my drone into a stage 5 clinger.
This is just the first step to making my drone more capable. This is what I’m going to do next:
- process voice commands to pilot the drone (or access it through Jarvis)
- teach the drone to play hide and seek with you
- operate the drone using Google Glass or some other FPV setup
- operate the drone remotely (i.e. fly around the house while I’m at the office)
With a better drone and frame, I’d like to work on these:
- arm it with a nerf gun or a water gun
- have it self-charge by landing on a charging landing pad
I’m also building a MultiWii based drone and, in the process, coming up with some cool project ideas. I’ll keep you updated with a follow-up post when I have something. 🙂