Vision-Based Indoor Localization for Unmanned Aerial Vehicles

Jeong-Oog Lee(1); Taesam Kang(2); Keun-Hwan Lee(3); Sung Kyu Im(4); and Jungkeun Park(5)

Abstract: Small unmanned aerial vehicles are cost-effective and easy to operate, and especially suitable in dangerous indoor environments. However, because GPS is not available in an indoor environment, indoor localization is a crucial problem in developing small unmanned aerial vehicles (UAVs). This paper suggests vision-based indoor localization for UAVs in GPS-denied environments. Our approach is based on image matching by applying the scale invariant feature transform algorithm. DOI: 10.1061/(ASCE)AS.1943-5525.0000064. © 2011 American Society of Civil Engineers.

CE Database subject headings: Spacecraft; Localization; Global positioning systems; Algorithms.

Author keywords: Unmanned aerial vehicle; Scale invariant feature transform; Vision-based localization.

(1)-(5) Dept. of Aerospace Information Engineering, Konkuk Univ., Seoul, Korea. (5) Corresponding author; e-mail: parkjk@konkuk.ac.kr.

Note: This manuscript was submitted on October 4, 2009; approved on May 20, 2010; published online on June 5, 2010. Discussion period open until December 1, 2011; separate discussions must be submitted for individual papers. This paper is part of the Journal of Aerospace Engineering, Vol. 24, No. 3, July 1, 2011. © ASCE, ISSN 0893-1321/2011/3-373-377/$25.00.

JOURNAL OF AEROSPACE ENGINEERING © ASCE / JULY 2011 / 373

Introduction

There are increasing applications for unmanned aerial vehicles (UAVs) in diverse areas, including aerial crop-dusting, installing power transmission lines, aerial surveillance, aerial photography, and rescue operations. Various types of UAV systems are used depending on the missions and flight environments. In urban or indoor environments, rotor-based UAVs are preferred to fixed-wing UAVs because they provide hovering capability. This unique operating characteristic is effective in tracking or searching for objects, tasks that are required for the autonomous flight of UAVs.

Recently, much UAV research has focused on small UAVs because they are cost-effective and easy to operate. In particular, they are suitable in dangerous indoor environments, for example, buildings that are on fire or places that have been attacked by terrorists, where human entry is limited.

The major technologies for autonomous flight include map building, localization, path planning, and obstacle avoidance. Indoor localization is a critical issue in developing small UAVs because GPS is not available in an indoor environment. Also, small UAVs cannot accommodate a large number of sensor devices because of their payload limitations. Localization can be achieved by using natural or artificial landmarks. Artificial landmark-based localization uses devices such as ultrasonic beacons that are added at fixed locations in the flight environment. Even though using artificial landmarks is rather simple compared with using natural landmarks, artificial landmarks are not adequate for unknown environments where there are no preinstalled devices for localization.

This paper discusses ways to achieve natural-landmark-based localization using a vision system for the indoor navigation of a UAV. Vision-based solutions are portable, compact, cost-effective, and power-efficient. Also, they do not emit light or radio signals (Celik et al. 2008). Scale invariant feature transform (SIFT) is a vision-based feature detection method. SIFT is scale invariant because it identifies features by building an image pyramid with different image scales. In addition, SIFT is insensitive to image orientation because it generates feature descriptor vectors based on the reference orientation of feature points.

The method first extracts feature points from the image data taken by a monocular camera using the SIFT algorithm. The system selects landmark feature points that have distinct descriptor vectors among the feature points. Then, the locations of the landmarks are calculated and stored in a map database. Based on the landmark information, the current position of the UAV is estimated.

This paper is organized as follows: first, a brief survey of quad-rotor vehicles is presented. Then, the SIFT algorithm and a description of the map building method using visual landmarks are introduced. Next, the position estimation method of the UAV is presented, followed by the experimental results. Finally, the conclusions are given.

Related Work

A miniature unmanned aerial vehicle (MUAV) has some advantages when used for aerial surveillance in complex indoor environments like office buildings, museums, and commercial centers. MUAVs also play a major role in search and rescue missions when earthquakes, explosions, and other disasters break out (Valavanis 2007). An MUAV that is capable of flying in narrow indoor spaces can quickly search for victims or inspect disaster areas. With localization capability, an MUAV can obtain the coordinates of victims and send that information to rescuers, to guide and assist them in completing their rescue missions.

Although fixed-wing UAVs are generally stable compared with rotor-based UAVs, hovering capability allows the rotor-based UAV to remain in place when needed, to fly closer to objects of concern, and to maneuver in ways that a fixed-wing UAV cannot (Fowers et al. 2008).
In Pradana et al. (2009), researchers introduced H-infinity synthesis of MIMO integral-backstepping PID control to stabilize a model of a UAV in hovering conditions subject to parametric uncertainty in stability derivatives and wind gust disturbances.

The autonomous systems lab (ASL) of the Swiss Federal Institute of Technology introduced a quad-rotor simulation model for OS4 that embeds all necessary avionics and energy devices, including a low-cost inertial measurement unit (IMU), a vision-based position sensor, and an obstacle detection setup. While ASL gave useful comparisons of various control methodologies, they did not present comprehensive control strategies for position hold, nor did they present convincing results where a maintainable hover was obtained (Johnson 2008; Bouabdallah et al. 2004).

The X-4 Flyer quad-rotor Micro Air Vehicle was introduced by the Australian National University. In this project, the dynamics of quad-rotor helicopters with blade flapping were studied. The X-4 is somewhat heavier than other quad-rotor helicopters: it has a total weight of 4 kg and is designed to carry a 1 kg payload (Pounds et al. 2006).

The Stanford Testbed of Autonomous Rotorcraft for Multi-Agent Control (STARMAC) quad-rotor helicopter was used to investigate the impact of aerodynamic effects on attitude and altitude control (Hoffmann et al. 2007). The STARMAC was equipped with an IMU, a sonic-ranging sensor, and a DGPS for full state estimation. The vehicle is capable of carrying a payload of up to 2.5 kg and is designed to perform multiagent missions, such as cooperative search and rescue.

Under the payload limitations of small UAVs, vision is a cost-effective way to localize UAVs and detect obstacles, because feature points of the flying environment can be extracted from images taken by the UAV's camera. Vision is even used to send and receive some information (Kang et al. 2009). Kemp (2006) introduced a method for localization of a quad-rotor based on edges that can be obtained from image feature points. Because of the huge calculation load of the image processing and probability filters, the images from the camera are sent by a wireless link to a desktop computer that performs all the computations necessary for localization. Hurzeler et al. (2008) also proposed a teleoperation scheme in which a base station remotely controls the attitude of small UAVs using wireless LAN in an indoor environment.

The Robotic Vision Lab at Brigham Young University has developed a quad-rotor system that is equipped with a camera sensor; the system estimated horizontal motion and corrected drift from inertial measurement units using features on the floor (Fowers et al. 2008). The developed system was based on an FPGA and was able to analyze and process images without the help of a desktop computer. However, because the system only looks downward, it has difficulty detecting obstacles located on the pathway and can be used only when the scene has specific patterns.

SIFT Algorithm

In image matching, image pixels or points that can be used for matching are called features or keypoints. The features should be invariant to image scaling and rotation, and can be described by description vectors. Image matching is done by calculating the similarity between description vectors. There are several methods that use image features for image matching; one of them is the SIFT approach.

SIFT transforms an image into a number of feature points, each of which is invariant to image translation, scaling, and rotation, and is partially invariant to illumination changes and the three-dimensional (3D) camera viewpoint (Lowe 2004). The following are the major steps for image matching in the SIFT algorithm.
• Feature point extraction: research by Koenderink (1984) and Lindeberg (1994) has shown that the only possible scale-space kernel is the Gaussian function. After calculating the Gaussian function, maxima and minima of the difference-of-Gaussian images can be found by comparing each pixel with its neighbors at the same and adjacent scales. Those pixels with maxima or minima can be scale-invariant features.
• Excluding feature points with unstable extrema: once feature points have been selected by comparing neighbor pixels, those features with unstable extrema should be detected and rejected. Features with unstable extrema often have low contrast and, therefore, are sensitive to noise. An approach developed by Brown and Lowe (2002), which uses a Taylor expansion of the scale-space function and a Harris corner function, can be used for accurate keypoint localization.
• Orientation assignment to each feature point: assigning a consistent orientation to each keypoint based on local image properties allows the keypoint descriptor to be represented relative to this orientation. Therefore, invariance to image rotation can be achieved. An orientation histogram is formed from the gradient orientations of sample points within a region around the keypoint and has 36 bins covering the 360° range of orientations. After finding the highest peak in the histogram, any other local peak that is within 80% of the highest peak is also used to create a keypoint.
• Computation of the keypoint descriptor: after assigning an orientation to each keypoint, a keypoint descriptor can be created by computing the gradient magnitude and orientation at each image sample point in the region around the keypoint location. Descriptor vectors created with this method can be used for keypoint matching.
• Keypoint matching: each keypoint can be matched by calculating the Euclidean distance between descriptor vectors. Because the dimension of the descriptor vector generally exceeds 10 and the K-dimensional (K-D) tree alone is not adequate to find the best match, an approximate algorithm, called the best-bin-first (BBF) algorithm, has been used.

Fig. 1 shows the matching of invariant features between two different images. Each matched feature is connected by a line.

Fig. 1. Matching of invariant features between different images
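The keypoint-matching step above reduces to a nearest-neighbour search over descriptor vectors. The following Python sketch (an illustration, not the authors' code) uses exhaustive search with a distance-ratio check in place of the BBF approximation; the 0.8 threshold and the toy descriptors in the usage note are made-up values.

```python
import math

def match_keypoints(query, train, ratio=0.8):
    """Match descriptor vectors by Euclidean distance.

    A query descriptor is accepted only when its nearest neighbour in
    `train` is closer than `ratio` times the second-nearest neighbour
    (the distance-ratio check).  Exhaustive search stands in for the
    best-bin-first (BBF) approximation used for large databases.
    Returns (query_index, train_index) pairs.
    """
    matches = []
    for qi, q in enumerate(query):
        # distance to every stored descriptor, nearest first
        dists = sorted((math.dist(q, t), ti) for ti, t in enumerate(train))
        best, second = dists[0], dists[1]
        if best[0] < ratio * second[0]:
            matches.append((qi, best[1]))
    return matches
```

For example, with stored descriptors [[0, 0], [10, 10], [5, 0]], a query [4, 0] matches index 2, while an ambiguous query [2.5, 0], equidistant from two stored descriptors, is rejected by the ratio check.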
Getting Landmark Feature Points

To build a map of a flight environment, the positions of the identified feature points are calculated using projection geometry. Fig. 2 shows the projection of a feature point onto the projection plane, where Pp = projection of the feature point PF with respect to the projection reference point R. Three different coordinate systems are used: world coordinates (x, y, z), camera coordinates (n_x, n_y, n_z), and projection-plane coordinates (x_p, y_p). PF and R are described in world coordinates, and Pp is described in projection-plane coordinates. The unit vector n_x = camera direction, which is normal to the projection plane; R = camera location; and f = focal length of the camera. In this paper, a Logitech Quickcam E3500 camera with a measured focal length of f_x = 678.6 pixels and f_y = 679.0 pixels was used.

Fig. 2. Projection of a landmark feature point (PF)

Using triangle proportionality, the authors can easily derive the location of Pp as follows:

    Pp(xp, yp) = f / (n_x · RPF) · ( -n_y · RPF , n_z · RPF )    (1)

where RPF denotes the vector from R to PF. In this vector equation, the authors can find two equations, for xp and yp. However, three independent equations are needed to calculate the location of the feature point PF(xF, yF, zF). These can be obtained using two different images that are taken at two different locations but contain the same feature point, as shown in Fig. 3. Then, the following two vector equations are obtained:

    Pp1(xp1, yp1) = f / (n_x1 · R1PF) · ( -n_y1 · R1PF , n_z1 · R1PF )    (2)

    Pp2(xp2, yp2) = f / (n_x2 · R2PF) · ( -n_y2 · R2PF , n_z2 · R2PF )    (3)

Fig. 3. Two different images having the same feature point

In Eqs. (2) and (3), there are four independent equations. By solving these equations, the location of the feature point PF(xF, yF, zF) can be calculated.

For a special case, it can be assumed that n_z = z; this means that the camera is parallel to the ground. Then, the other two camera coordinate vectors n_x and n_y are expressed as follows:

    n_x = a x + b y;   n_y = -b x + a y,   where a^2 + b^2 = 1    (4)

By applying Eq. (4), Eqs. (2) and (3) can be written as

    xp1 = f [b1(xF - x1) - a1(yF - y1)] / [a1(xF - x1) + b1(yF - y1)];
    yp1 = f (zF - z1) / [a1(xF - x1) + b1(yF - y1)]    (5)

    xp2 = f [b2(xF - x2) - a2(yF - y2)] / [a2(xF - x2) + b2(yF - y2)];
    yp2 = f (zF - z2) / [a2(xF - x2) + b2(yF - y2)]    (6)

Only xF, yF, and zF are unknowns in Eqs. (5) and (6), since the camera locations, camera directions, and feature point locations in the images can be measured at map building time.

UAV Position Estimation

Once the map of the flight environment is built, the current position of the UAV can be estimated using feature points in an image taken by the UAV. The estimation of the UAV position is similar to the mapping of feature points that is based on Eq. (1); however, the known and unknown variables are reversed. In the estimation of the UAV position, the equations are solved with the UAV position and attitude (the camera axes n_x, n_y, n_z) as unknown variables. Because there are six unknown variables, six independent equations are needed, which can be derived from three feature point projection equations. Consequently, if three distinct feature points in the camera image are found, then the position and attitude of the UAV can be calculated.

Generally, the attitude of an aircraft is described by Euler angles: yaw, pitch, and roll. Thus, Eq. (1) can be rewritten using Euler angles. To do this, the authors first transform the coordinates based on the following formula:

    [X]   [ cos ψ  sin ψ  0 ] [ cos θ  0  -sin θ ] [ 1    0       0    ] [x]
    [Y] = [-sin ψ  cos ψ  0 ] [   0    1     0   ] [ 0  cos φ   sin φ  ] [y]    (7)
    [Z]   [   0      0    1 ] [ sin θ  0   cos θ ] [ 0  -sin φ  cos φ  ] [z]

where ψ = yaw angle; θ = pitch angle; φ = roll angle; (x, y, z) = position in the zero-angle coordinates; and (X, Y, Z) = position in the aircraft reference coordinates, which are equal to the camera coordinates.
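Eq. (7) is simply a product of three elementary rotations. As a minimal sketch (not the authors' code; function names are my own), the composition can be written out and checked directly:

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def world_to_camera(psi, theta, phi):
    """Rotation of Eq. (7): zero-angle (world) frame to camera frame.

    psi = yaw, theta = pitch, phi = roll, composed as Rz * Ry * Rx
    with the sign conventions used in the paper.
    """
    rz = [[math.cos(psi), math.sin(psi), 0.0],
          [-math.sin(psi), math.cos(psi), 0.0],
          [0.0, 0.0, 1.0]]
    ry = [[math.cos(theta), 0.0, -math.sin(theta)],
          [0.0, 1.0, 0.0],
          [math.sin(theta), 0.0, math.cos(theta)]]
    rx = [[1.0, 0.0, 0.0],
          [0.0, math.cos(phi), math.sin(phi)],
          [0.0, -math.sin(phi), math.cos(phi)]]
    return matmul(rz, matmul(ry, rx))

def rotate(r, v):
    """Apply the rotation: map (x, y, z) to (X, Y, Z)."""
    return [sum(r[i][j] * v[j] for j in range(3)) for i in range(3)]
```

With zero pitch and roll, the first row of the composed matrix reduces to (cos ψ, sin ψ, 0), which is exactly the (a, b) parameterization of the camera axis n_x in Eq. (4).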
Then, Eq. (1) can be rewritten as

    xR (xp cos ψ cos θ - f sin ψ cos θ)
     + yR {xp (sin ψ cos φ + cos ψ sin θ sin φ) + f (cos ψ cos φ - sin ψ sin θ sin φ)}
     + zR {xp (sin ψ sin φ - cos ψ sin θ cos φ) + f (cos ψ sin φ + sin ψ sin θ cos φ)}
    = xF (xp cos ψ cos θ - f sin ψ cos θ)
     + yF {xp (sin ψ cos φ + cos ψ sin θ sin φ) + f (cos ψ cos φ - sin ψ sin θ sin φ)}
     + zF {xp (sin ψ sin φ - cos ψ sin θ cos φ) + f (cos ψ sin φ + sin ψ sin θ cos φ)}    (8)

    xR (yp cos ψ cos θ - f sin θ)
     + yR {yp (sin ψ cos φ + cos ψ sin θ sin φ) + f cos θ sin φ}
     + zR {yp (sin ψ sin φ - cos ψ sin θ cos φ) - f cos θ cos φ}
    = xF (yp cos ψ cos θ - f sin θ)
     + yF {yp (sin ψ cos φ + cos ψ sin θ sin φ) + f cos θ sin φ}
     + zF {yp (sin ψ sin φ - cos ψ sin θ cos φ) - f cos θ cos φ}    (9)

Using Eqs. (8) and (9), the position and attitude of the UAV can be estimated once three distinct feature points in a camera image are identified. However, solving these equations requires a nonlinear method such as Newton's method, and the estimation error can be significant, depending on the selection of the feature points.

The estimation method therefore uses a combination of vision and an attitude and heading reference system (AHRS) in estimating the UAV position. Since the AHRS gives relatively exact attitude and heading information, the attitude information from the AHRS is used instead of being solved from Eqs. (8) and (9). Then, only the UAV position remains unknown, and two feature points in the camera image suffice to estimate the current position of the UAV.

Experimental Results

In this section, the authors present the experimental results obtained after applying the proposed localization and mapping methods. The AHRS provides the UAV attitude information, namely the yaw, pitch, and roll angles. Using a vision sensor, the authors were able to collect a number of feature points that were used as natural landmarks. The experiment was conducted in a 7.5 m × 5.5 m environment, and the flight trajectory was a combination of straight flight, turning, elevation, and descent. First, two different images from the vision sensor were captured. Using the SIFT algorithm, the system found the same feature points in both images. Next, the locations of the feature points were calculated by using Eqs. (2) and (3). Once the locations of the landmarks were found and the authors moved to the next position, the newly moved camera (vehicle) location could be calculated by using Eqs. (8) and (9). In this way, the locations of all the landmarks can be calculated, and the camera locations can be identified during the mapping process. Fig. 4 shows this concept schematically.

Fig. 4. Incremental estimation of the locations of the feature points; A, B, and C represent the UAV positions

Thirty-five positions were predefined as UAV locations. UAV locations at these positions were calculated from the natural landmarks obtained during the mapping process, and the calculated locations were then compared with the actual UAV locations measured manually. The experiment was performed six times on the same trajectory. Fig. 5 shows the average of the calculated UAV locations and the actual UAV locations in 3D space; the solid line represents the calculated UAV locations and the dotted line the actual UAV locations. Fig. 6(a) shows the localization error, and Fig. 6(b) shows the variance of the calculated UAV locations. The maximum localization error is 55 cm, which is small enough for the UAV to engage in flight in an indoor environment. The variance at positions 14 and 15 is relatively large because there are few matching feature points at these positions.

Fig. 5. Difference between calculated UAV locations and actual UAV locations

Fig. 6. (a) Localization error; (b) variance of the calculated UAV locations
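The AHRS-aided position update described above reduces to linear algebra once the attitude is known. The following Python sketch (an illustration under the level-flight special case of Eq. (4), with a = cos ψ and b = sin ψ taken from the AHRS yaw; it is not the authors' implementation, and the landmark values used in testing are made up) recovers the camera position from two mapped landmarks:

```python
import math

def estimate_position(f, yaw, landmarks):
    """UAV position from two mapped landmarks and known yaw (level flight).

    Each landmark is (xF, yF, zF, xp, yp): its mapped world position and
    its image coordinates in the current frame.  With attitude known,
    the projection equations are linear in the camera position
    (xR, yR, zR): the two xp-equations fix xR and yR, and one
    yp-equation then gives zR.
    """
    a, b = math.cos(yaw), math.sin(yaw)
    # xp*(a*(xF-xR) + b*(yF-yR)) = f*(b*(xF-xR) - a*(yF-yR))
    # -> (xp*a - f*b)*xR + (xp*b + f*a)*yR = (xp*a - f*b)*xF + (xp*b + f*a)*yF
    eqs = []
    for xF, yF, zF, xp, yp in landmarks[:2]:
        c1, c2 = xp * a - f * b, xp * b + f * a
        eqs.append((c1, c2, c1 * xF + c2 * yF))
    (p, q, r), (s, t, u) = eqs
    det = p * t - q * s                 # Cramer's rule for the 2x2 system
    xR = (r * t - q * u) / det
    yR = (p * u - r * s) / det
    # yp*(a*(xF-xR) + b*(yF-yR)) = f*(zF - zR)  ->  solve for zR
    xF, yF, zF, xp, yp = landmarks[0]
    zR = zF - yp * (a * (xF - xR) + b * (yF - yR)) / f
    return xR, yR, zR
```

The mapping step (triangulating a landmark from two camera positions via Eqs. (5) and (6)) is the mirror image of this solve, with the roles of the camera and landmark coordinates exchanged.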
Conclusion

This paper proposes a vision-based indoor localization method for UAVs. Based on the SIFT algorithm, the method finds the current position of a UAV with simple geometry calculations. The method can be used for simultaneous localization and mapping (SLAM) when it is combined with motion sensors such as an AHRS and an IMU. A quad-rotor system that uses a vision sensor and an AHRS is currently in development to test the algorithm in a real environment. Also, the method will be extended to vision-based SLAM.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science, and Technology (2010-0011851).

References

Bouabdallah, S., Murrieri, P., and Siegwart, R. (2004). "Design and control of an indoor micro quadrotor." Proc., IEEE Int. Conf. on Robotics and Automation, ICRA.
Brown, M., and Lowe, D. G. (2002). "Invariant features from interest point groups." Proc., British Machine Vision Conf., BMVA.
Celik, K., Chung, S.-J., and Somani, A. K. (2008). "MVCSLAM: Mono-vision corner SLAM for autonomous micro-helicopters in GPS denied environments." AIAA Guidance, Navigation and Control Conf. and Exhibit, AIAA, Reston, VA.
Fowers, S. G., Tippetts, B. J., Lee, D. J., and Archibald, J. K. (2008). "Vision-guided autonomous quad-rotor helicopter flight stabilization and control." AUVSI Unmanned Systems, AUVSI, San Diego.
Hoffmann, G. M., Huang, H., Waslander, S. L., and Tomlin, C. J. (2007). "Quadrotor helicopter flight dynamics and control: Theory and experiment." Proc., AIAA Guidance, Navigation and Control Conf. and Exhibit, AIAA.
Hurzeler, C., et al. (2008). "Teleoperation assistance for an indoor quadrotor helicopter." Proc., Int. Conf. on Simulation, Modeling, and Programming for Autonomous Robots.
Johnson, N. G. (2008). "Vision-assisted control of a hovering air vehicle in an indoor setting." M.S. thesis, Dept. of Mechanical Engineering, Brigham Young Univ.
Kang, J., Jeon, J., Lee, Y., and Jeong, T. T. (2009). "Vision-based computing approach and modeling for wireless network." IETE Tech. Rev., 26(6), 394-401.
Kemp, C. (2006). "Visual control of a miniature quad-rotor helicopter." Ph.D. thesis, Dept. of Engineering, Univ. of Cambridge.
Koenderink, J. J. (1984). "The structure of images." Biol. Cybern., 50(5), 363-370.
Lindeberg, T. (1994). "Scale-space theory: A basic tool for analyzing structures at different scales." J. Appl. Stat., 21(1), 225-270.
Lowe, D. G. (2004). "Distinctive image features from scale-invariant keypoints." Int. J. Comput. Vis., 60(2), 91-110.
Pounds, P., Mahony, R., and Corke, P. (2006). "Modeling and control of a quad-rotor robot." Proc., Australasian Conf. on Robotics and Automation, Auckland, New Zealand.
Pradana, W. A., Joelianto, E., Budiyono, A., and Adiprawita, W. (2009). "Robust MIMO H-infinity integral-backstepping PID controller for hovering control of unmanned model helicopter." Proc., Int. Symp. on Intelligent Unmanned System, Jeju, Korea.
Valavanis, K. P. (2007). Advances in Unmanned Aerial Vehicles, Springer, 171-210.