Design of a Robot Guide/Usher 
Project 10 
Drexel University | 3141 Chestnut Street, Philadelphia, PA 19104 
Adviser 
M. Ani Hsieh 
Design Team 
Tyler Aaron 
John Burchmore 
Ian DeOrio 
Jeff Gao 
Drexel University 
Mechanical Engineering and Mechanics Senior Design 
MEM 493 
Spring 14
Abstract 
Several Mechanical Engineering and Mechanics (MEM) research labs are located within the Science 
Center, but access is restricted only to ID card holders. As a direct result, visitors and a few lab members 
require the assistance of an ID card holder to gain access to the building on every visit. Employing a 
dedicated human usher is possible, but it would be an inefficient use of staff and the work itself would be 
menial. Autonomous robots with the ability to identify and guide human visitors would solve this 
issue. Designed as a case study with broader implications, this project will develop a robotic guide/usher 
to perform this task.
Table of Contents 
Abstract ......................................................................................................................................................... 2 
List of Figures ................................................................................................................................................ 6 
List of Tables ................................................................................................................................................. 8 
Introduction .................................................................................................................................................. 9 
Stakeholders and Needs ............................................................................................................................. 10 
Problem Statement ..................................................................................................................................... 11 
Methodology ............................................................................................................................................... 12 
Background ............................................................................................................................................. 12 
Perception Subsystem: Asus Xtion Pro ................................................................................................... 13 
Mobility Subsystem: iRobot Create ........................................................................................................ 15 
Manipulation Subsystem: Keycard ......................................................................................................... 16 
Control Architecture Subsystem............................................................................................................. 17 
Mapping ............................................................................................................................................. 17 
Background .................................................................................................................................... 17 
Process ........................................................................................................................................... 18 
Localization ......................................................................................................................................... 22 
Theory ............................................................................................................................................ 22 
Background ................................................................................................................................ 22 
Edge/Corner Detection .............................................................................................................. 23 
Blob Detection ........................................................................................................................... 24 
Translational (XYZ) ......................................................................................................................... 25 
Background ................................................................................................................................ 25 
Object Detection ....................................................................................................................... 25 
Implementation ......................................................................................................................... 27 
Bearing ........................................................................................................................................... 29 
Overview .................................................................................................................................... 29 
Flat Wall Assumption ................................................................................................................. 29 
Template Method ...................................................................................................................... 30 
Template Method - Processing ................................................................................................. 31 
Obstacle Detection ............................................................................................................................. 36 
Process ........................................................................................................................................... 36 
Obstacle Avoidance ............................................................................................................................ 39 
Theory ............................................................................................................................................ 39 
Execution........................................................................................................................................ 40 
Elevator Navigation ............................................................................................................................ 42 
Process ........................................................................................................................................... 42 
Exit Elevator ................................................................................................................................... 45 
Subject Identification ......................................................................................................................... 46 
Process Explanation ....................................................................................................................... 46 
QR Code System ............................................................................................................................. 47 
Code Decryption ............................................................................................................................ 48 
Design Overview ..................................................................................................................................... 49 
Codes and Standards .......................................................................................................................... 49 
Model Concept ................................................................................................................................... 50 
Design............................................................................................................................................. 50 
Physics ............................................................................................................................................ 52 
Parts List ......................................................................................................................................... 54 
Simulation / Experimental Validation Plan ................................................................................................. 55 
Description of the Test Bed .................................................................................................................... 55 
Bearing Localization ........................................................................................................................... 55 
Translational XYZ Localization ............................................................................................................ 56 
Subject Identification ......................................................................................................................... 56 
Elevator .............................................................................................................................................. 57 
Object Avoidance/Detection .............................................................................................................. 57 
Validation Plan ........................................................................................................................................ 57 
Bearing ............................................................................................................................................... 57 
Translational XYZ ................................................................................................................................ 58 
Subject Identification ......................................................................................................................... 59 
Elevator .............................................................................................................................................. 60 
Object Avoidance/Detection .............................................................................................................. 60 
Validation Data ....................................................................................................................................... 60 
Rotational Localization ....................................................................................................................... 60 
Translational XYZ ................................................................................................................................ 61 
Guest Recognition .............................................................................................................................. 62 
Elevator .............................................................................................................................................. 63 
Obstacle Avoidance ............................................................................................................................ 63 
Context and Impact ..................................................................................................................................... 65 
Economic Analysis .................................................................................................................................. 65 
Environmental Analysis .......................................................................................................................... 66 
Social Impact Analysis ............................................................................................................................. 66 
Ethical Analysis ....................................................................................................................................... 67 
Discussion/Future Work ............................................................................................................................. 68 
Conclusions ............................................................................................................................................. 68 
Future Work............................................................................................................................................ 69 
Hardware Upgrades ........................................................................................................................... 69 
Facial Recognition .............................................................................................................................. 69 
Speech Recognition ............................................................................................................................ 71 
Speaker Recognition ........................................................................................................................... 72 
Elevator .............................................................................................................................................. 73 
Service Panel .................................................................................................................................. 73 
Elevator Door Detection ................................................................................................................ 73 
Elevator Speech Recognition ..................................................................................................... 73 
Template Matching ................................................................................................................... 73 
Project Management .................................................................................................................................. 75 
SVN Repository ....................................................................................................................................... 75 
References .................................................................................................................................................. 76 
Appendices .................................................................................................................................................. 80 
Appendix A ............................................................................................................................................. 80 
Appendix B .............................................................................................................................................. 92 
Appendix C .............................................................................................................................................. 93 
Appendix – QRCodeDetection.m .......................................................................................................... 101 
Appendix – CreateTemplate.m ............................................................................................................. 103 
Appendix – RunComparison.m ............................................................................................................. 104 
Appendix – XYZTemp_Angle.m ............................................................................................................ 106 
Appendix – Xlocalization_Angle.m ....................................................................................................... 108 
Appendix – ObstacleDetection.m ......................................................................................................... 110 
Appendix – PotFieldDrive.m ................................................................................................................. 114 
Appendix – ButtonDetection.m ............................................................................................................ 117 
Appendix – ExitElevator.m ................................................................................................................... 120 
Appendix – Demo.m ............................................................................................................................. 121 
Appendix – QRLocal.m .......................................................................................................................... 125 
Appendix – Rlocal.m ............................................................................................................................. 127 
Appendix – Init.m ................................................................................................................................. 130 
Appendix – DepthTemp.m .................................................................................................................... 131 
Appendix – parsepoints.m .................................................................................................................... 132 
List of Figures 
Figure 1 - OpenNI SDK Architecture [3] ...................................................................................................... 13 
Figure 2 - Occupancy Grid with obstacles [8] ............................................................................................. 18 
Figure 3 - Occupancy grid mapping code .................................................................................................... 19 
Figure 4 - Sample hallway map ................................................................................................................... 19 
Figure 5 - SetLoc code ................................................................................................................................. 20 
Figure 6 - SetGoal code ............................................................................................................................... 20 
Figure 7 - AdjustLoc code ............................................................................................................................ 21 
Figure 8 - FindDel code ............................................................................................................................... 21 
Figure 9 – Microsoft Kinect Subcomponents .............................................................................................. 22 
Figure 10 – ASUS Xtion Pro Specifications under OpenNI .......................................................................... 22 
Figure 11 - Harris Feature Extraction .......................................................................................................... 23 
Figure 12 - SURF Feature Extraction ........................................................................................................... 24 
Figure 13 - Sample Code ............................................................................................................................. 25 
Figure 14 - Matched Points (Including Outliers) ......................................................................................... 26 
Figure 15 - Matched Points (Excluding Outliers)......................................................................................... 26 
Figure 16. Localization "checkpoints" ......................................................................................................... 27 
Figure 17 - Overlaid images of a template versus the current image ........................................................ 28 
Figure 18 - Flat wall assumption ................................................................................................................. 29 
Figure 19 - Unknown pose .......................................................................................................................... 30 
Figure 20 - Example of a template captured by the Xtion Pro.................................................................... 31 
Figure 21 - Code application of the functions ............................................................................................. 32 
Figure 22 - Example of Harris features extracted from an image ............................................................... 32 
Figure 23 - Code application of the functions ............................................................................................. 33 
Figure 24 - Result of identifying the Harris features and finding matching features in the images ........... 34 
Figure 25 - Visual representation of the relative pixel displacements used to find the function .............. 34 
Figure 26 - Mapped Depth Map of a Hallway ............................................................................................. 36 
Figure 27 - Respective Image Capture of Depth Capture ........................................................................... 36 
Figure 28 - Ground Plane Extraction ........................................................................................................... 37 
Figure 29 - Obstacle Extraction ................................................................................................................... 38 
Figure 30 - Colored Map with Obstacles ..................................................................................................... 38 
Figure 31 - Example of valid and invalid path from grid pathing [16] ........................................................ 39 
Figure 32 - Vector field analysis and path of movement from Potential Field method ............................. 40 
Figure 33 - Image Acquisition ..................................................................................................................... 42 
Figure 34 - Elevator Panel Template ........................................................................................................... 43 
Figure 35 - Template Matching with the Captured Image .......................................................................... 43
Figure 36 - Cropping Matrix ........................................................................................................................ 44 
Figure 37 - First Floor Crop ......................................................................................................................... 44 
Figure 38 - Black and White Conversion ..................................................................................................... 44 
Figure 39 - NNZ Function ............................................................................................................................ 44 
Figure 40 - Standard QR Code ..................................................................................................................... 46 
Figure 41 - (a) 10 character code vs (b) 100 character code ...................................................................... 47 
Figure 42 – a) 3D Creo Model of Robot System b) Final Proof of Concept Design ................................... 50 
Figure 43 – a) Creo Model of Housing b) Physical Model of Housing ...................................................... 51 
Figure 44 – a) Close up View of Web Camera b) Close up View of Xtion Pro and platform ..................... 52 
Figure 45 – a) Creo Model Close up of housing spacers b) Physical Model Close up of Housing Spacers 
.................................................................................................................................................................... 52 
Figure 46 - iRobot Create Coordinate System ............................................................................................ 53 
Figure 47 - Printed Degree Chart for Validation ......................................................................................... 58 
Figure 48 - Robot Localization Testing ........................................................................................................ 59 
Figure 49 - Object Avoidance Example ....................................................................................................... 60 
Figure 50 - Outline of facial recognition system [24] .................................................................................. 70 
Figure 51 - Rotating Elevator Bank Views ................................................................................................... 74 
Figure 52 - Flow Chart – Elevator to Lobby and Back ................................................................................. 88 
Figure 53 - Flow Chart - Elevator to Room .................................................................................................. 89 
Figure 54 - Fall Term Gantt Chart ................................................................................................................ 90 
Figure 55 – Winter Term Gantt Chart ......................................................................................................... 91 
Figure 56 - Detail Model Drawing ............................................................................................................... 92 
List of Tables 
Table 1 - Project Needs ............................................................................................................................... 10 
Table 2 - Database Example ........................................................................................................................ 47 
Table 3 - Mass Properties of Model Assembly ............................................................................................ 53 
Table 4 - Final Parts List .............................................................................................................................. 54 
Table 5 - Overall Rotational Localization Data ............................................................................................ 61 
Table 6 - Overall Z Depth localization Data ................................................................................................. 61 
Table 7 - Overall X Displacement Localization Data .................................................................................... 62 
Table 8 - Overall Guest Recognition Data ................................................................................................... 63 
Table 9 - Overall Object Avoidance Data .................................................................................................... 63 
Table 10 - Parts List for Assembly ............................................................................................................... 80 
Table 11 - Budget Table .............................................................................................................................. 81 
Table 12 - Decision Matrix .......................................................................................................................... 82 
Table 13 - Project Needs (1 of 2) ................................................................................................................. 82 
Table 14 - Project Needs (2 of 2) ................................................................................................................. 83 
Table 15 - Specifications and Metrics (1 of 2) ............................................................................................. 85 
Table 16 - Specifications and Metrics (2 of 2) ............................................................................................. 86 
Table 17 - Event Breakdown ....................................................................................................................... 86 
Table 18 - Complete Z Depth Localization Trial Data .................................................................................. 93 
Table 19 - Complete X Displacement Trial Data (1/2)................................................................................. 94 
Table 20 - Complete X Displacement Trial Data (2/2) .................................................................................. 94 
Table 21 - Complete X Displacement Trial Data (2/2)................................................................................. 95 
Table 22 - Complete Rotational Localization Trial Data (1/4) ..................................................................... 96 
Table 23 - Complete Rotational Localization Trial Data (2/4) ..................................................................... 97 
Table 24 - Complete Rotational Localization Trial Data (3/4) ..................................................................... 98 
Table 25 - Complete Rotational Localization Trial Data (4/4) ..................................................................... 99 
Table 26 - Complete Obstacle Avoidance Trial Data ................................................................................... 99 
Table 27 - Complete QR Code Reader Trial Data ...................................................................................... 100
Introduction 
Many movies about the future feature robots living, working, and interacting with humans on a daily 
basis. The robots portrayed often reach a level of near-complete autonomy, as seen in the movie "I, 
Robot". Autonomous mobile robots are developed to 
perform tasks with little to no help from humans. This means that a robot will have to be able to 
complete tasks given to it without becoming stuck either physically or in a programming loop. The use of 
these robots has the potential to simplify and improve the quality of life for everyone on the planet. 
These types of robots are not just a far off dream. Current work in robotics is progressing to the point 
that robots play active roles in the current workforce. Robots are being used for manufacturing, 
healthcare, and service industries with plans for application in military and space programs [1]. Although 
the robots of today aren’t what one would expect when the term “robot” is used, the tasks performed 
are similar. 
There have been a growing number of instances where robots have been used to interact with humans 
over the years. The Smithsonian Museum of American History installed MINERVA mobile robots that 
acted as tour guides for certain exhibits. MINERVA was designed to interact with groups of guests and 
provide information on the current exhibits [2]. California hospitals are purchasing RP-VITA telepresence 
robots so that doctors can check on patients from home. The doctor connects to the robot's monitor 
and can either let it drive to rooms autonomously or take manual control to get a closer view. 
Amazon's shipping warehouses utilize Kiva robots to transport shelves and boxes at the request of a 
human worker, who gives prompts from a command terminal while the robots perform the tasks. All 
of these robots demonstrate the continuing development of the robotics field. 
A shortcoming of some robots currently on the market is the minimal human interaction they support 
while carrying out their tasks. The Kiva robots work very well in factory settings; however, human 
interaction is limited to a command terminal. The telepresence robots are useful for long-distance 
communication between doctors and patients, yet many patients report that it took time to get used 
to their doctor being a screen in front of them. Additionally, a nurse or assistant practitioner has to 
follow these robots in case extra interaction with the patient is needed. The MINERVA tour guides 
showed that robots could adapt and display a general understanding of simple emotion in a group. 
That robot worked well, but its outward design was functional rather than aesthetically pleasing. 
Now that the technology allowing robots to perform simple tasks is growing, development on the 
human interaction side is needed to make them more compatible with everyday society. Responding to 
voice commands instead of computer prompts could be the precursor to simple speech. The design of 
an autonomous robotic guide in a building can clearly exemplify the steps necessary for successful 
human-robot interaction.
Stakeholders and Needs 
Needs are created from stakeholder input and are used to guide the project in the correct direction. The 
current stakeholders include Science Center MEM Lab members, visitors, and building management; 
however, as the project progresses, more stakeholders may become involved. Table 1 below shows four 
of the overarching needs relevant to the project. 
Table 1 - Project Needs 
Design Category Need 
S1 Safety Potential for injuries during normal operation mitigated 
Op1 Operation Minimize need for human input 
P1 Programming Robust guest detection 
P2 Programming Modular software components 
The needs listed above are all important because they are key driving factors that shape the design. 
Safety is of the utmost concern whenever humans interact with robots, which is addressed by S1. The 
goal of the project is to create an autonomous system, thereby minimizing the need for superfluous 
human input, which is represented by Op1. Guest verification (P1) is a need expressed directly by the 
stakeholders. P2 is a need expressed by team members to introduce modularity and increase operating 
efficiency. A full list of needs can be found in Table 13 and Table 14 in Appendix A.
Problem Statement 
In certain secure buildings, workers must interrupt their work to retrieve and sign in arriving guests. 
This costs time and effort that the workers could be putting toward finishing their projects. In addition, 
many secure buildings now have multiple security checkpoints that further slow the 
process of picking up guests. If this task could be given to a robot instead, company workers could 
continue working until the guests are delivered to the designated area. 
The core mission of this senior design project is to design and build a proof of concept for a fully 
autonomous robotic guide/usher. The robot will have to be able to work its way through hallways, 
elevators, and lobbies while avoiding stationary obstacles and variably moving objects. Elevator controls 
will have to be operable through some means. The robot will also have to be able to recognize the 
appropriate guests, who will then be delivered to the specified destination. There could be two or three different 
groups of guests waiting to be picked up for different departments in said building. Once delivery of 
guests occurs, the robot should move to a designated wait location for its next command.
Methodology 
Background 
The robotic guide/usher in question will be an autonomous robot composed of perception, mobility, 
manipulation, and control architecture systems. The perception subsystem is used to distinguish and 
extract information from within the robot’s “visual” range. The mobility subsystem controls the robot’s 
locomotion in terms of heading and velocity. The manipulation subsystem in this particular case consists 
of the keycard system. The control architecture subsystem would be used as the central management 
and control console with regard to mobility and guest verification. These four subsystems in 
combination with each other form the structure that the autonomous robot relies on to complete the 
tasks assigned to it.
Perception Subsystem: Asus Xtion Pro 
Modern day machine vision systems employ multiple vision sensors working concurrently to capture 
depth, image, and audio for post-processing and general interpretation. The most basic suite of sensors 
utilized to enable vision for autonomous systems typically includes a depth camera and an RGB 
(Red/Green/Blue) camera. Depth cameras come mainly in two different variants. Time-of-flight (ToF) 
depth cameras are based on utilizing the phase difference between the emitted and reflected IR signal 
in order to calculate the depth of a targeted object. Structured pattern depth cameras emit a structured 
IR pattern onto target objects and then utilize triangulation algorithms to calculate depth. Standard RGB 
cameras are used to provide color vision of the environment. Industrial grade depth cameras are often 
very expensive and cost anywhere from $3000 to $9000. RGB CMOS color cameras are an established 
and mature technology and therefore inexpensive. Integrating the Xtion Pro sensor into the system 
provides a cheap and established solution as a complete vision package. The Asus Xtion Pro is a motion-
sensing input device for use with Windows PCs and is built from the same hardware as the Microsoft 
Kinect. From a developmental standpoint, utilizing the Asus Xtion Pro as the system’s vision suite is 
extremely cost efficient due to the fact that the Asus Xtion Pro contains an RGB camera, depth sensor, 
and a multi-array microphone all within one package. 
Integrating the Asus Xtion Pro is a complex, multilayered task that requires a number of techniques 
and modern engineering tools. MATLAB is the desired Integrated 
Development Environment (IDE) because of staff and team familiarity. For integration of the Xtion Pro 
vision system, several software components need to be put in place to facilitate accurate data capture. 
Figure 1 - OpenNI SDK Architecture [3] 
In Figure 1 above, the Xtion Pro sensor is represented by the “Depth Physical Acquisition”. This step of 
the Software Development Kit (SDK) architecture constitutes the physical sensor’s ability to capture raw 
data. Next, the PrimeSense SoC operates the underlying hardware by performing functions such as 
dedicated depth-acquisition calculations, matching of the depth and RGB images, downsampling, and various 
other operations. Then, the OpenNI framework takes over as the most popular open source SDK for use 
in the development of 3D sensing middleware libraries and applications. From there, a C++ wrapper was 
found which would allow the use of MATLAB as the primary IDE. 
The proper method for creating a data processing system for the Xtion Pro involves numerous steps. A 
Kinect MATLAB C++ wrapper was developed by Dirk-Jan Kroon of Focal Machine Vision en Optical 
Systems and released on January 31, 2011 [4]. This particular MATLAB C++ wrapper is utilized alongside OpenNI 
2.2.0, NITE 2.2.0, Microsoft Kinect SDK v1.7, and Microsoft Visual C++ compiler to create the functional 
data processing system. 
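As a point of reference, the wrapper's example scripts acquire synchronized RGB and depth frames along the following lines. This is a hedged sketch: the mxNi* function names and the permute steps follow the wrapper's published examples as we understand them, and may differ between wrapper and OpenNI releases. 

SAMPLE_XML_PATH = 'Config/SamplesConfig.xml';      % OpenNI configuration file shipped with the wrapper
handles = mxNiCreateContext(SAMPLE_XML_PATH);      % open the Xtion Pro streams

for k = 1:30                                       % grab and display 30 frames
    rgb = mxNiPhoto(handles);                      % raw RGB frame
    rgb = permute(rgb, [3 2 1]);                   % reorder into MATLAB's H x W x 3 layout
    depth = mxNiDepth(handles);                    % depth frame, values in millimeters
    depth = permute(depth, [2 1]);

    imshow(rgb); drawnow;                          % quick visual check
    mxNiUpdateContext(handles);                    % advance to the next frame
end

mxNiDeleteContext(handles);                        % release the sensor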
Mobility Subsystem: iRobot Create 
An important part of a mobile autonomous robot is the set of components that allow for locomotion. 
This motion system changes the robot's velocity and trajectory via closed-loop control. Two types of 
motion system were considered: limb-based locomotion and wheel-based locomotion, the two most 
common approaches. 
The iRobot Corporation, the makers of the iRobot Roomba, created the iRobot Create Programmable 
Robot as a mobile robot platform for educators, students, and developers as a part of their contribution 
and commitment to the Science, Technology, Engineering, and Math (STEM) education program. 
Utilizing the iRobot Create as the motion system is the most straightforward, cost-efficient solution. The 
iRobot Create features three wheels, a designated cargo bay, 6-32 mounting cavities, an omnidirectional 
IR receiver, sensors, a rechargeable 3000mAh battery, and a serial port for communication. By using this 
platform, the overall focus of the project can be turned toward developing new functionalities between 
the iRobot Create and the Asus Xtion Pro without having to worry about mechanical robustness or low-level control. 
The iRobot Create uses an Open Interface (OI) comprised of an electronic interface as well as a software 
interface for programmatically controlling the Create’s behavior and data collection capabilities. The 
Create communicates at 57600 baud via a numeric command system [5]. For example, the command 
code that instructs the Create to drive in reverse at a velocity of -200 mm/s while turning at a radius of 
500 mm is [137] [255] [56] [1] [244]. It becomes clear that this command code structure is unintuitive 
and unwieldy. In order to resolve and simplify this issue, MATLAB can be used to better facilitate 
communication between a laptop and the iRobot Create via an RS-232-to-USB converter. MATLAB is the 
desired Integrated Development Environment (IDE) due to staff and team familiarity. Therefore, 
MATLAB will be used to issue commands to alter the robot’s heading and velocity. The MATLAB Toolbox 
for the iRobot Create (MTIC) replaces the native low-level numerical drive commands embedded within 
the iRobot Create with high level MATLAB command functions that act as a “wrapper” between MATLAB 
and the Create [6].
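For comparison, the same reverse-drive maneuver can be issued through MTIC in a single call. The sketch below is illustrative: RoombaInit and SetFwdVelRadiusRoomba follow the MTIC documentation, but the exact function names, argument units, and COM-port handling should be verified against the installed toolbox version. 

serPort = RoombaInit(3);                      % open the serial link (COM port number is an assumption)

SetFwdVelRadiusRoomba(serPort, -0.2, 0.5);    % reverse at 0.2 m/s on a 0.5 m turning radius
pause(2);                                     % let the maneuver run for two seconds
SetFwdVelRadiusRoomba(serPort, 0, inf);       % stop; an infinite radius commands straight-line motion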
Manipulation Subsystem: Keycard 
The Science Center building is a secure facility. Therefore, in order to operate the elevators, a keycard 
must be swiped over a reader to grant access. A human being will always be accompanying the robot 
when it enters the elevator, eliminating the need for a complex swiping mechanism. Instead, the robot 
will utilize a heavy-duty retractable key reel system. When entering the elevator, the robot will prompt 
the human, via an audio cue, to swipe the card and press the corresponding elevator button. The human 
accompanying the robot will be able to pull the card to the proper location to swipe, then release the 
card, which will retract back into its starting location. In order to prevent possible theft, the reel will use 
heavy-duty Kevlar wire fixed to the robot. This design will be the fastest method of swiping and will prevent 
delays for any other people in the elevator. 
Control Architecture Subsystem 
Mapping 
Background 
Mapping is a fundamental issue for a mobile robot moving around in space. A map is relied on for 
localization, path planning, and obstacle avoidance. A robot needs a map when performing localization 
in order to give itself a location on that map. Otherwise, the robot could be at any location in an infinite 
space. According to DeSouza and Kak’s article on vision-based robot navigation systems, three broad 
groups in which indoor vision-based robot navigation systems can be categorized into are map-based 
navigation, map-building-based navigation, and map-less navigation. Map-based navigation systems 
rely on predefined geometric or topological models of the environment that the navigation system can 
draw on. Map-building navigation systems use onboard sensors to actively generate a model of the 
environment for use during navigation. Map-less navigation systems are typically 
more sophisticated systems based on recognizing objects found in the environment in order to orient 
and continue along their path. 
Several challenges arise when mapping. To start, the space of all possible maps is enormous. Even 
when using a simple grid, there can be a multitude of different variables, which can lead to very large 
and complex maps. Noise in the robot's sensors also poses a problem: as a robot navigates, errors 
accumulate, which can make it difficult to obtain an accurate position. Also, according to the 
Probabilistic Robotics textbook [7], when different places look very similar, for instance in an office 
hallway, it can be very difficult to differentiate between places that have already been traveled at 
different points in time, otherwise known as perceptual ambiguity. 
Challenges aside, depending on the type of navigation system chosen, the complexity and sophistication 
of sensors needed varies. 3-dimensional mapping can be more complicated and is not always needed 
depending on the application. Instead a 2-dimensional map or occupancy grid can be created. “The basic 
idea of the occupancy grids is to represent the map as a field of random variables, arranged in an evenly 
spaced grid [7]“. The random variables are binary and show if each specific location is occupied or 
empty. 
The main equation that makes up occupancy grid mapping calculates the posterior over maps with 
available data, as seen in Equation 1. 
p(m \mid z_{1:t}, x_{1:t}) 
Equation 1
Here, m is the overall map, z_{1:t} is the set of all measurements taken up to time t, and x_{1:t} is the set of all 
of the robot's poses up to time t. The set of all poses makes up the actual path the robot takes. 
The occupancy grid map, m, is broken up into a finite number of grid cells, m_i, as seen in Equation 2. 
m = \sum_{i} m_i 
Equation 2 
Each grid cell has a probability of occupation assigned to it: p(m_i = 1) or p(m_i = 0). A value of 1 corresponds 
to occupied, and 0 corresponds to free space. When the robot knows its pose and location on the 
occupancy grid, it is able to navigate due to the fact that it knows where obstacles are since they are 
marked as a 1. Figure 2 shows a basic occupancy grid with a robot located at the red point, the white 
points are empty space, and the black spaces are walls or obstacles. 
Figure 2 - Occupancy Grid with obstacles [8] 
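The full posterior in Equation 1 is not maintained directly; the standard occupancy grid algorithm described in [7] factors it over the individual cells of Equation 2, so that each cell's occupancy probability is estimated independently: 

p(m \mid z_{1:t}, x_{1:t}) \approx \prod_{i} p(m_i \mid z_{1:t}, x_{1:t}) 

This per-cell probability is what the grid actually stores and updates as new measurements arrive. 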
The obstacles on the occupancy grid are treated as static: they are assumed to be stationary for the 
particular iteration in which they are used. Dynamic obstacles can be folded into this classification; 
however, the robustness of the algorithm that calculates the optimal path around them needs further 
investigation. 
Process 
In order to implement the 2D occupancy grid in MATLAB, a simple 2D array is needed. As an example, a 
simple straight hallway can be mapped. The empty space is represented by a uint8 array of 0's, while 
the walls are represented by 1's. The map, m, comprises both the space and the 
walls. Example code for a simple map can be seen in Figure 3. 
Figure 3 - Occupancy grid mapping code 
In this basic code, an empty hallway is mapped using the 2D occupancy grid approach described 
previously. This hallway is about 5.5 feet (170 cm) by about 10 feet (310 cm). Here the walls are 15 cm 
thick, a value chosen simply to give them some thickness. The uint8 variable type is used to 
minimize memory usage while running the code. This is important since a much larger occupancy grid 
will have many more data points. A basic example of what the occupancy grid of zeros and ones would 
look like for a straight hallway can be seen in Figure 4. 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 
Figure 4 - Sample hallway map 
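Figure 3 reproduces the team's actual mapping code as an image; a minimal sketch consistent with the description above, assuming a resolution of one grid cell per centimeter, would be: 

% Occupancy grid for a straight hallway, one uint8 cell per centimeter (assumed resolution).
hallWidth  = 170;                                 % hallway width in cm (~5.5 ft)
hallLength = 310;                                 % hallway length in cm (~10 ft)
wallThick  = 15;                                  % wall thickness in cm

m = zeros(hallLength, hallWidth, 'uint8');        % 0 = free space
m(:, 1:wallThick)         = 1;                    % left wall (1 = occupied)
m(:, end-wallThick+1:end) = 1;                    % right wall

imagesc(m); axis equal tight;                     % quick visual check of the grid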
Now that the basic example of the occupancy grid is defined, further steps can be taken to convert a 
fully 3D space into a functional 2D occupancy grid. This occupancy grid can be used to keep track of 
goals, positions, and obstacles. In order to clean up the syntax for map operations, several functions 
must be defined. 
The first function is setLoc.m. The code for this function is shown in Figure 5. 
Figure 5 - SetLoc code 
The SetLoc function accepts three inputs and outputs the modified m variable. The first input is the 
initial map variable, the second input is the value of the y location in centimeters, and finally the last 
input is the value of the x location in centimeters. Next, the function locations the last known x and y 
position of the robot and resets this location to 0. And finally, it uses the two input location variables to 
set the new position variable before updating the m output variable. This function allows for efficient 
and clean redefinition of the robot’s position on the map. 
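Figure 5 shows the team's implementation; a sketch of the described behavior follows, where the marker value 2 for the robot's cell is an assumption (the report defines 0 for free space, 1 for walls, and 3 for the goal). 

function m = setLoc(m, y, x)
% Reset the robot's previous cell and mark its new (y, x) position, in cm.
% The robot marker value of 2 is assumed for illustration.
ROBOT = 2;
[yOld, xOld] = find(m == ROBOT);                  % last known position, if any
m(yOld, xOld) = 0;                                % clear it back to free space
m(y, x) = ROBOT;                                  % mark the new position
end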
The next function is the setGoal.m function. The code for this particular function is shown below. 
Figure 6 - SetGoal code 
The SetGoal function accepts three inputs and outputs the modified m variable. The first input is the 
initial map variable, the second input is the value of the y goal location in centimeters, and finally the 
last input is the value of the x goal location in centimeters. Similar to the SetLoc function, the function’s 
next step is to locate the previous goal location and to reset this location’s occupancy grid value to 0. 
And finally, it redefines the new goal position’s occupancy grid value to 3. This designates the goal. 
The next function is the AdjustLoc function. The code for this particular function is shown below. 
Figure 7 - AdjustLoc code 
The AdjustLoc function again accepts three inputs and outputs the modified m variable. The first input is 
the initial map variable to be adjusted, the second input is the centimeter value of adjustment of the y 
position, and the third input is the centimeter value of adjustment of the x position. Similar to the 
SetLoc function, the function’s next step is to locate the previous goal location and then to reset the 
previous location’s occupancy grid value to 0. Finally, it calculates the new position by summing the old 
position with the adjustments specified in the inputs. This allows for adjustment of the position of the 
robot due to localization algorithms that will be defined later on. 
The last function for performing operations on the map is the FindDel.m function. The code for this 
particular function is shown below. 
Figure 8 - FindDel code 
The FindDel function simply calculates the distance in Y and X between the goal and the current position 
of the robot. It then converts this distance from centimeters to millimeters so that the input to the 
Roomba is simplified. 
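Figure 8 shows the actual code; a sketch of the described behavior, using the same assumed marker values as the setLoc sketch above (2 for the robot, 3 for the goal), is: 

function [delY, delX] = findDel(m)
% Distance from the robot's cell to the goal cell, converted from cm to mm
% so it can be passed directly to the Create's drive commands.
[yR, xR] = find(m == 2);                          % current robot position
[yG, xG] = find(m == 3);                          % goal position
delY = (yG - yR) * 10;                            % cm -> mm
delX = (xG - xR) * 10;
end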
Localization 
Theory 
Background 
“Mobile robot localization is the problem of determining the pose of a robot relative to a given map of 
the environment [9]”. Otherwise known as position tracking or position estimation, the problem of 
determining the position and pose of a robot is a fundamental problem that needs to be solved before 
any level of autonomy can be claimed. When given a map of the surrounding environment, the robot 
must be able to traverse the environment to the desired goal. However, without accurate and timely 
position data, the robot can quickly accumulate error leading to undesirable actions. 
However, there are many steps to complete before successful self-localization can be accomplished. 
Because the Asus Xtion Pro has been selected as the platform for collecting data about the surrounding 
environment, its parameters and specifications immediately constrain how the problem can be 
approached. 
Figure 9 – Microsoft Kinect Subcomponents 
Figure 10 – ASUS Xtion Pro Specifications under OpenNI
The IR depth sensor, the IR emitter, and the color sensor, together with their corresponding 
components, all provide data that can be put to use. It is important to mention that both 
position and pose need to be approximated using the information derived from the sensors. As such, the 
next two sections will detail the various techniques used to calculate this information with respect to 
the global coordinate system. However, before position and pose can be solved for, the basic algorithms 
that facilitate this capability need to be detailed first. It is important to note that the ASUS Xtion Pro is 
the developer’s version of the Microsoft Kinect. It is a compact version with the exact same hardware. 
Edge/Corner Detection 
One of the feature extraction methods utilized is the Harris & Stephens corner detection algorithm. 
This algorithm is an improvement on Moravec's corner detection algorithm, introduced in 1977. 
Moravec's algorithm operates by detecting the intensity contrast from 
pixel to pixel within a black and white image. By taking the sum of squared differences (SSD) of the 
intensity from pixel to pixel, patches with a large SSD denote regions of interest whereas patches with a 
small SSD denote regions of minimal interest. Harris & Stephens take this concept a step further by 
taking the differential of the intensity score with respect to direction into account. The general form of 
this weighted SSD approach can be found in Equation 3, where an image region I over the area (u, v) is 
shifted by (x, y) and the weighted SSD is denoted by S(x, y). 

S(x, y) = \sum_{u} \sum_{v} w(u, v) \left( I(u + x, v + y) - I(u, v) \right)^2 

Equation 3 
Harris & Stephens method’s implementation is built into MATLAB’s Computer Vision Toolbox and the 
results are shown in Figure 11. 
Figure 11 - Harris Feature Extraction
Blob Detection 
Blob detection methods are algorithms that are used in computer vision systems to detect and 
subsequently describe local features in images. These detection algorithms are very complex ways of 
detecting points of interest. These points of interest are then filtered for stability and reliability based on 
their various properties. A robust local feature detector named SURF was first presented by Herbert Bay 
in 2006. SURF (Speeded-Up Robust Features) is a feature extraction algorithm that was inspired by SIFT 
(Scale-Invariant Feature Transform). Although SURF is the algorithm that was ultimately utilized, it is 
important to describe the concept behind SIFT, which SURF builds upon to extract important features 
from an image. SIFT is an algorithm published by David Lowe. Lowe's method is based on the principle 
that distinct features can be found in high-contrast regions of an image, such as edges. However, 
beyond edge detection, the SIFT method also accounts for the relative positions of 
these detected points of interest. The SURF method’s implementation is built into MATLAB’s Computer 
Vision Toolbox and the results are shown in Figure 12. [10] [11] [12] [13] [9] 
Figure 12 - SURF Feature Extraction 
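Both detectors are exposed directly by the Computer Vision Toolbox; a short sketch (the image file name is a placeholder) produces overlays like those in Figures 11 and 12: 

I = rgb2gray(imread('hallway.png'));              % placeholder image file

harrisPts = detectHarrisFeatures(I);              % corner features (Harris & Stephens)
surfPts   = detectSURFFeatures(I);                % blob features (SURF)

imshow(I); hold on;
plot(harrisPts.selectStrongest(50));              % 50 strongest corners
plot(surfPts.selectStrongest(50), 'showOrientation', true);   % 50 strongest blobs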
Translational (XYZ) 
Background 
Self-localization is the act of determining the position and pose of a mobile body relative to a reference 
map. Self-localization is a very important component in almost every function of an autonomous robot. 
As previously mentioned, both the position and the pose of a mobile body are important pieces of 
information to track. For organizational clarity, this section details the methods by which the problem 
of self-localization with respect to position is solved. 
By utilizing the Asus Xtion Pro, measurements of the local environment can be made and used toward 
solving this problem. However, raw measurements without any distinctive properties are of little use 
when a translation or rotation relative to the map is performed. As such, effort must be taken to 
extract distinctive features of the surrounding environment. A multitude of computer vision 
systems rely on corner detection or feature extraction algorithms to detect various points of interest 
that can be associated with the environment rather than just an individual image. 
Object Detection 
Either of the aforementioned feature extraction methods can be used to quickly ascertain the similarity 
of each image based on their features. Once the strongest feature points from each image are gathered, 
they are compared with each other using the following code. 
Figure 13 - Sample Code 
The strongest feature points from each image are matched for similarities. The feature points with 
matching descriptors can then be plotted. An example of this can be seen in Figure 14. 
Figure 14 - Matched Points (Including Outliers) 
The next step is to eliminate any erroneous data points caused by misalignment or skewing of the 
object by fitting a geometric transformation that relates the matched points. The final result is shown below. 
Figure 15 - Matched Points (Excluding Outliers) 
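The matching and outlier-rejection steps shown in Figures 13 through 15 can be reproduced with standard Computer Vision Toolbox calls. The sketch below uses illustrative variable and file names rather than the report's actual code. 

templateGray = rgb2gray(imread('template.png'));  % placeholder file names
currentGray  = rgb2gray(imread('current.png'));

ptsT = detectSURFFeatures(templateGray);
ptsC = detectSURFFeatures(currentGray);
[featT, validT] = extractFeatures(templateGray, ptsT);
[featC, validC] = extractFeatures(currentGray,  ptsC);

pairs    = matchFeatures(featT, featC);           % putative matches (outliers included)
matchedT = validT(pairs(:, 1));
matchedC = validC(pairs(:, 2));

% Robustly fit an affine transform; the surviving inliers correspond to Figure 15.
[tform, inlierC, inlierT] = estimateGeometricTransform(matchedC, matchedT, 'affine');
showMatchedFeatures(templateGray, currentGray, inlierT, inlierC, 'montage');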
Once the erroneous data points have been eliminated, the pixel location of the object in the scene can 
be approximated and, finally, data can be taken. By using OpenNI, the exact pixel location can be used 
to calculate the x, y, and z coordinates of the object. The OpenNI framework utilizes two different 
coordinate systems to specify depth. The first coordinate system is called Depth. This coordinate system 
is the native coordinate system, in which the X and Y values represent the pixel location relative to the frame 
of the camera. The Z value in the depth coordinate system represents the depth between the camera 
plane and the object. Using a few functions, OpenNI can quickly translate the Depth 
coordinate system into the Real World coordinate system. Here, the x, y, and z values represent the more 
familiar 3D Cartesian coordinate system, where the camera lens is the origin and the x, y, and z values 
represent the distances along those dimensions from the object to the origin. With the knowledge that 
these XYZ real world coordinates are from the camera’s location, the relative location of the camera in 
accordance with the global map can be calculated. If the object’s placement is previously given in the 
global map; the raw difference in X and Z can be used to calculate the actual location of the camera 
compared with the approximate location. [10] [11] [13] 
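As a rough illustration only (not the OpenNI API itself), the conversion from a Depth pixel to Real World coordinates behaves approximately like the pinhole relationship sketched below; the pixel location, depth value, and the nominal 58-degree horizontal / 45-degree vertical fields of view are assumptions used purely for the example. 

% Rough illustration of the Depth-to-Real-World conversion for one pixel,
% assuming a 640x480 sensor with an approximate 58/45 degree FOV.
u = 320; v = 240;          % pixel location of the object (placeholder)
Z = 1500;                  % depth at that pixel, in millimeters (placeholder)
width = 640; height = 480;
fx = (width  / 2) / tand(58 / 2);   % focal length in pixels (horizontal)
fy = (height / 2) / tand(45 / 2);   % focal length in pixels (vertical)
X = (u - width  / 2) * Z / fx;      % left/right offset from the lens, mm
Y = (height / 2 - v) * Z / fy;      % up/down offset from the lens, mm
% (X, Y, Z) are the real-world coordinates of the object relative to the lens.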
Implementation 
This method is not infallible by any means; object matching against a template requires a fairly high 
resolution camera. While the Xtion Pro has a decent RGB camera, its effective range is only about one 
meter. Therefore, when selecting landmarks for localization, care must be taken to choose large 
landmarks with features that are unique to that specific landmark. For example, a door would prove 
inadequate because it typically has no distinctive features that distinguish it from any other door in the 
environment. 
Because of these difficulties, QR Codes, combined with the concept of localization "checkpoints," were 
used to perform positional localization with the desired degree of accuracy. 
Figure 16 - Localization "checkpoints" 
These QR Codes serve two purposes. First, using a unique QR Code at every checkpoint eliminates the 
difficulty of distinguishing checkpoints from one another. Second, the QR Code provides a distinct object 
from which measurements can be made. 
The localization subroutine used to obtain a fix on the relative position of the robot with respect to the 
map is listed in the Appendix. This subroutine enables the robot to obtain accurate x and y readings with 
respect to the map. Because the QR codes are strategically placed ninety degrees to the ideal path 
toward the goal, the x position of the robot can be calculated and corrected simply by measuring the 
depth to the wall that the QR code is posted on. If an x position of 80 cm is desired, the localization 
subroutine simply iterates until a depth of 80 cm is recorded. The 
next objective of the localization subroutine is to obtain the y position of the robot with respect to the 
map. Before navigation can be conducted, the system must have a template image associated with that 
specific QR code and localization checkpoint to use as a reference. 
Figure 17 - Overlaid template and current images 
The image above shows the template image being used to compare the current position of the robot 
with its ideal position. The red circles represent feature points detected by the SURF method; these are 
the feature points that will be tracked. The green plus marks are those same feature points at their new 
locations, so the translation between the two can be calculated directly. This allows the y position to be 
calculated by comparing the template image with the current image and leveraging the QR code as a 
point of reference. Templates are created using the code in the Appendix. 
Bearing 
Overview 
The localization subroutine used to determine the pose of the robot underwent a multitude of changes 
over the course of the project. Two primary methods of determining pose emerged from the 
development phase. The first utilizes template matching to recover the exact viewing angle at which the 
template was created. The second uses simple trigonometry together with the fact that each localization 
checkpoint directly faces a large flat wall. 
Flat Wall Assumption 
The simplest and most consistent method of determining pose assumes that every localization 
checkpoint is placed facing a section of wall with a wide, flat surface. 
Figure 18 - Flat wall assumption 
The green block represents the wall, the white space represents the open space, the blue grid 
represents the map grid, the red circle represents the robot, the arrow indicates our current 
heading/pose, and finally the dotted line represents the plane from which the Xtion Pro takes 
measurements. Figure 18 assumes that the heading is exactly facing the wall, so that the dotted 
measurement plane and the wall are exactly parallel. However, this assumption cannot be made without 
some initial calculations and measurements. 
Let the pose scenario for the robot be as presented in the figure below. The measurement plane is 
skewed by an unknown angle, so the pose of the robot, and therefore the correction angle, is unknown. 
The correction angle can, however, be easily solved for with some simple trigonometry. 
Figure 19 - Unknown pose 
Depth measurements are taken from two points on the measurement plane directly to the green wall. If 
the robot's measurement plane is truly parallel to the wall, both measurements will be equal. Otherwise, 
the difference between the two measurements can be used to calculate the correction angle needed to 
adjust the pose to the desired state. 
θ = tan⁻¹(ΔY / ΔX) 
Equation 4 - Correction Angle 
The code listed in the Appendix uses this exact methodology to calculate the correction angle when needed. 
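A minimal sketch of this flat-wall correction, assuming depthZ holds the 640 x 480 depth (Z) values in millimeters and that the sampled row and columns are placeholders, might look like the following: 

% Sample the depth to the wall at two points, one on either side of center.
row = 300; colLeft = 120; colRight = 520;          % placeholder locations
zLeft  = double(depthZ(row, colLeft));             % depth to the wall, left side
zRight = double(depthZ(row, colRight));            % depth to the wall, right side

% Approximate the real-world horizontal separation of the two samples
% from the pixel spacing and the nominal 58-degree horizontal FOV.
fx = 320 / tand(58 / 2);
dX = (colRight - colLeft) * mean([zLeft zRight]) / fx;

dY = zRight - zLeft;                               % depth difference between samples
theta = atand(dY / dX);                            % correction angle (Equation 4)
% A nonzero theta means the measurement plane is not parallel to the wall;
% the robot would turn by -theta to correct its pose.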
Template Method 
In order to properly localize itself, the system must have a template image to use as a reference. This 
reference image should be a photo taken from a position where the robot would be ideally lined up. 
Whether it is aimed at a specific landmark or just down a hallway, it needs to be an image of its 
destination orientation. To maximize the system's likelihood of success, an image with some visual 
activity should be used; activity here means items hung on walls or doors, rather than just a plain 
hallway. Figure 20 below shows a strong example of a template. 
Figure 20 - Example of a template captured by the Xtion Pro 
While technically an image from any camera could be used, provided it shoots at the same resolution, 
the RGB camera on the Xtion should ideally be used because the comparison images will be shot from it. 
To facilitate the creation of a template image, the script shown in Appendix – QRCodeDetection.m is 
used to initialize the camera, snap a picture, and then store the resulting image in the folder. This 
function allows templates to be created easily, which is beneficial for testing as well as for cases where 
the environment in which the robot operates changes. 
Template Method - Processing 
For the system to orient itself, it compares the image the camera currently sees with the precaptured 
template image. The script analyzes the two images, finds a correlation between them, and then rotates 
the system accordingly. If the script finds no relation between the two pictures, it rotates 58° (the field 
of view of the camera) to give itself an entirely new image to try. Once it has a relation between the 
two, the script repeats until the system is lined up with the template. 
Thanks to MATLAB's Computer Vision Toolbox, the script is fairly straightforward. The script relies 
heavily on the toolbox's built-in corner point detection. While the toolbox offers a variety of feature 
detection functions, including "detectBRISKFeatures", "detectFASTFeatures", and "detectSURFFeatures", 
the method chosen for this project was "detectHarrisFeatures", which utilizes the Harris-Stephens 
algorithm. This method was chosen because of its consistently strong performance in testing. 
The "detectHarrisFeatures()" function takes a 2D grayscale image as an input and outputs a 
cornerPoints object, 'I', which contains information about the feature points detected. Not all of the 
points found by the Harris-Stephens algorithm are usable, however, and further processing needs to be 
done to find viable points. The "extractFeatures()" function is then used; it takes the image as well as 
the detected cornerPoints as input and finds valid identification points. The function takes a point (from 
'I') and its location on the image and examines the pixels around the selected point. The function then 
outputs both a set of 'features' and 'validPoints'. The 'features' object consists of descriptors, or 
information that sets each point apart from other points. 'validPoints' is an object that holds the 
locations of those relevant points; it also removes points that are too close to the edge of the image, for 
which proper descriptors cannot be found [14]. Figure 21 below shows how the code is executed, while 
Figure 22 shows the results of using these functions to identify Harris features. 
Figure 21 - Code application of the functions 
Figure 22 - Example of Harris features extracted from an image 
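A minimal sketch of this detection and extraction step (the image file name is a placeholder): 

% Detect Harris-Stephens corner points and build descriptors around them.
I = rgb2gray(imread('template.png'));
cornerPts = detectHarrisFeatures(I);
[features, validPoints] = extractFeatures(I, cornerPts);

% Plot the strongest detected corners, as in Figure 22.
figure; imshow(I); hold on;
plot(cornerPts.selectStrongest(50));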
This process is run for both the template image and the comparison image, which is why it is important 
to use the same camera for both. Different cameras, even at the same resolution, can render color 
differently, making the feature matching process more difficult. With the two images processed, they 
must then be compared. Using the built-in function "matchFeatures()", the two sets of features can be 
compared to determine whether they represent the same feature viewed from a different angle. The 
output is an index of the point pairs most likely to correspond between the two input feature sets [15]. 
Plugging the locations of those specific points back into validPoints yields an object consisting solely of 
the coordinates of matched pairs. Figure 23 shows how the functions are applied in the code, while 
Figure 24 shows the result of comparing the Harris features in the two images. From the image, it is 
clear to see the translation of each point from template to comparison, as represented by the line 
connecting the two datasets. This translation is crucial to the script because it is the key to determining 
the amount the robot must rotate to properly line up. 
Figure 23 - Code application of the functions 
Figure 24 - Result of identifying the Harris features and finding matching features in the images 
The pixel difference between a matched pair of points is related to the angle the robot must rotate to 
line up. Because there are usually some discrepancies among the different pixel displacements, their 
average, 'd', is taken to get a better overall displacement. A multiplier is then created using the ratio of 
the average displacement to the horizontal resolution. That ratio is multiplied by the sensor's field of 
view to get the necessary turn angle, represented by θ in Equation 5. 
Figure 25 - Visual representation of the relative pixel displacements used to find the function 
θ = (d / xres) × FOV = (d / 640) × 58° 
Equation 5 
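A minimal sketch of converting the matched points into a turn angle, assuming matchedTemplate and matchedCurrent are the matched cornerPoints objects produced by the matching step (the names are placeholders): 

% Average horizontal pixel displacement between matched point pairs.
dx = matchedCurrent.Location(:, 1) - matchedTemplate.Location(:, 1);
d  = mean(dx);

% Equation 5: displacement ratio times the camera field of view.
theta = (d / 640) * 58;
% The sign of d (and therefore theta) determines the turn direction; the
% loop repeats until abs(d) drops below the chosen bound of 20 pixels.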
The robot is then turned by the calculated angle θ to line up in the right direction. The entire process is 
set in a loop that repeats until the robot is as closely lined up as possible. This is made possible by 
utilizing the average displacement between the valid points: the code repeats itself, capturing a new 
comparison image, finding features, and comparing them, until the average difference between points 
is below a chosen bound. That bound is 20 pixels, which corresponds to an angle of about 1.8 degrees. 
Most tests with the upper bound set to 20 resulted in an average displacement well below that bound 
after the rotation. Testing also showed that when the chosen bound was lower, the system would 
struggle to get within it and wind up oscillating until it was no longer remotely lined up. The complete 
code used to perform the comparison 
can be found in Appendix – CreateTemplate.m at the end of the paper.
Obstacle Detection 
Obstacle detection is extremely important because without it, the system cannot be considered 
autonomous. It allows the system to differentiate, on the fly, between locations that are clear and 
travelable and locations that are obstructed and need to be avoided. Once the obstacles are detected, 
they can be written to the map, allowing the robot to move safely and freely. 
Process 
Before any scripts can be run, a depth image is first captured of the system's current field of view. The 
image is stored as a 3D matrix that holds real-world XYZ coordinate data for each pixel of the camera's 
resolution, i.e. a 640 x 480 x 3 matrix. The XYZ data represents the length, height, and depth for each 
pixel, in millimeters, hence the 3D matrix. Figure 26 below shows a graphical representation of the 
depth map, where every pixel that is not black has XYZ data. Figure 27 provides the corresponding visual 
image of the depth capture for clarification purposes. It is important to note that the smooth reflective 
tile floor caused issues with depth collection, shown by the black pixels close to the camera. 
Figure 26 - Mapped Depth Map of a Hallway 
Figure 27 - Respective Image Capture of Depth Capture
Once the depth image is captured, a ground plane extraction can be run on the image. Due to the low 
resolution of the camera, only the bottom 100 pixel rows of the image were analyzed; anything above 
those rows was too inaccurate and caused issues with extraction. 
The ground plane extraction runs on the principle that each step taken down a hallway takes you further 
from where you started. This means that each point on the floor should have a point behind it that is 
further away. To test this, the depth portion (Z) of the depth map is analyzed and each cell is examined. 
If the cell's vertical neighbor is a larger number, meaning further from the camera, the cell is not 
considered an obstacle. Any cell that is not considered an obstacle has its location saved to a new 
matrix. Plotting that matrix on top of the original image produces Figure 28 below. 
Figure 28 - Ground Plane Extraction 
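A minimal sketch of this ground-plane test, assuming depthXYZ is the 640 x 480 x 3 real-world coordinate matrix in millimeters described above: 

% Only the bottom 100 rows are examined; a pixel is treated as ground if
% the pixel directly above it (its vertical neighbor) is farther away.
Z = double(depthXYZ(:, :, 3));
[rows, cols] = size(Z);
groundMask = false(rows, cols);
for r = rows : -1 : rows - 99
    for c = 1 : cols
        if Z(r, c) > 0 && Z(r - 1, c) > Z(r, c)
            groundMask(r, c) = true;     % ground has a farther point behind it
        end
    end
end

% Inverting the mask within the analyzed band marks the obstacle pixels.
obstacleMask = ~groundMask;
obstacleMask(1 : rows - 100, :) = false;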
The lack of pixels in certain areas is again due to a combination of the low resolution camera and the 
material the floor is made of. However, it is still easy to see in the figure where the ground is clear, 
marked by the green pixels. The pixel selection is then inverted, causing the unselected pixels to become 
selected and vice versa. This is done in order to highlight and detect what is an obstacle. 
In order to increase the green pixel density, the data is smoothed using MATLAB's built-in smoothing 
function. Further manual smoothing is then performed. Any column of the image that has only one 
green pixel is removed, as it does not provide enough data. Each column that has two or more green 
pixels is filled with green from the column's minimum value to its maximum; this manually removes the 
small areas of missing pixels. Lastly, if a green column has no green columns on either side of it, it is also 
removed because it is likely inaccurate data. With the ground plane extracted, all of the opposite pixels 
are selected and made into an "obstacle" array. Figure 29 below shows the results of the obstacle 
extraction. Green means that the area is obstructed, while no marking means the path is clear. 
Figure 29 - Obstacle Extraction 
Each marked pixel's location (on the image) is recorded in the obstacle array. Because the obstacle 
array was created to overlay the image, the marked pixel locations are directly related to XYZ data. The 
real-world location for each pixel is then extracted from the depth image and written to the already 
existing map. Because it is a top-down, two-dimensional map, the length values (X) are used as the 
horizontal distances and the depth values (Z) are used as the vertical distances. All the values are 
written from the origin, which is wherever in the map the robot currently is. Writing a value to the map 
consists of assigning the specific cell, determined from the coordinates, a value of 4, indicating that 
there is an obstacle at that location. Figure 30 below shows a zoomed-out matrix image of the map 
after the obstacles have been added. 
Figure 30 - Colored Map with Obstacles 
Because of the zoom, the values of zero are not shown. Anything visible holds a nonzero value: the black 
lines are the walls, red marks the detected walls, and green indicates the trashcan. The origin of the figure, 
where the points are measured from, is the bottom center of the image. The full code can be found in 
Appendix – ObstacleDetection.m. 
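A minimal sketch of writing the detected obstacle pixels into the map, continuing the obstacleMask and depthXYZ placeholders from the sketch above; the map variable, cell size, robot cell, and sign conventions are assumptions and may differ from the actual map code: 

cellSize = 10;                            % cm per map cell (assumption)
[obsRows, obsCols] = find(obstacleMask);
for k = 1 : numel(obsRows)
    x = depthXYZ(obsRows(k), obsCols(k), 1) / 10;   % mm -> cm, lateral offset
    z = depthXYZ(obsRows(k), obsCols(k), 3) / 10;   % mm -> cm, forward offset
    r = robotRow - round(z / cellSize);   % forward distance moves "up" the map
    c = robotCol + round(x / cellSize);   % lateral distance moves across
    if r >= 1 && r <= size(map, 1) && c >= 1 && c <= size(map, 2)
        map(r, c) = 4;                    % 4 marks an obstacle cell
    end
end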
Obstacle Avoidance 
Theory 
Robotic pathing, or motion planning, is the ability of a robot to compute its desired movement 
throughout the known area by breaking down the movement into smaller steps. The path is calculated 
such that the robot could complete travel to its end destination in one large motion. Starting position 
and heading of the robot are assumed to be known from either specific placement of the robot or 
localization algorithms. A map is imported into the system showing the end destination of the intended 
motion along with all known obstacles within the map area. A specific pathing algorithm is then chosen, 
using the map as the basis of the path calculation. 
Grid-based search pathing functions much like the occupancy grid: the map is broken up into a grid 
whose points are marked as either free or occupied. The robot is allowed to move to adjacent grid 
points as long as the cell has a value of free associated with it. Constant checking by a collision 
avoidance program is needed for this type of pathing to make sure the grid points along the planned 
path are truly free. One consideration with this type of pathing is the size of the grid overlaid on the 
map; an adequate resolution has to be chosen based on how much of the area of motion is actually 
free. Coarser grids, with cells representing a larger area, offer quicker computation time but sacrifice 
precision. A cell that is deemed free might actually be occupied over 40 percent of its area, which could 
lead to collisions in narrow sections of the map. The other side of the coin is a fine grid with more 
precise cell values, where more time is needed to compute the path before operation can start. 
Figure 31 - Example of valid and invalid path from grid pathing [16]
The potential fields method uses pseudo-charges to navigate to the destination. The end destination of 
the path is given a positive, or opposite, charge that the robot is attracted to. Obstacles along the 
robot's motion toward the goal are assigned negative, or like, charges. The path is then determined as 
the trajectory vector resulting from the sum of all the charges. The potential field method has the 
advantage that the trajectory is a simple and quick calculation. The problem lies in becoming trapped in 
a local minimum of the potential field and being unable to find a path. Local minima can be planned and 
adjusted for by a number of methods: adding a small amount of noise into the system will bump the 
robot out of the minimum and back into usable calculation, and obstacles can also be given a tangential 
field to move the robot out of the minimum so the path can be recalculated [17]. 
Figure 32 - Vector field analysis and path of movement from Potential Field method 
Execution 
As seen in the Obstacle Detection section, several subroutines allow the robot to map unforeseen 
obstacles within its path onto the map. These obstacles are represented by numerical 1s. As previously 
described in the Theory section of Obstacle Avoidance, there are a multitude of techniques that can be 
utilized to dynamically calculate a path that avoids these obstacles. The method chosen for this 
particular project is the potential field method. Adopting a behavior found in nature, obstacles and goals 
are modeled with pseudo-charges; the contribution of each pseudo-charge is calculated, and the 
resulting velocity vector is used to drive the robot. 
As described in Michael A. Goodrich's tutorial on implementing the potential fields method, the 
functional behavior of goals and obstacles within the robot's operating environment is defined with a 
few parameters. For 2D navigation, (xg, yg) defines the position of the goal and (x, y) defines the position 
of the robot. The variable r defines the radius of the goal. The direct distance as well as the angle 
between the goal and the robot are calculated. By utilizing a set of if statements, various behaviors can 
be defined with respect to the operating boundaries of the robot and its interaction with the goal. 
Δx = Δy = 0                                          if d < r 
Δx = α(d − r)cos(θ), Δy = α(d − r)sin(θ)             if r ≤ d ≤ s + r 
Δx = α cos(θ), Δy = α sin(θ)                         if d > s + r 
Where α is the scaling factor of the goal-seeking behavior, d is the distance between the robot and the 
goal, r is the radius of the goal, s is the spread (influence field) of the goal, and θ is the angle between 
the goal and the robot. The first equation is used when the distance between the goal and the robot is 
less than the radius of the goal; this defines the behavior of the robot once the goal has been reached. 
The second equation defines the behavior of the robot when the distance between the goal and the 
robot is at least the radius of the goal but less than the radius plus the influence field of the goal. This 
condition is met when the robot is nearing the goal, and the equation scales the velocity vector that the 
robot experiences until the goal is reached. Finally, the last equation defines the behavior of the robot 
when it is nowhere near the goal. 
The obstacles are defined in the exact opposite manner. Both the distance and the angle between the 
robot and the obstacle are calculated and used in the following equations, which are contained within a 
set of if statements. These if statements define the behavior of the robot in various scenarios involving 
the obstacles. 
Δx = −(cos(θ))∞, Δy = −(sin(θ))∞                     if d < r 
Δx = −β(s + r − d)cos(θ), Δy = −β(s + r − d)sin(θ)   if r ≤ d ≤ s + r 
Δx = Δy = 0                                          if d > s + r 
Where β is the scaling factor of the obstacle-avoiding behavior, d is the distance between the robot and 
the obstacle, r is the radius of the obstacle, s is its spread (influence field), and θ is the angle between 
the obstacle and the robot. The first equation is used when the distance between the robot and the 
obstacle is smaller than the obstacle's radius; in that case the robot's behavior is to move away from the 
obstacle with a very large velocity vector. Again, the second equation scales the velocity vector of the 
robot as it approaches the obstacle, and finally the robot experiences no contribution to its velocity 
vector when it is nowhere near the obstacle. 
The overall velocity vector is the summation of the contributions of the attractive and repulsive forces 
exerted by the goals and obstacles within the robot's operating environment. The code implementing 
this exact method can be found in Appendix – PotFieldDrive.m. 
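A minimal sketch of summing one goal and one obstacle contribution into a velocity vector, following the equations above; the robot position (x, y), goal and obstacle positions, radii, spreads, and scaling factors are placeholders, and the "infinite" push is approximated with a large constant: 

alpha = 1.0; beta = 1.0;                  % scaling factors (assumed)
vx = 0; vy = 0;

% Attractive contribution of the goal at (xg, yg) with radius rG, spread sG.
d = hypot(xg - x, yg - y); th = atan2(yg - y, xg - x);
if d < rG
    % goal reached: no contribution
elseif d <= sG + rG
    vx = vx + alpha * (d - rG) * cos(th);
    vy = vy + alpha * (d - rG) * sin(th);
else
    vx = vx + alpha * cos(th);
    vy = vy + alpha * sin(th);
end

% Repulsive contribution of one obstacle at (xo, yo) with radius rO, spread sO.
d = hypot(xo - x, yo - y); th = atan2(yo - y, xo - x);
if d < rO
    vx = vx - cos(th) * 1e6;              % "infinite" push away from the obstacle
    vy = vy - sin(th) * 1e6;
elseif d <= sO + rO
    vx = vx - beta * (sO + rO - d) * cos(th);
    vy = vy - beta * (sO + rO - d) * sin(th);
end
% The obstacle block is repeated for every obstacle, and (vx, vy) is then
% converted into drive commands for the Create.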
Elevator Navigation 
The elevator navigation component uses a modified and extended template matching system. The logic 
behind the system is that if it can tell it is in front of the panel, it can discern between the different 
buttons and therefore monitor which are pressed and which are not. The system first scans the current 
panel and determines whether or not the desired floor button is pressed. If the button is not pressed, it 
prompts the user to press it and continues scanning until the button is detected as pressed. The exact 
detection process is detailed in the following section. With the button verified as pressed, the system 
then checks continuously until it sees that the button is no longer pressed. This signifies that the correct 
floor has been reached, and the robot proceeds to exit the elevator. 
Process 
The first step in the process is capturing a template image. This is done only once and can be applied to 
all elevators that use the same panel. The image is captured using the Image Acquisition Toolbox with 
the webcam selected. Figure 33 below shows how the call is made to the webcam. The color space is set 
to grayscale to simplify the template matching process, and the frames-per-trigger property is set to one 
so that the program captures only one image each time it is called. 
Figure 33 - Image Acquisition 
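A minimal sketch of this acquisition call; the adaptor name and device ID are machine-dependent placeholders: 

% Open the webcam and configure it for single grayscale captures.
vid = videoinput('winvideo', 1);
vid.ReturnedColorspace = 'grayscale';   % grayscale simplifies template matching
vid.FramesPerTrigger = 1;               % capture one frame per call
panelImage = getsnapshot(vid);          % grab the current view of the panel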
The template is created and cropped to feature only the actual panel itself, as shown in Figure 34. This 
template image is stored and is used for every run; it does not change.
Figure 34 - Elevator Panel Template 
With the template captured, the process is able to run. With the robot in place facing the panel, it 
begins to capture new images. Each image is matched against the template image and compared to find 
similar points, using the same method as the localization. However, in order to analyze the current 
image, a geometric transformation is performed; this automatically scales and crops the captured image 
to match the template [18]. Figure 35 shows MATLAB matching the two images. 
Figure 35 - Template Matching with the Captured Image 
This is done so that the individual buttons can be analyzed. The image is sectioned off into different 
rectangles, one for each button, by creating a matrix of coordinates consisting of the pixel locations of 
rectangles that encompass each button. Doing this allows the function to look for a certain floor and 
know how to crop the image accordingly. The cropping matrix for this elevator panel is shown in 
Figure 36 and only works with this template. 
Figure 36 - Cropping Matrix 
Setting the desired floor to one, so that the system gets off at the lobby, causes the program to crop the 
image as shown in Figure 37. 
Figure 37 - First Floor Crop 
With the image cropped correctly, the actual button needs to be analyzed to determine whether or not 
it is pressed. In order to do this, the image is converted to strictly black and white. The illuminated 
button produces a large number of white pixels, as shown in Figure 38, which are nonexistent when the 
button is not pressed. 
Figure 38 - Black and White Conversion 
With the white pixels visible, it is easy to count them and use that number to determine whether or not 
the button is pressed. This is done using the number-of-nonzeros (nnz) function built into MATLAB, as 
demonstrated in Figure 39. The nnz function analyzes a matrix, in this case the black-and-white image, 
and counts the number of elements that are non-zero, in this case white pixels [19]. If that number is 
large enough, the program concludes that the button is pressed and continues running. 
Figure 39 - NNZ Function 
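A minimal sketch of the button check, assuming buttonCrop is the cropped grayscale image of the desired button; the binarization threshold and pixel count are placeholder values that would need tuning for this panel: 

% Keep only the very bright (illuminated) pixels and count them.
bw = im2bw(buttonCrop, 0.85);
if nnz(bw) > 200                  % enough white pixels -> button is lit/pressed
    pressed = true;
else
    pressed = false;
end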
After determining that the button is pressed, the process repeats itself, using the same logic to 
determine when the button is no longer pressed, which signals that it is time to exit the elevator. The 
full code can be found in Appendix – ButtonDetection.m. 
Exit Elevator 
Part of navigating the elevator involves being able to physically leave it. The system is set to drive out 
using prewritten commands. Originally, depth measurements were going to be used; however, other 
visitors in the elevator threw off the ability to find the walls, and therefore the only way to move 
consistently was with scripted steps. The full code can be found in the Appendix. 
Subject Identification 
A QR code is the current choice for verifying a visitor's identity, and it was chosen for multiple reasons. 
Because it is essentially a barcode, a QR code is very simple to generate, especially with the help of 
various internet sites. Now that most people carry smartphones that can decode the codes with an app, 
they are becoming an increasingly popular way to share information. Figure 40 below shows an example 
of a standard code that is used. 
Figure 40 - Standard QR Code 
A code, in this case a static ID, can be input and a high resolution image of the code quickly created and 
emailed to a client. The QR code was also chosen for ease of use, mainly with respect to the client. With 
better-known methods, such as facial recognition, a database of client faces would need to be created. 
Not only would this be extremely tedious, it could make clients who do not want their photograph taken 
feel uncomfortable. Because it is a barcode, a QR code can easily be decoded into a string that can be 
stored in a simple database, which is much easier than having to sort through a gallery of photographs. 
The last reason it was chosen was the availability of free software that is compatible with MATLAB. The 
current working code for the QR scanning can be found in Appendix – QRCodeDetection.m. 
Process Explanation 
The system is designed to be as user friendly as possible. Due to the lack of a video display, audio cues 
are used to guide the visitor through the process. Once the robot reaches the lobby, it begins to scan for 
a QR code. The visitor, having been told about the process prior to the visit, displays the code on a 
phone (or a piece of paper) in front of the camera. The system then scans and either confirms who they 
are or rejects the code. If it accepts, it greets the visitor by name with an audio cue and asks them to 
follow it to the elevator. If it rejects, it prompts the visitor to call and schedule an appointment, or to 
call for assistance. 
QR Code System 
The versatility of the QR code allows unique codes to be generated easily. This project takes advantage 
of that by allowing someone to send out unique codes to all of their expected visitors. These unique 
codes not only allow for personalization for visitors, but also add a layer of security: while a single 
shared code would scan for anyone, a unique code that is scanned twice can easily raise a flag. 
The process of setting up a code is kept very simple so that the entire procedure is no more difficult 
than sending out a normal email. If setting up the system took too long, the time lost would be 
comparable to having someone greet the visitor in person, defeating the purpose of the project. 
First a code is created, whether randomly or by design, and assigned to someone's name, as shown in 
Table 2 below. The codes used here can normally be up to 160 characters long, but as shown in 
Figure 41, the longer the code, the more complex the image becomes. While some cameras can resolve 
the denser image, the RGB camera on the Xtion Pro cannot do so consistently, so shorter 10-character 
codes are used. The spreadsheet, which could be hosted on a website, in a private folder on a service 
such as Dropbox, or, for this proof of concept, directly on the laptop, holds these codes and names. 
Table 2 - Database Example 
Name              Code 
Tyler Aaron       13689482 
John Burchmore    2432158 
Ian DeOrio        59824 
Jeff Gao          209 
Figure 41 - (a) 10 character code vs (b) 100 character code 
Using the "xlsread()" function built into MATLAB, the program can easily read the spreadsheet and 
convert it into a usable matrix. Using a for loop, the program compares the value from the decoded 
image to the matrix from the spreadsheet. If a code matches one found in the matrix, the system 
considers the run a success, speaks the person's name aloud, and prompts the visitor to follow. In order 
for the system to read out a visitor's name, MATLAB must use Microsoft Windows' built-in Text To 
Speech libraries. Through a wrapper written by Siyi Deng © 2009 [20], MATLAB can directly access these 
libraries and use them to read out strings, i.e., the corresponding names from the matrix. 
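A minimal sketch of this lookup, assuming the spreadsheet is laid out as in Table 2 with names in the first column and codes in the second, decodedID is the string returned by the QR decoder, the file name is a placeholder, and tts() stands in for the Text To Speech wrapper: 

% Read the name/code pairs and look for the decoded code.
[~, ~, raw] = xlsread('visitors.xlsx');   % raw(:,1) = names, raw(:,2) = codes
codes = cell2mat(raw(:, 2));
match = find(codes == str2double(decodedID), 1);
if ~isempty(match)
    tts(['Welcome, ' raw{match, 1} '. Please follow me to the elevator.']);
else
    tts('Code not recognized. Please call for assistance.');
end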
Code Decryption 
Because QR codes are not a closed-source technology, there are many different programs and software 
packages available for encoding and decoding them. One of the most popular is ZXing ("Zebra Crossing"), 
an open-source, multi-format 1D/2D barcode image processing library implemented in Java, with ports 
to other languages [21]. ZXing offers Android, Java, and web clients, but unfortunately does not provide 
a simple way to incorporate the library into MATLAB. 
In order for MATLAB to properly use the ZXing libraries, it needs a wrapper. Fortunately, Lior Shapira 
created and published an easy-to-use wrapper for MATLAB [22]. Shapira's wrapper consists of two 
simple MATLAB functions, "encode" and "decode", which call specific Java files to either encode or 
decode a code. The "encode" function takes a string as an input and uses ZXing's library to create a 
figure that displays the resulting QR code. However, the figure produced is not the most visually 
appealing, which is why a different online service is used to generate the codes emailed to visitors. The 
"decode" function takes an image as an input, in this case captured with the RGB sensor on the camera, 
and outputs the resulting message. 
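A minimal sketch of the decoding step, assuming the wrapper's decode.m is on the MATLAB path and img is an RGB frame captured from the Xtion's color camera: 

% Decode the QR code visible in the frame via the ZXing wrapper.
decodedID = decode(img);      % returns the encoded string, e.g. '13689482'
% decodedID is then compared against the spreadsheet as shown earlier.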
Design Overview 
Codes and Standards 
Robotics is a new and emerging field, and standards are still being put in place. Two very relevant ISO 
standards are listed below. 
“ISO/FDIS 13482: Safety requirements for personal care robots 
ISO/AWI 18646-1: Performance criteria and related test methods for service robot -- Part 1: Wheeled 
mobile servant robot” 
It is important to note that these standards are still in early stages of development and are constantly 
being updated and modified. Just like the project, they are a work in progress, and new standards and 
codes will be used as they are instituted.
Model Concept 
Design 
Figure 42a shows a Creo Parametric 3D model of the planned design for the robot. Figure 42b shows the 
completed assembly. The goal of the design was to be simple yet secure. The model features a 1:1 scale 
iRobot Create1 and Xtion Pro2. The aluminum support bar3 can also be changed to a different size if need 
be. The assembled robot design changed slightly so the height of the Xtion Pro could be adjusted if 
needed. A web camera was also placed on top of the aluminum support bar for QR code scanning and 
elevator detection. 
Figure 42 – a) 3D Creo Model of Robot System b) Final Proof of Concept Design 
The current housing, which can be seen in Figure 43, is designed to fit most small laptops/netbooks, but 
can be modified if the laptop to be used changes. The housing is round not only to match the Create 
aesthetically, but also to minimize sharp corners poking out from the system. The open design was 
chosen because the reduction in materials keeps the weight to a minimum and because fabrication of 
the system is much easier without having to create a round wall. Each peg is screwed into the top and 
bottom plates. There are four smaller pegs that screw into the iRobot base and the bottom acrylic piece. 
The gap between the pegs is roughly six inches, not enough to remove a laptop. Because of this, the 
back peg can be easily removed, the laptop slid in, and the peg replaced. 
1 Courtesy of Keith Frankie at http://www.uberthin.com/hosted/create/ 
2 Courtesy of GrabCAD user Kai Franke at https://grabcad.com/library/asus-xtion-pro-live 
3 Courtesy of 80/20 Inc. at http://www.3dcontentcentral.com
Figure 43 – a) Creo Model of Housing b) Physical Model of Housing 
The laptop housing is made of two 0.25” thick clear acrylic pieces with a diameter of about 12”. The top 
and bottom pieces are essentially the same aside from the opening on the bottom piece for easy access 
to the iRobot Create’s buttons. The six pegs are made from 2” long threaded aluminum standoffs. This 
allows the pegs to be screwed on between the acrylic pieces. On the top piece of the acrylic housing the 
aluminum t-slotted extrusion extends about 36”, which is attached using two 90-degree brackets. On 
the t-slotted frame another 90-degree bracket is used to attach the Xtion 0.25” thick acrylic platform. 
This allows for varying the height of the Xtion camera. A close-up of the Xtion and its platform can be 
seen in Figure 44. The web camera on top of the aluminum extrusion is attached with a secured 90- 
degree bracket to allow for easy removal. The aluminum extrusion is also used for cord management. 
Figure 44 – a) Close up View of Web Camera b) Close up View of Xtion Pro and platform 
Between the iRobot Create and the bottom side of the laptop housing are four small nylon standoffs to 
allow for some spacing for cords and modularity, which can be seen in Figure 45. 
Figure 45 – a) Creo Model Close up of housing spacers b) Physical Model Close up of Housing Spacers 
Physics 
In order to calculate the center of gravity, a couple of assumptions need to be made. To start, the Asus 
Xtion Pro is assumed to be a uniform solid piece. In reality, it is not completely solid, and its actual 
center of gravity could differ depending on how the internals are arranged. For ease of calculation, since 
its total weight and dimensions are known, the volume, and in turn the density, of the camera can be 
calculated for the "solid uniform" piece. A similar assumption is made for the iRobot Create: since the 
exact weights and positions of all of its internals are unknown, the robot is also assumed to be a uniform 
solid piece. Models of these two parts were created in PTC Creo and placed in the assembly in order to 
perform the overall center of gravity calculation. 
The known densities and calculated mass properties of all the materials used in the model can be seen 
in Table 3. PTC Creo uses these densities to calculate the volume and mass of each component. After 
running an analysis on the mass properties of the entire assembly, Creo outputs the x, y, and z 
coordinates for the center of mass of each component, as well as an overall center of gravity. 
Table 3 - Mass Properties of Model Assembly 
Part               Density (lb/in^3)   Mass (lb)   Center of Gravity (w/ respect to Create_Robot_Axis) 
                                                   X          Y          Z 
iRobot Create      0.0353              8.8000      0.0000     1.0831     0.8225 
Xtion Pro          0.0249              0.4784      0.0000     18.6422    0.0000 
Aluminum Support   0.0979              1.3553      0.0000     11.4080    0.0002 
Xtion Platform     0.0434              0.1049      0.0000     17.5326    0.0000 
Laptop Housing     0.0434              2.4994      0.0000     4.1820     -0.0074 
Base Spacer 1      0.0434              0.0010      4.3884     2.6580     -0.1022 
Base Spacer 2      0.0434              0.0010      4.3884     2.6580     -2.8522 
Base Spacer 3      0.0434              0.0010      -4.3884    2.6580     -0.1022 
Base Spacer 4      0.0434              0.0010      -4.3884    2.6580     -2.8522 
The center of gravity coordinates listed above are with respect to the main axis of the iRobot Create 
model. That axis is located at (0, 0, 0) directly in the bottom center of the create model as seen in Figure 
46. 
Figure 46 - iRobot Create Coordinate System
According to the Creo analysis, the calculated center of gravity for the entire assembly is located at 
(X, Y, Z) = (0.0000, 3.4898, 0.5448). These numbers make sense when looking at the model. With the 
x-axis at the absolute center and all components centered on the model, the x-coordinate should be 
zero. The y-coordinate is expected to be just slightly above the Create, because of the extra weight 
above it and because the majority of the weight is in the Create itself. The z-coordinate should be 
slightly biased toward the front of the robot since more weight is located there. 
Parts List 
All of the major components and raw materials used in the design of the robot are readily available to 
consumers. The iRobot Create, Asus Xtion Pro, and Acer Ferrari laptop are the three major off-the-shelf 
components, and the Creative Live web camera is also readily available. The laptop housing and Xtion 
support materials required fabrication in the machine shop to the desired specifications. The acrylic 
pieces were all modeled in AutoCAD and laser cut to the required specifications. A complete list of all 
parts and materials can be seen in Table 4. 
Table 4 - Final Parts List 
Simulation / Experimental Validation Plan 
Description of the Test Bed 
Bearing Localization 
The Xtion Pro is to take a template photo about 80cm from the wall in the SAS lab; this is the goal 
orientation of the system. For the template photo, the robot is aligned with the wall such that the Xtion 
camera plane is parallel with the wall. This original orientation of the robot corresponds with a heading 
of zero degrees. With the template taken, MATLAB is then used to turn the robot to a random angle in 
the positive and/or negative direction. Because the field of view of the camera is 58 degrees, that 
bounds the random number. Using the function shown in the Localization-Bearing Template Method 
section, the robot then captures a new image to compare to the template image. Feature detection is 
used to determine any differences from the template and new images. The displacement of matched 
features is then related to an angle, and the robot will turn that angle back towards the zero degree 
heading. After the turn has been made, a new comparison image is taken and the comparison process 
repeated. If the measured pixel displacements between features are small enough, as explained in the 
validation section of the report, the process will end. If the resulting displacement is too large, the 
displacement will again be turned into an angle and the robot turned. The process is repeated until the 
system reaches that zero-degree heading, within a certain level of uncertainty. 
Using the flat wall assumption program described in the Localization-Bearing section of this report, 
another method of determining the pose of the robot is performed. The robot is set at a distance of 
80cm from the wall, with an orientation so that the plane of the Xtion camera is parallel to the wall. This 
validation will only work efficiently if the robot is facing a wall without any major protrusions. The robot 
is then commanded to turn a random number of degrees in the positive and/or negative direction. As 
described earlier, the program takes measurements of two different points, one on either side of the 
robot. It then uses trigonometry to calculate the angle to turn in order to make both those 
measurements equal. The program runs in a loop until the measurements are determined to be equal, 
and the validation is over. The corrected angle is output in MATLAB, which can be compared to the 
initial angle of offset.
108 Appendix – ObstacleDetection.m ......................................................................................................... 110 Appendix – PotFieldDrive.m ................................................................................................................. 114 Appendix – ButtonDetection.m ............................................................................................................ 117 Appendix – ExitElevator.m ................................................................................................................... 120 Appendix – Demo.m ............................................................................................................................. 121 Appendix – QRLocal.m .......................................................................................................................... 125 Appendix – Rlocal.m ............................................................................................................................. 127 Appendix – Init.m ................................................................................................................................. 130 Appendix – DepthTemp.m .................................................................................................................... 131 Appendix – parsepoints.m .................................................................................................................... 132 5
  • 6. 6 List of Figures Figure 1 - OpenNI SDK Architecture [3] ...................................................................................................... 13 Figure 2 - Occupancy Grid with obstacles [8] ............................................................................................. 18 Figure 3 - Occupancy grid mapping code .................................................................................................... 19 Figure 4 - Sample hallway map ................................................................................................................... 19 Figure 5 - SetLoc code ................................................................................................................................. 20 Figure 6 - SetGoal code ............................................................................................................................... 20 Figure 7 - AdjustLoc code ............................................................................................................................ 21 Figure 8 - FindDel code ............................................................................................................................... 21 Figure 9 – Microsoft Kinect Subcomponents .............................................................................................. 22 Figure 10 – ASUS Xtion Pro Specifications under OpenNI .......................................................................... 22 Figure 11 - Harris Feature Extraction .......................................................................................................... 23 Figure 12 - SURF Feature Extraction ........................................................................................................... 24 Figure 13 - Sample Code ............................................................................................................................. 25 Figure 14 - Matched Points (Including Outliers) ......................................................................................... 26 Figure 15 - Matched Points (Excluding Outliers)......................................................................................... 26 Figure 16. Localization "checkpoints" ......................................................................................................... 27 Figure 17 - Overlayed images of a template versus current ....................................................................... 28 Figure 18 - Flat wall assumption ................................................................................................................. 29 Figure 19 - Unknown pose .......................................................................................................................... 30 Figure 20 - Example of a template captured by the Xtion Pro.................................................................... 31 Figure 21 - Code application of the functions ............................................................................................. 32 Figure 22 - Example of Harris features extracted from an image ............................................................... 32 Figure 23 - Code application of the functions ............................................................................................. 33 Figure 24 - Result of identifying the Harris features and finding matching features in the images ........... 
34 Figure 25 - Visual representation of the relative pixel displacements used to find the function .............. 34 Figure 26 - Mapped Depth Map of a Hallway ............................................................................................. 36 Figure 27 - Respective Image Capture of Depth Capture ........................................................................... 36 Figure 28 - Ground Plane Extraction ........................................................................................................... 37 Figure 29 - Obstacle Extraction ................................................................................................................... 38 Figure 30 - Colored Map with Obstacles ..................................................................................................... 38 Figure 31 - Example of valid and invalid path from grid pathing [16] ........................................................ 39 Figure 32 - Vector field analysis and path of movement from Potential Field method ............................. 40 Figure 33 - Image Acquisition ..................................................................................................................... 42 Figure 34 - Elevator Panel Template ........................................................................................................... 43 Figure 35 - Template Matching with the Captured Image .......................................................................... 43
  • 7. Figure 36 - Cropping Matrix ........................................................................................................................ 44 Figure 37 - First Floor Crop ......................................................................................................................... 44 Figure 38 - Black and White Conversion ..................................................................................................... 44 Figure 39 - NNZ Function ............................................................................................................................ 44 Figure 40 - Standard QR Code ..................................................................................................................... 46 Figure 41 - (a) 10 character code vs (b) 100 character code ...................................................................... 47 Figure 42 – a) 3D Creo Model of Robot System b) Final Proof of Concept Design ................................... 50 Figure 43 – a) Creo Model of Housing b) Physical Model of Housing ...................................................... 51 Figure 44 – a) Close up View of Web Camera b) Close up View of Xtion Pro and platform ..................... 52 Figure 45 – a) Creo Model Close up of housing spacers b) Physical Model Close up of Housing Spacers .................................................................................................................................................................... 52 Figure 46 - iRobot Create Coordinate System ............................................................................................ 53 Figure 47 - Printed Degree Chart for Validation ......................................................................................... 58 Figure 48 - Robot Localization Testing ........................................................................................................ 59 Figure 49 - Object Avoidance Example ....................................................................................................... 60 Figure 50 - Outline of facial recognition system [24] .................................................................................. 70 Figure 51 - Rotating Elevator Bank Views ................................................................................................... 74 Figure 52 - Flow Chart – Elevator to Lobby and Back ................................................................................. 88 Figure 53 - Flow Chart - Elevator to Room .................................................................................................. 89 Figure 54 - Fall Term Gantt Chart ................................................................................................................ 90 Figure 55 – Winter Term Gantt Chart ......................................................................................................... 91 Figure 56 - Detail Model Drawing ............................................................................................................... 92 7
  • 8. 8 List of Tables Table 1 - Project Needs ............................................................................................................................... 10 Table 2 - Database Example ........................................................................................................................ 47 Table 3 - Mass Properties of Model Assembly ............................................................................................ 53 Table 4 - Final Parts List .............................................................................................................................. 54 Table 5 - Overall Rotational Localization Data ............................................................................................ 61 Table 6 - Overall Z Depth localization Data ................................................................................................. 61 Table 7 - Overall X Displacement Localization Data .................................................................................... 62 Table 8 - Overall Guest Recognition Data ................................................................................................... 63 Table 9 - Overall Object Avoidance Data .................................................................................................... 63 Table 10 - Parts List for Assembly ............................................................................................................... 80 Table 11 - Budget Table .............................................................................................................................. 81 Table 12 - Decision Matrix .......................................................................................................................... 82 Table 13 - Project Needs (1 of 2) ................................................................................................................. 82 Table 14 - Project Needs (2 of 2) ................................................................................................................. 83 Table 15 - Specifications and Metrics (1 of 2) ............................................................................................. 85 Table 16 - Specifications and Metrics (2 of 2) ............................................................................................. 86 Table 17 - Event Breakdown ....................................................................................................................... 86 Table 18 - Complete Z Depth Localization Trial Data .................................................................................. 93 Table 19 - Complete X Displacement Trial Data (1/2)................................................................................. 94 Table 20: Complete X Displacement Trial Data (2/2) .................................................................................. 94 Table 21 - Complete X Displacement Trial Data (2/2)................................................................................. 95 Table 22 - Complete Rotational Localization Trial Data (1/4) ..................................................................... 96 Table 23 - Complete Rotational Localization Trial Data (2/4) ..................................................................... 97 Table 24 - Complete Rotational Localization Trial Data (3/4) ..................................................................... 
98 Table 25 - Complete Rotational Localization Trial Data (4/4) ..................................................................... 99 Table 26 - Complete Obstacle Avoidance Trial Data ................................................................................... 99 Table 27 - Complete QR Code Reader Trial Data ...................................................................................... 100
9 Introduction Many movies today regarding the future feature robots living, working, and interacting with humans on a daily basis. Often the programming and design of some of these robots have reached the level of near-complete autonomy, as seen in the movie "I, Robot". Autonomous mobile robots are developed to perform tasks with little to no help from humans. This means that a robot must be able to complete the tasks given to it without becoming stuck, either physically or in a programming loop. The use of these robots has the potential to simplify and improve the quality of life for everyone on the planet. These types of robots are not just a far-off dream. Current work in robotics is progressing to the point that robots play active roles in the current workforce. Robots are being used in manufacturing, healthcare, and service industries, with plans for application in military and space programs [1]. Although the robots of today aren't what one would expect when the term "robot" is used, the tasks performed are similar. There have been a growing number of instances where robots have been used to interact with humans over the years. The Smithsonian Museum of American History installed MINERVA mobile robots that acted as tour guides for certain exhibits. MINERVA was designed to interact with groups of guests and provide information on the current exhibits [2]. California hospitals are purchasing RP-VITA telepresence robots so that doctors can check on patients from home. The doctor connects to the robot's monitor and can then have it automatically drive to rooms or take manual control to get closer views. Amazon's shipping warehouses utilize Kiva robots to transport shelves and boxes at the request of a human worker. The human stays at a command terminal giving prompts to the robots while they perform the tasks. All these robots have demonstrated the continuing development of the robotics field. A problem with some of the current market robots is the minimal human interaction with the robots while they carry out their tasks. The Kiva robots work very well in factory settings; however, human interaction is limited to a command terminal. The telepresence robots are good for long-distance communication between doctors and patients; however, many accounts say that it took some time to get used to having their doctor be just a screen in front of them. Additionally, a nurse or assistant practitioner has to follow these robots in case extra interaction with the patients is needed. The MINERVA tour guides showed that robots could adapt and show a general understanding of simple emotion in a group. This robot did work well, but the outward design was on the functional side and not aesthetically pleasing. Now that the technology allowing robots to perform simple tasks is growing, development on the human interaction side is needed to make them more compatible in normal society. Responding to voice commands instead of computer prompts could be the precursor to simple speech. The design of an autonomous robotic guide in a building can clearly exemplify the steps necessary for successful human-robot interaction.
10 Stakeholders and Needs Needs are created from stakeholder input and are used to guide the project in the correct direction. The current stakeholders include Science Center MEM Lab members, visitors, and building management; however, as the project progresses, more stakeholders may become involved. Table 1 below shows four of the overarching needs relevant to the project.

Table 1 - Project Needs
      Design Category   Need
S1    Safety            Potential for injuries during normal operation mitigated
Op1   Operation         Minimize need for human input
P1    Programming       Robust guest detection
P2    Programming       Modular software components

The needs listed above are all important because they are key driving factors that shape the design. Safety is of the utmost concern whenever humans interact with robots, which is addressed by S1. The goal of the project is to create an autonomous system, therefore minimizing the need for superfluous human input, which is represented by Op1. Guest verification (P1) is a need expressed directly by the stakeholders. P2 is a need expressed by team members to introduce modularity and increase operating efficiency. A full list of needs can be found in Table 13 and Table 14 in Appendix A.
11 Problem Statement Certain secure buildings require workers to take their minds off of their work to retrieve/sign in guests who arrive. This wastes precious time and effort that the workers could be putting towards finishing projects. Also, many secure buildings now have multiple security checkpoints that further slow the process of picking up guests. If this task could be given to a robot instead, company workers could continue working until the guests are delivered to the designated area. The core mission of this senior design project is to design and build a proof of concept for a fully autonomous robotic guide/usher. The robot will have to be able to work its way through hallways, elevators, and lobbies while avoiding stationary obstacles and variably moving objects. Elevator controls will have to be operable through some means. The robot will also have to be able to recognize the appropriate guests to be delivered to their specific destination. There could be two or three different groups of guests waiting to be picked up for different departments in the building. Once the guests have been delivered, the robot should move to a designated wait location for its next command.
12 Methodology Background The robotic guide/usher in question will be an autonomous robot comprising perception, mobility, manipulation, and control architecture subsystems. The perception subsystem is used to distinguish and extract information from within the robot's "visual" range. The mobility subsystem controls the robot's locomotion in terms of heading and velocity. The manipulation subsystem in this particular case consists of the keycard system. The control architecture subsystem serves as the central management and control layer for mobility and guest verification. These four subsystems in combination form the structure that the autonomous robot relies on to complete the tasks assigned to it.
  • 13. Perception Subsystem: Asus Xtion Pro Modern day machine vision systems employ multiple vision sensors working concurrently to capture depth, image, and audio for post-processing and general interpretation. The most basic suite of sensors utilized in order to enable vision for autonomous systems typically include a depth camera and a RGB (Red/Green/Blue) camera. Depth cameras come mainly in two different variants. Time-of-flight (ToF) depth cameras are based on utilizing the phase difference between the emitted and reflected IR signal in order to calculate the depth of a targeted object. Structured pattern depth cameras emit a structured IR pattern onto target objects and then utilize triangulation algorithms to calculate depth. Standard RGB cameras are used to provide color vision of the environment. Industrial grade depth cameras are often very expensive and cost anywhere from $3000 to $9000. RGB CMOS color cameras are an established and mature technology and therefore inexpensive. The integration of the Xtion Pro sensor to the system provides a cheap and established solution as a complete vision package. The Asus Xtion Pro is a motion sensing input device for use with the Windows PCs and is built from the same hardware as the Microsoft Kinect. From a developmental standpoint, utilizing the Asus Xtion Pro as the system’s vision suite is extremely cost efficient due to the fact that the Asus Xtion Pro contains an RGB camera, depth sensor, and a multi-array microphone all within one package. The integration of the Asus Xtion Pro is a complex and multilayered issue that involves a multitude of techniques, and modern engineering tools for successful integration. MATLAB is the desired Integrated Development Environment (IDE) because of staff and team familiarity. For integration of the Xtion Pro vision system, several software components need to be put in place to facilitate accurate data capture. Figure 1 - OpenNI SDK Architecture [3] 13
In Figure 1 above, the Xtion Pro sensor is represented by the "Depth Physical Acquisition" block. This step of the Software Development Kit (SDK) architecture constitutes the physical sensor's ability to capture raw data. Next, the PrimeSense SoC operates the underlying hardware by performing functions such as dedicated depth-acquisition calculations, matching of depth and RGB images, downsampling, and various other operations. Then, the OpenNI framework takes over as the most popular open-source SDK for the development of 3D sensing middleware libraries and applications. From there, a C++ wrapper was found which would allow the use of MATLAB as the primary IDE. The proper method for creating a data processing system for the Xtion Pro involves numerous steps. A Kinect MATLAB C++ wrapper was developed by Dirk-Jan Kroon of Focal Machine Vision en Optical Systems on January 31st, 2011 [4]. This particular MATLAB C++ wrapper is utilized alongside OpenNI 2.2.0, NITE 2.2.0, Microsoft Kinect SDK v1.7, and the Microsoft Visual C++ compiler to create the functional data processing system. 14
  • 15. Mobility Subsystem: iRobot Create An important part of a mobile autonomous robot is the components that allow for locomotion. This motion system changes the system’s velocity and trajectory via a closed-loop control. There are two specific types of motion systems that need to be considered. Limb based locomotion and wheel based locomotion are the two most common types of motion systems. The iRobot Corporation, the makers of the iRobot Roomba, created the iRobot Create Programmable Robot as a mobile robot platform for educators, students, and developers as a part of their contribution and commitment to the Science, Technology, Engineering, and Math (STEM) education program. Utilizing the iRobot Create as the motion system is the most straightforward cost efficient solution. The iRobot Create features three wheels, a designated cargo bay, 6-32 mounting cavities, an omnidirectional IR receiver, sensors, a rechargeable 3000mAh battery, and a serial port for communication. By using this platform, the overall focus of the project can be turned toward developing new functionalities between the iRobot Create and the Asus Xtion Pro without having to worry about mechanical robustness or low-level 15 control. The iRobot Create uses an Open Interface (OI) comprised of an electronic interface as well as a software interface for programmatically controlling the Create’s behavior and data collection capabilities. The Create communicates at 57600 baud via a numeric command system. For example, the command code that commands the Create to drive in reverse at a velocity of -200mm/s while turning at a radius of 500mm is [137] [255] [56] [1] [244] [5]. It becomes clear that this command code structure is unintuitive and unwieldy. In order to resolve and simplify this issue, MATLAB can be used to better facilitate communication between a laptop and the iRobot Create via a RS-232 to USB converter. MATLAB is the desired Integrated Development Environment (IDE) due to staff and team familiarity. Therefore, MATLAB will be used to issue commands to alter the robot’s heading and velocity. The MATLAB Toolbox for the iRobot Create (MTIC) replaces the native low-level numerical drive commands embedded within the iRobot Create with high level MATLAB command functions that act as a “wrapper” between MATLAB and the Create [6].
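To make the Open Interface example above concrete, the following is a minimal sketch (not the project's exact initialization code) of sending that raw five-byte drive command from MATLAB over the RS-232-to-USB link. The COM port name is an assumption, and the final commented line shows roughly how an MTIC-style high-level call hides the same byte-level detail.

% Minimal sketch: sending the raw Open Interface drive command quoted above.
% The COM port name is an assumption and will differ per machine.
port = serial('COM3', 'BaudRate', 57600);    % the Create communicates at 57600 baud
fopen(port);
fwrite(port, uint8([128 132]));              % OI: Start, then Full control mode
% [137] [vel high] [vel low] [radius high] [radius low]
% -200 mm/s -> 0xFF38 = [255 56]; 500 mm radius -> 0x01F4 = [1 244]
fwrite(port, uint8([137 255 56 1 244]));     % reverse at 200 mm/s, 500 mm radius
pause(1);
fwrite(port, uint8([137 0 0 0 0]));          % velocity 0: stop
fclose(port); delete(port);
% With the MTIC wrapper the same maneuver reduces to roughly
% SetFwdVelRadiusRoomba(serPort, -0.2, 0.5), per the toolbox documentation.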
Manipulation Subsystem: Keycard The Science Center building is a secure facility. Therefore, in order to operate the elevators, a keycard must be swiped over a reader to grant access. A human being will always be accompanying the robot when it enters the elevator, eliminating the need for a complex swiping mechanism. Instead, the robot will utilize a heavy-duty retractable key reel system. When entering the elevator, the robot will prompt the human, via an audio cue, to swipe the card and press the corresponding elevator button. The human accompanying the robot will be able to pull the card to the proper location to swipe, then release the card, which will retract back into its starting location. In order to prevent possible theft, the reel will use heavy-duty Kevlar wire fixed to the robot. This design will be the fastest method to swipe and will prevent delays for any other people in the elevator. 16
Control Architecture Subsystem 17 Mapping Background Mapping is a fundamental issue for a mobile robot moving around in space. A map is relied on for localization, path planning, and obstacle avoidance. A robot needs a map when performing localization in order to give itself a location on that map. Otherwise, the robot could be at any location in an infinite space. According to DeSouza and Kak's article on vision-based robot navigation systems, the three broad groups into which indoor vision-based robot navigation systems can be categorized are map-based navigation, map-building-based navigation, and map-less navigation. Map-based navigation systems rely on predefined geometric or topological models of the environment that the navigation system can draw on. Map-building navigation systems utilize the onboard sensors to actively generate a model of the environment for use with active navigation. Map-less navigation systems are typically more sophisticated systems based on recognizing objects found in the environment in order to orient themselves and continue along their path. Several challenges arise when mapping. To start, the space of all possible maps is infinitely large. Even when using a simple grid there can be a multitude of different variables. This can lead to very large and complex maps. Noise in the robot's sensors also poses a problem. As a robot navigates around, errors accumulate, which can make it difficult to obtain an accurate position. Also, according to the Probabilistic Robotics textbook [7], when different places look very similar, for instance in an office hallway, it can be very difficult to differentiate between places that have already been traveled at different points in time, otherwise known as perceptual ambiguity. Challenges aside, depending on the type of navigation system chosen, the complexity and sophistication of the sensors needed varies. 3-dimensional mapping can be more complicated and is not always needed depending on the application. Instead, a 2-dimensional map or occupancy grid can be created. "The basic idea of the occupancy grids is to represent the map as a field of random variables, arranged in an evenly spaced grid [7]". The random variables are binary and show whether each specific location is occupied or empty. The main equation that makes up occupancy grid mapping calculates the posterior over maps given the available data, as seen in Equation 1.

p(m | z_1:t, x_1:t)     Equation 1
Here, m is the overall map, z_1:t is the set of all measurements taken up to time t, and x_1:t is the set of all of the robot's poses over time t. The set of all poses makes up the actual path the robot takes. The occupancy grid map, m, is broken up into a finite number of grid cells, m_i, as seen in Equation 2.

m = Σ_i m_i     Equation 2

Each grid cell has a probability of occupation assigned to it: p(m_i = 1) or p(m_i = 0). A value of 1 corresponds to occupied, and 0 corresponds to free space. When the robot knows its pose and location on the occupancy grid, it is able to navigate because it knows where obstacles are, since they are marked as a 1. Figure 2 shows a basic occupancy grid with a robot located at the red point; the white points are empty space, and the black spaces are walls or obstacles. Figure 2 - Occupancy Grid with obstacles [8] The obstacles on the occupancy grid are considered to be static only. These obstacles are considered to be stationary for the particular iteration in which they are used. Dynamic obstacles are considered to be within this classification; however, the robustness of the algorithm that calculates the optimal path around these obstacles needs further investigation. Process In order to implement the 2D occupancy grid in MATLAB, a simple 2D array is needed. As an example, a simple straight hallway can be mapped. The empty space is made up of a uint8 array of 0's, while the walls are made up of an array of 1's. The map, m, is comprised of both the space and the walls. Example code for a simple map can be seen in Figure 3. 18
  • 19. Figure 3 - Occupancy grid mapping code In this basic code, an empty hallway is mapped using the basis of a 2D occupancy grid as described previously. This hallway is about 5.5 feet (170cm) by about 10 feet (310cm). Here the walls are 15cm thick, which was just chosen to give the walls some thickness. The unit8 variable type is used to minimize memory usage while running the code. This is important since a much larger occupancy grid will have many more data points. A basic example of what the occupancy grid of zeros and ones would look like for a straight hallway can be seen in Figure 4. 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 Figure 4 - Sample hallway map Now that the basic example of the occupancy grid is defined, further steps can be taken to convert a fully 3D space into a functional 2D occupancy grid. This occupancy grid can be used to keep track of goals, positions, and obstacles. In order to clean up the syntax for map operations, several functions must be defined. The first function is the setLoc.m function. The code for this is shown on the next page. 19
Figure 5 - SetLoc code The SetLoc function accepts three inputs and outputs the modified m variable. The first input is the initial map variable, the second input is the value of the y location in centimeters, and the last input is the value of the x location in centimeters. Next, the function locates the last known x and y position of the robot and resets this location to 0. Finally, it uses the two input location variables to set the new position variable before updating the m output variable. This function allows for efficient and clean redefinition of the robot's position on the map. The next function is the setGoal.m function. The code for this particular function is shown below. Figure 6 - SetGoal code The SetGoal function accepts three inputs and outputs the modified m variable. The first input is the initial map variable, the second input is the value of the y goal location in centimeters, and the last input is the value of the x goal location in centimeters. Similar to the SetLoc function, the function's next step is to locate the previous goal location and to reset this location's occupancy grid value to 0. Finally, it redefines the new goal position's occupancy grid value to 3. This designates the goal. 20
  • 21. The next function is the AdjustLoc function. The code for this particular function is shown below. Figure 7 - AdjustLoc code The AdjustLoc function again accepts three inputs and outputs the modified m variable. The first input is the initial map variable to be adjusted, the second input is the centimeter value of adjustment of the y position, and the third input is the centimeter value of adjustment of the x position. Similar to the SetLoc function, the function’s next step is to locate the previous goal location and then to reset the previous location’s occupancy grid value to 0. Finally, it calculates the new position by summing the old position with the adjustments specified in the inputs. This allows for adjustment of the position of the robot due to localization algorithms that will be defined later on. The last function for performing operations on the map is the FindDel.m function. The code for this particular function is shown below. Figure 8 - FindDel code The FindDel function simply calculates the distance in Y and X between the goal and the current position of the robot. It then converts this distance from centimeters to millimeters so that the input to the Roomba is simplified. 21
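Since the map-handling functions above appear only as figures, the following is a minimal sketch (not the project's exact code) of how the hallway grid and three of the helpers might look in MATLAB. The cell code of 2 for the robot's position is an assumption; the report only states that free space is 0, walls are 1, and the goal is 3.

% Sketch of the occupancy-grid helpers described above. Assumed cell codes:
% 0 = free, 1 = wall, 3 = goal, and 2 = robot position (the last value is an
% assumption; the report does not state which code marks the robot).
function demoMap
    m = zeros(310, 170, 'uint8');     % ~10 ft x 5.5 ft hallway, 1 cell = 1 cm
    m(:, 1:15)    = 1;                % left wall, 15 cm thick
    m(:, 156:170) = 1;                % right wall
    m = setLoc(m, 20, 85);            % robot at y = 20 cm, x = 85 cm
    m = setGoal(m, 290, 85);          % goal near the far end of the hallway
    [dy, dx] = findDel(m)             % remaining travel, in millimeters
end

function m = setLoc(m, y, x)
    m(m == 2) = 0;                    % clear the last known position
    m(y, x)   = 2;                    % mark the new position
end

function m = setGoal(m, y, x)
    m(m == 3) = 0;                    % clear the previous goal
    m(y, x)   = 3;                    % mark the new goal
end

function [dy, dx] = findDel(m)
    [gy, gx] = find(m == 3);          % goal cell
    [ry, rx] = find(m == 2);          % robot cell
    dy = 10 * (gy - ry);              % cm -> mm for the drive commands
    dx = 10 * (gx - rx);
end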
22 Localization Theory Background "Mobile robot localization is the problem of determining the pose of a robot relative to a given map of the environment [9]". Otherwise known as position tracking or position estimation, determining the position and pose of a robot is a fundamental problem that needs to be solved before any level of autonomy can be claimed. When given a map of the surrounding environment, the robot must be able to traverse the environment to the desired goal. However, without accurate and timely position data, the robot can quickly accumulate error, leading to undesirable actions. There are many steps before successful self-localization can be accomplished. Because the Asus Xtion Pro has been selected as the platform for collecting data from the surrounding environment, its parameters and specifications set the immediate constraints within which the localization methods must work. Figure 9 – Microsoft Kinect Subcomponents Figure 10 – ASUS Xtion Pro Specifications under OpenNI
Utilizing the IR Depth Sensor, the IR Emitter, and the Color Sensor along with their corresponding components, the data captured by the various sensors can be put to use. It is important to mention that both position and pose need to be approximated using the information derived from the sensors. As such, the next two sections will detail the various techniques used to calculate this information with respect to the global coordinate system. However, before position and pose can be solved for, the basic algorithms that facilitate this capability need to be detailed first. It is important to note that the ASUS Xtion Pro is the developer's version of the Microsoft Kinect; it is a compact version with the exact same hardware. Edge/Corner Detection One of the feature extraction methods utilized is the corner detection algorithm known as the Harris & Stephens method. This particular algorithm is a further improvement on Moravec's corner detection algorithm introduced in 1977. Moravec's algorithm operates by detecting the intensity contrast from pixel to pixel within a black and white image. By taking the sum of squared differences (SSD) of the intensity from pixel to pixel, patches with a large SSD denote regions of interest, whereas patches with a small SSD denote regions of minimal interest. Harris & Stephens take this concept a step further by taking the differential of the intensity score with respect to direction into account. The general form of this weighted SSD approach can be found in Equation 3, where an image patch I over the region (u, v) is shifted by (x, y) and the resulting weighted SSD is denoted S(x, y).

S(x, y) = Σ_u Σ_v w(u, v) [I(u + x, v + y) − I(u, v)]²     Equation 3

The Harris & Stephens method's implementation is built into MATLAB's Computer Vision Toolbox and the results are shown in Figure 11. Figure 11 - Harris Feature Extraction
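To make Equation 3 concrete, the short sketch below evaluates the weighted SSD for a single candidate shift of a single patch. The file name and patch coordinates are illustrative, and a uniform weight window is assumed where Harris & Stephens would use a Gaussian.

% Sketch: weighted SSD S(x, y) of Equation 3 for one shift of one patch.
% File name and patch coordinates are illustrative; a uniform weight window
% is assumed (Harris & Stephens use a Gaussian window in practice).
I = im2double(rgb2gray(imread('hallway.png')));
u = 200:215;  v = 300:315;                 % 16 x 16 patch of interest
x = 2;  y = 1;                             % candidate shift, in pixels
w = ones(numel(u), numel(v)) / (numel(u) * numel(v));
S = sum(sum( w .* (I(u + x, v + y) - I(u, v)).^2 ));   % small S -> flat region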
Blob Detection Blob detection methods are algorithms that are used in computer vision systems to detect and subsequently describe local features in images. These detection algorithms are very complex ways of detecting points of interest. These points of interest are then filtered for stability and reliability based on their various properties. A robust local feature detector named SURF was first presented by Herbert Bay in 2006. SURF (Speeded-Up Robust Features) is a feature extraction algorithm that was inspired by SIFT (Scale-Invariant Feature Transform). Although SURF is the algorithm that was ultimately utilized, it is important to describe the concept behind SIFT, which SURF builds upon to extract important features from an image. SIFT is an algorithm published by David Lowe. Lowe's method is based on the principle that distinct features can be found in areas of an image located in high-contrast regions, known as edges. However, beyond edge detection the SIFT method also accounts for the relative positions of these detected points of interest. The SURF method's implementation is built into MATLAB's Computer Vision Toolbox and the results are shown in Figure 12. [10] [11] [12] [13] [9] Figure 12 - SURF Feature Extraction 24
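Both detectors amount to a single call in the Computer Vision Toolbox; the sketch below (image file name illustrative) reproduces the kind of overlay shown in Figures 11 and 12.

% Sketch: Harris corners and SURF blobs with the Computer Vision Toolbox,
% as in Figures 11 and 12. The image file name is illustrative.
I = rgb2gray(imread('hallway.png'));
harrisPts = detectHarrisFeatures(I);       % corner points (Harris-Stephens)
surfPts   = detectSURFFeatures(I);         % blob points (SURF)
figure; imshow(I); hold on;
plot(harrisPts.selectStrongest(50));       % overlay the 50 strongest corners
figure; imshow(I); hold on;
plot(surfPts.selectStrongest(50));         % overlay the 50 strongest blobs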
Translational (XYZ) Background Self-localization is the act of determining the position and pose of a mobile body relative to a reference map. Self-localization is a very important component in almost every function of an autonomous robot. As previously mentioned, both the position and the pose of a mobile body are important pieces of information to track. For organizational clarity, this section will detail the methods by which the problem of self-localization with respect to placement is solved. By utilizing the Asus Xtion Pro, measurements of the local environment can be made and used toward solving this problem. However, pure measurements without any distinctive properties are not of much use when a translation or rotation relative to the map is performed. As such, effort must be taken to extract distinctive features of the surrounding environment. A multitude of computer vision systems rely on corner detection or feature extraction algorithms to detect various points of interest that can be associated with the environment rather than just an individual image. Object Detection Either of the aforementioned feature extraction methods can be used to quickly ascertain the similarity of two images based on their features. Once the strongest feature points from each image are gathered, they are compared with each other using the following code. Figure 13 - Sample Code The strongest feature points from each image are matched for similarities. The feature points with matching, distinct features can then be plotted. An example of this can be seen in Figure 14. 25
Figure 14 - Matched Points (Including Outliers) The next step is to eliminate any erroneous data points caused by misalignment or skewing of the object by estimating a geometric transformation relating the matched points. The final result is shown below. Figure 15 - Matched Points (Excluding Outliers) Once the erroneous data points have been eliminated, the pixel location of the object in the scene can be approximated and, finally, data can be taken. By using OpenNI, the exact pixel location can be used to calculate the x, y, and z coordinates of the object. The OpenNI framework utilizes two different coordinate systems to specify depth. The first coordinate system is called Depth. This is the native coordinate system: the X and Y values represent the pixel location relative to the frame of the camera, and the Z value represents the depth between the camera plane and the object. Using a few functions, OpenNI allows the Depth coordinate system to be quickly translated into the Real World coordinate system. Here, the x, y, and z values represent the more familiar 3D Cartesian coordinate system, where the camera lens is the origin and the x, y, and z values represent the distance of the object from the origin along each dimension. With the knowledge that these XYZ real-world coordinates are measured from the camera's location, the relative location of the camera with respect to the global map can be calculated. If the object's placement is previously given in the global map, the raw differences in X and Z can be used to calculate the actual location of the camera compared with the approximate location. [10] [11] [13] 26
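A minimal sketch of the match-and-filter step described above (Figures 13-15); the file names are illustrative, and estimateGeometricTransform is used here as one way to discard geometrically inconsistent pairs.

% Sketch of the match-and-filter step (Figures 13-15). File names are
% illustrative; estimateGeometricTransform rejects the geometrically
% inconsistent pairs (the outliers).
Itpl = rgb2gray(imread('template.png'));
Iscn = rgb2gray(imread('scene.png'));
[fT, vT] = extractFeatures(Itpl, detectSURFFeatures(Itpl));
[fS, vS] = extractFeatures(Iscn, detectSURFFeatures(Iscn));
pairs = matchFeatures(fT, fS);
mT = vT(pairs(:, 1));  mS = vS(pairs(:, 2));
figure; showMatchedFeatures(Itpl, Iscn, mT, mS, 'montage');    % with outliers
[~, inT, inS] = estimateGeometricTransform(mT, mS, 'affine');
figure; showMatchedFeatures(Itpl, Iscn, inT, inS, 'montage');  % outliers removed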
Implementation This method is not infallible by any means; object matching using a template requires a very high resolution camera. While the Xtion Pro has a decent RGB camera, the effective range of the RGB camera is about one meter. Therefore, when selecting landmarks for localization, care must be taken to select large landmarks with distinctive features that are unique to that specific landmark. For example, the use of a door would prove inadequate, because a door would typically not have distinctive features when compared with another door in the environment. Because of the aforementioned difficulties, QR Codes were used alongside the concept of localization "checkpoints" to perform positional localization with the desired degree of accuracy. Figure 16. Localization "checkpoints" These QR Codes serve two different purposes. By utilizing unique QR Codes at every checkpoint, the difficulty of distinguishing different checkpoints from one another is eliminated. The next goal of using QR Codes is to provide a distinct object from which measurements can be made. The localization subroutine used to obtain a fix on the relative position of the robot with respect to the map is listed in the Appendix. This localization subroutine enables the robot to obtain accurate x and y readings with respect to the map. Because the QR codes are strategically placed ninety degrees to the ideal path toward the goal, the x position of the robot can be calculated and corrected simply by measuring the depth of the wall that the QR code is posted on. If an x position of 80cm is desired, the localization subroutine will simply iterate this logic until a depth of 80cm is recorded. The 27
next objective of the localization subroutine is to obtain the y position of the robot with respect to the map. Before navigation can be conducted, the system must have a template image associated with that specific QR code and localization checkpoint to use as a reference. Figure 17 - Overlayed images of a template versus current The image above is an example of utilizing the template image to compare the current position of the robot with the ideal position of the robot. The red circles represent feature points detected by the SURF method. These are representative of the feature points that will be tracked. Next, the green plus marks are those same feature points but at a different location. The translation between them can be calculated directly. This allows the y position to be calculated by comparing the template image with the current image and leveraging the QR code as a point of reference. Templates are created using the code listed in the Appendix. 28
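As a rough illustration of the x-correction behavior described above, the sketch below drives the robot toward or away from the QR-code wall until the measured depth matches the target; getWallDepth() and moveForward() are hypothetical helpers standing in for the Xtion depth read and the MTIC drive call, and the tolerance value is an assumption.

% Sketch of the x-position correction at a checkpoint. getWallDepth() and
% moveForward() are hypothetical stand-ins for the Xtion depth read and the
% MTIC drive command; the 20 mm tolerance is an assumption.
targetDepth = 800;                        % desired offset from the wall, mm
tol = 20;                                 % acceptable error, mm
depth = getWallDepth();                   % depth to the QR-code wall, mm
while abs(depth - targetDepth) > tol
    moveForward(depth - targetDepth);     % positive error -> move toward wall
    depth = getWallDepth();               % re-measure and iterate
end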
Bearing Overview The localization subroutine used to determine the pose of the robot has undergone a multitude of changes over the course of the project. Two main methods emerged from the development phase and became the primary methods of determining pose. The first method utilized a template matching concept in order to match the exact viewing angle from which the template was created. The second method involved using simple trigonometry along with the fact that each localization checkpoint directly faces a large flat wall. Flat Wall Assumption The simplest and most consistent method of determining pose involves assuming that every localization checkpoint can be placed within a section of the wall that has a wide and flat surface. Figure 18 - Flat wall assumption The green block represents the wall, the white space represents the open space, the blue grid represents the map grid, the red circle represents the robot, the arrow indicates the current heading/pose, and finally the dotted line represents the plane from which the Xtion Pro takes measurements. Figure 18 assumes that the heading is exactly facing the wall, and that the dotted line, which represents the measurement plane, and the wall are exactly parallel. However, this assumption cannot be made without some initial calculations and measurements. Let the pose scenario for the robot be as presented in the figure below. The measurement plane is skewed by an unknown angle, and the pose of the robot is now unknown. As such, the correction angle is also unknown. The correction angle can actually be solved for easily with some simple trigonometry. 29
Figure 19 - Unknown pose Two depth measurements are taken from the measurement plane directly to the green wall. If the robot's measurement plane is truly parallel to the wall, both measurements will be equal. Otherwise, the difference between the two measurements can be utilized to calculate the correction angle needed to adjust the pose to the desired state.

θ = tan⁻¹(ΔY / ΔX)     Equation 4. Correction Angle

For instance, if the two depth samples, taken ΔX = 50 cm apart, differ by ΔY = 5 cm, the correction angle is θ = tan⁻¹(5/50) ≈ 5.7°. The code listed in the Appendix uses this exact methodology to calculate the correction angle when needed. Template Method In order to properly localize itself, the system must have a template image to use as a reference. This reference image should be a photo taken from a position where the robot would be ideally lined up. Whether it is aimed at a specific landmark or just down a hallway, it needs to be an image of its destination orientation. To optimize the likelihood of success of the system, an image with some activity should be used, activity meaning items hung up on walls or doors; something that is not just a plain hallway. Figure 20 below shows a strong example image of a template. 30
Figure 20 - Example of a template captured by the Xtion Pro While technically an image from any camera could be used, given that it shoots at the same resolution, ideally the RGB camera on the Xtion should be used, because the comparison images will be shot with it. To facilitate the creation of a template image, the script shown in Appendix – QRCodeDetection.m is used to initialize the camera, snap a picture, and then store the output image in the folder. This function allows templates to be created easily, which is beneficial for testing as well as for when the environment in which the robot is functioning changes. Template Method - Processing For the system to orient itself, it compares the current image the camera is seeing with the pre-captured template image. The script analyzes the two images, finds a correlation between the two, and then rotates the system accordingly. If the script finds no relation between the two pictures, it rotates 58° (the field of view of the camera) to give itself an entirely new image to try. Once it has a relation between the two, the script repeats itself until the system is lined up with the template. Thanks to MATLAB's Computer Vision Toolbox, the script is fairly straightforward. The script relies heavily on the toolbox's built-in corner point finding functions. While the toolbox comes with a variety of different feature detection functions, including "BRISKFeatures", "FASTFeatures", and "SURFFeatures", the chosen method for this project was "HARRISFeatures", which utilizes the Harris-Stephens algorithm. This method was chosen because of its consistent positive performance in testing. The "detectHARRISFeatures()" function takes a 2D grayscale image as an input and outputs a cornerPoints object, 'I', which contains information about the feature points detected. Not all of the points found by the Harris-Stephens algorithm are usable, however, and further processing needs to be done to find viable points. The "extractFeatures()" function is then used and takes an input of the image as well as 31
the detected cornerPoints to find valid identification points. The function takes a point (from 'I') and its location on the image and examines the pixels around the selected point. The function then outputs both a set of 'features' and 'validPoints'. The 'features' object consists of descriptors, or information that sets the points apart from other points. 'validPoints' is an object that houses the locations of those relevant points; it also removes points that are too close to the edge of the image, for which proper descriptors cannot be found [14]. Figure 21 below shows how the code is executed, while Figure 22 shows the results of using these functions to identify Harris features. Figure 21 - Code application of the functions Figure 22 - Example of Harris features extracted from an image 32
This process is run for both the template image and the comparison image, which brings about the importance of using the same camera for both images. Different cameras, even at the same resolution, can capture an image with different coloring, making the feature matching process more difficult. With the two images processed, they must then be compared. Using the built-in function "matchFeatures()", the two sets of features can be compared to determine whether or not they represent the same feature seen from a different angle. The output consists of an index of points that contain the features most likely to correspond between the two input feature sets [15]. Plugging the locations of those specific points back into validPoints yields an object consisting solely of the coordinates of matched pairs. Figure 23 shows the application of the functions in the code, while Figure 24 shows the result of comparing the Harris features of the two images. From the image, it is easy to see the translations of the different points from template to comparison, as represented by the lines connecting the two datasets. This translation is crucial to the script because it is the key to determining the amount the robot must rotate to properly line up. Figure 23 - Code application of the functions 33
Figure 24 - Result of identifying the Harris features and finding matching features in the images The pixel difference between a specific pair of points is related to the angle the robot must rotate to line up. Because there are usually some discrepancies when comparing all of the different pixel displacements, an average, 'd', is taken of the lot to get a better overall displacement. A multiplier is created using the average displacement over the horizontal resolution as a ratio. That ratio is multiplied by the sensor's field of view to get the necessary turn angle, represented by θ in Equation 5. Figure 25 - Visual representation of the relative pixel displacements used to find the function 34
θ = (d / x_res) × FOV = (d / 640) × 58°     Equation 5

The robot is then turned the calculated value of θ degrees to try to line up in the right direction. The entire process is set in a loop that repeats until the robot is as close to lined up as it can get. This is made possible by using the average displacement between the valid matched points. The code repeats itself, capturing a new comparison image, finding features, and comparing them, until the average difference between points falls below a chosen threshold. That threshold is 20 pixels, which corresponds to an angle of about 1.8 degrees. Most tests with the upper bound set to 20 ended with an average displacement well below that value after the final rotation. Testing also showed that when the threshold was set lower, the system would keep trying to get within the bounds and wind up oscillating until it was no longer remotely lined up. The complete code used to perform the comparison can be found in Appendix – CreateTemplate.m at the end of the paper.
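Putting the pieces together, a minimal version of the alignment loop could look like the sketch below. The 58-degree field of view, 640-pixel horizontal resolution, and 20-pixel threshold come from the text; captureRGB and turnRobot are placeholders for the project's actual camera and iRobot Create interfaces, and the template descriptors (features, validPoints) are assumed to have been computed beforehand.

FOV   = 58;      % horizontal field of view of the Xtion RGB camera, degrees
XRES  = 640;     % horizontal resolution, pixels
LIMIT = 20;      % stop once the average displacement is below 20 pixels (~1.8 deg)

avgDisp = Inf;
while abs(avgDisp) > LIMIT
    % Capture and process a new comparison image (placeholder capture function).
    comparison = rgb2gray(captureRGB());
    [featuresC, validPointsC] = extractFeatures(comparison, detectHarrisFeatures(comparison));

    % Match against the stored template descriptors.
    indexPairs = matchFeatures(features, featuresC);
    if isempty(indexPairs)
        turnRobot(FOV);          % no overlap with the template: rotate a full field of view
        continue;
    end

    % Average horizontal pixel displacement between matched points (Equation 5).
    dx = validPointsC.Location(indexPairs(:,2), 1) - validPoints.Location(indexPairs(:,1), 1);
    avgDisp = mean(dx);
    theta = (avgDisp / XRES) * FOV;

    turnRobot(theta);            % placeholder drive command for the iRobot Create
end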
Obstacle Detection

Obstacle detection is extremely important because without it, the system cannot be considered autonomous. It allows the system to differentiate, on the fly, between locations that are clear and travelable and those that are obstructed and must be avoided. Once the obstacles are detected, they can be transformed onto the map, allowing the robot to move safely and freely.

Process

Before any scripts are run, a depth image of the system's current field of view is captured. The image is stored as a 3D matrix that holds real-world XYZ coordinate data for each pixel of the camera's resolution, i.e., a 640 x 480 x 3 matrix. The XYZ data represents the length, height, and depth for each pixel, in millimeters, hence the 3D matrix. Figure 26 below shows a graphical representation of the depth map, where every pixel that is not black has XYZ data. Figure 27 provides the corresponding visual image of the depth capture for clarification. It is important to note that the smooth, reflective tile floor caused issues with depth collection, shown by the black pixels close to the camera.

Figure 26 - Mapped Depth Map of a Hallway

Figure 27 - Respective Image Capture of Depth Capture
Once the depth image is captured, a ground plane extraction can be run on it. Due to the low resolution of the camera, only the bottom 100 rows of pixels were analyzed; anything above them was too inaccurate and caused issues with the extraction. The ground plane extraction runs on the principle that each step taken down a hallway takes you further from where you started, which means each point on the floor should have a point behind it that is further away. To apply this, the depth portion (Z) of the depth map is analyzed and each cell is examined: if the cell's vertical neighbor holds a larger value, meaning it is further from the camera, the cell is not considered an obstacle. Any cell that is not considered an obstacle has its location saved to a new matrix. Plotting that matrix on top of the original image produces Figure 28 below.

Figure 28 - Ground Plane Extraction

The lack of pixels in certain areas is again due to the combination of a low-resolution camera and the material the floor is made of. It is still easy to see in the figure where the ground is clear, marked by the green pixels. The pixel selection is then inverted, so that the unselected pixels become selected and vice versa, in order to highlight and detect what is an obstacle. To increase the green pixel density, the data is first smoothed using MATLAB's built-in smoothing function, and further manual smoothing is then performed. Any column of the image that has only one green pixel is removed, as it does not provide enough data. Each column that has two or more green pixels is filled with green from the column's minimum value to its maximum, which removes the small areas of missing pixels. Lastly, if a green column has no green columns on either side of it, it is also removed as likely inaccurate data. With the ground plane extracted, all of the opposite pixels are selected and made into an "obstacle" array. Figure 29 below shows the results of the obstacle extraction; green means the area is obstructed, while no marking means the path is clear.

Figure 29 - Obstacle Extraction
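A simplified version of the vertical-neighbor test might look like the sketch below; depthXYZ stands in for the captured 640 x 480 x 3 matrix, and the 100-row band and the direction of the comparison follow the description above.

% depthXYZ is the 640 x 480 x 3 matrix of real-world coordinates (mm);
% channel 3 holds the depth (Z) values.
Z = depthXYZ(:, :, 3);
[rows, cols] = size(Z);

groundMask = false(rows, cols);

% Only the bottom 100 rows are reliable enough to analyze.
for r = rows-99 : rows
    for c = 1 : cols
        % A floor pixel should have its vertical neighbor (one row up,
        % i.e. further down the hallway) at a larger depth.
        if Z(r, c) > 0 && Z(r-1, c) > Z(r, c)
            groundMask(r, c) = true;
        end
    end
end

% Everything in the analyzed band that is not ground is treated as an obstacle.
obstacleMask = ~groundMask;
obstacleMask(1:rows-100, :) = false;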
Each marked pixel's location on the image is recorded in the obstacle array. Because the obstacle array was created to overlay the image, the marked pixel locations correspond directly to XYZ data. The real-world location for each pixel is then extracted from the depth image and written to the already existing map. Because the map is a top-down, two-dimensional map, the length values (X) are used as the horizontal distances and the depth values (Z) are used as the vertical distances. All values are written from the origin, which is wherever the robot currently is on the map. Writing a value to the map consists of assigning the specific cell, determined from the coordinates, a value of 4, indicating that there is an obstacle at that location. Figure 30 below shows a zoomed-out matrix image of the map after the obstacles have been added.

Figure 30 - Colored Map with Obstacles

Because of the zoom, the values of zero are not shown. Anything visible has a nonzero value; the black lines are the walls, red marks the detected walls, and green indicates the trashcan. The origin of the figure, from which the points are measured, is the bottom center of the image. The full code can be found in Appendix – ObstacleDetection.m.
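The map-writing step could be sketched as below. The map variable, its resolution (assumed here to be 10 mm per cell), and the robot's origin cell are placeholders for the project's existing map structure; bounds checking is omitted for brevity.

% Write detected obstacle pixels into the occupancy map.
CELL_MM  = 10;                         % assumed map resolution, mm per cell
OBSTACLE = 4;                          % value used to mark an obstacle

[rIdx, cIdx] = find(obstacleMask);     % pixel locations flagged as obstacles
for k = 1 : numel(rIdx)
    x = depthXYZ(rIdx(k), cIdx(k), 1); % horizontal offset from the robot, mm
    z = depthXYZ(rIdx(k), cIdx(k), 3); % depth (forward distance), mm
    if z > 0                           % skip pixels with no depth return
        row = origin(1) - round(z / CELL_MM);   % forward maps to up in the grid
        col = origin(2) + round(x / CELL_MM);   % right maps to right in the grid
        map(row, col) = OBSTACLE;
    end
end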
Obstacle Avoidance

Theory

Robotic pathing, or motion planning, is the ability of a robot to compute its desired movement through a known area by breaking the movement down into smaller steps. The path is calculated such that the robot could travel to its end destination in one large motion. The starting position and heading of the robot are assumed to be known, either from specific placement of the robot or from localization algorithms. A map is imported into the system showing the end destination of the intended motion along with all known obstacles within the map area. A specific pathing algorithm is then chosen, using the map as the basis of the path calculation.

Grid-Based Search pathing works much like the occupancy grid: the map is broken into a grid whose points are marked as either free or occupied. The robot is allowed to move to adjacent grid points as long as the cell is marked free. Constant checking by a collision avoidance program is needed for this type of pathing to make sure the grid points along the planned path are truly free. One consideration with this type of pathing is the size of the grid overlaid on the map; an adequate resolution has to be chosen depending on how much of the area of motion is free. Coarser grids, with cells representing a larger area, offer quicker computation time but sacrifice precision: a cell deemed free might actually be occupied over 40 percent of its area, which could lead to collisions in narrow sections of the map. On the other side of the coin, a fine grid has more precise cell values, but more time is needed to compute the path before operation can start.

Figure 31 - Example of valid and invalid path from grid pathing [16]
The Potential Fields method uses pseudo-charges to navigate to the destination. The end destination of the path is given a positive, or opposite, charge that the robot is drawn toward, while obstacles along the robot's motion toward the goal are assigned negative, or like, charges. The path is then determined as the trajectory vector resulting from the sum of all the charges. The advantage of the potential field method is that the trajectory is a simple and quick calculation. The drawback lies in becoming trapped in a local minimum of the potential field and being unable to find a path. Local minima can be planned and adjusted for by a number of methods: adding a small amount of noise to the system can bump the robot out of the minimum and back into a usable calculation, or obstacles can be given a tangential field to move the robot out of the minimum so the path can be recalculated [17].

Figure 32 - Vector field analysis and path of movement from Potential Field method

Execution

As seen in the Obstacle Detection section, several subroutines allow the robot to map the unforeseen obstacles within its path onto the map; these obstacles are represented by numerical 1s. As described in the Theory section above, there are a multitude of techniques that can be used to dynamically calculate a path that avoids these obstacles. The method chosen for this project is the Potential Field method. Adopting a behavior from nature, obstacles and goals are modeled with pseudo-charges; the contribution of each pseudo-charge is calculated, and the resulting velocity vector is used to drive the robot. As described in Michael A. Goodrich's tutorial on implementing the potential fields method, the behavior of goals and obstacles within the robot's operating environment is defined with a few parameters. For 2D navigation, (x_g, y_g) defines the position of the goal and (x, y) defines the position of the robot; the variable r defines the radius of the goal. The distance, d, and the angle, θ, between the goal and the robot are then calculated. By utilizing a set of if statements, various
behaviors can be defined with respect to the operating boundaries of the robot and its interaction with the goal:

If d < r:            Δx = Δy = 0
If r ≤ d ≤ s + r:    Δx = α(d − r)cos(θ),  Δy = α(d − r)sin(θ)
If d > s + r:        Δx = α cos(θ),  Δy = α sin(θ)

where α is the scaling factor of the goal-seeking behavior, d is the distance between the robot and the goal, r is the radius of the goal, s is the extent of the goal's influence field, and θ is the angle between the goal and the robot. The first equation applies when the distance between the robot and the goal is less than the radius of the goal; it defines the behavior of the robot once the goal has been reached. The second equation applies when the distance is at least the radius of the goal but less than the radius plus the goal's influence field. This condition is met when the robot is nearing the goal, and the equation scales the velocity vector the robot experiences until the goal is reached. Finally, the last equation defines the behavior of the robot when it is far from the goal.

The obstacles are defined in the exact opposite manner. Both the distance and the angle between the robot and the obstacle are calculated and used in the following equations, which are likewise contained within a set of if statements defining the robot's behavior in the various scenarios involving an obstacle:

If d < r:            Δx = −(cos(θ))·∞,  Δy = −(sin(θ))·∞
If r ≤ d ≤ s + r:    Δx = −β(s + r − d)cos(θ),  Δy = −β(s + r − d)sin(θ)
If d > s + r:        Δx = Δy = 0

where β is the scaling factor of the obstacle-avoiding behavior, d is the distance between the robot and the obstacle, r is the radius of the obstacle, s is the extent of the obstacle's influence field, and θ is the angle between the obstacle and the robot. The first equation applies when the robot is within the obstacle's radius; its behavior is then to move away from the obstacle with a very large velocity vector. The second equation again scales the velocity vector of the robot as it approaches the obstacle, and the robot experiences no contribution to its velocity vector if it is far from the obstacle. The overall velocity vector is the summation of the contributions of the attractive and repulsive forces exerted by the goals and obstacles within the robot's operating environment. The code implementing this method can be found in Appendix – PotFieldDrive.m.
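A condensed MATLAB sketch of these two contributions and their summation is shown below; the gains, radii, and spread are arbitrary illustrative values (the infinite repulsion is capped at a large finite number), and the goal and obstacle lists are assumed to come from the map.

function [vx, vy] = potentialFieldStep(robot, goal, obstacles, alpha, beta, s)
% robot, goal: [x y]; obstacles: N x 2 matrix of obstacle centers.
% alpha, beta: attraction/repulsion gains; s: influence-field spread.
rGoal = 0.1;  rObs = 0.3;            % illustrative radii

% Attractive contribution from the goal.
d  = hypot(goal(1) - robot(1), goal(2) - robot(2));
th = atan2(goal(2) - robot(2), goal(1) - robot(1));
if d < rGoal
    vx = 0; vy = 0;
elseif d <= s + rGoal
    vx = alpha * (d - rGoal) * cos(th);
    vy = alpha * (d - rGoal) * sin(th);
else
    vx = alpha * cos(th);
    vy = alpha * sin(th);
end

% Repulsive contribution from each obstacle.
for k = 1 : size(obstacles, 1)
    d  = hypot(obstacles(k,1) - robot(1), obstacles(k,2) - robot(2));
    th = atan2(obstacles(k,2) - robot(2), obstacles(k,1) - robot(1));
    if d < rObs
        vx = vx - cos(th) * 1e3;     % large finite value in place of infinity
        vy = vy - sin(th) * 1e3;
    elseif d <= s + rObs
        vx = vx - beta * (s + rObs - d) * cos(th);
        vy = vy - beta * (s + rObs - d) * sin(th);
    end                              % beyond s + r: no contribution
end
end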
Elevator Navigation

The elevator navigation component uses a modified and extended template matching system. The logic behind the system is that if it can tell it is in front of the panel, it can discern between the different buttons and therefore monitor which are pressed and which are not. The system first scans the current panel and determines whether or not the desired floor button is pressed. If the button is not pressed, it prompts the user to press it and continues scanning until the button is detected as pressed. The exact detection process is detailed in the following section. With the button verified as pressed, the system then checks continuously until it sees that the button is no longer pressed; this signifies that the correct floor has been reached, and the robot proceeds to exit the elevator.

Process

The first step in the process is capturing a template image. This is done only once and can be applied to all of the elevators that use the same panel. The image is captured using the Image Acquisition Toolbox with the webcam selected. Figure 33 below shows how the call is made to the webcam. The color space is set to grayscale to simplify the template matching process, and the frames per trigger is set to one so that the program captures only one image each time it is called.

Figure 33 - Image Acquisition

The template is created and cropped to feature only the actual panel itself, as shown in Figure 34. This template image is stored and used for every run; it does not change.
Figure 34 - Elevator Panel Template

With the template captured, the process is able to run. With the robot in place facing the panel, it begins to capture new images. Each image is matched against the template image and compared to find similar points, using the same method as the localization. In order to analyze the current image, however, a geometric transformation is performed; this automatically scales and crops the captured image to match the template [18]. Figure 35 shows MATLAB matching the different images.

Figure 35 - Template Matching with the Captured Image

This is done so that the individual buttons can be analyzed. The image is sectioned off into rectangles, one for each button, by creating a matrix of coordinates consisting of the pixel locations of rectangles that encompass each button. This allows the function to look for a certain floor and know how to crop the image accordingly. The cropping matrix for this elevator panel is shown in Figure 36 and only works with this template.
Figure 36 - Cropping Matrix

Setting the desired floor to one, so that the system gets off at the lobby, causes the program to crop the image as shown in Figure 37.

Figure 37 - First Floor Crop

With the image cropped correctly, the actual button needs to be analyzed to determine whether or not it is pressed. To do this, the image is converted to strictly black and white. The illuminated button creates a large number of white pixels, as shown in Figure 38, which are nonexistent when the button is not pressed.

Figure 38 - Black and White Conversion

With the white pixels showing, it is easy to count them and use that number to determine whether or not the button is pressed. This is done with the number-of-nonzeros (nnz) function built into MATLAB, as demonstrated in Figure 39. The nnz function analyzes a matrix, in this case the black and white image, and counts the number of elements that are non-zero, in this case white pixels [19]. If that number is large enough, it tells the program that the button is pressed, and the program continues running.

Figure 39 - NNZ Function
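The button check itself might reduce to something like the following sketch; the crop rectangle matrix, the binarization threshold, and the white-pixel count are illustrative values rather than the project's tuned numbers, and registeredPanel stands for the geometrically transformed capture.

% Crop the registered panel image down to the rectangle around the desired
% floor button, then threshold it and count the lit (white) pixels.
% cropRects is assumed to hold one [x y width height] row per button.
buttonImg = imcrop(registeredPanel, cropRects(desiredFloor, :));

bw = im2bw(buttonImg, 0.8);          % high threshold keeps only the lit button
whitePixels = nnz(bw);               % count the non-zero (white) pixels

buttonPressed = whitePixels > 500;   % illustrative threshold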
This process is nested within itself: after determining that the button is pressed, it repeats, using the same process to determine when the button is no longer pressed, which signals that it is time to exit the elevator. The full code can be found in Appendix – ButtonDetection.m.

Exit Elevator

Part of navigating the elevator involves being able to physically leave it. The system is set to drive out using prewritten commands. Originally, depth measurements were going to be used; however, other visitors in the elevator threw off the ability to find the walls, so the only way to move consistently was with scripted steps. The full code can be found in the appendix.
Subject Identification

A QR code is the current choice for verifying a visitor's identity, and it was chosen for multiple reasons. Because a QR code is essentially a barcode, it is very simple to generate, especially with the help of various internet sites. Now that most people carry smartphones that can decode the codes with an app, they are becoming an increasingly popular way to share information. Figure 40 below shows an example of a standard code.

Figure 40 - Standard QR Code

A code, in this case a static ID, can be input, and a high-resolution image of the code can quickly be created and emailed to a client. The QR code was also chosen for ease of use, mainly with respect to the client. With better-known methods, such as facial recognition, a database of client faces would need to be created. Not only would this be extremely tedious, it could make clients who do not want their photograph taken feel awkward. Because it is a barcode, the QR code can easily be decoded into strings that can be stored in a simple database, which is much easier than sorting through a gallery of photographs. The last reason it was chosen was the availability of free software that is compatible with MATLAB. The current working code for the QR scanning can be found in Appendix – QRCodeDetection.m.

Process Explanation

The system is designed to be as user friendly as possible. Due to the lack of a video display, audio cues are used to guide the visitor through the process. Once the robot reaches the lobby, it begins to scan for a QR code. The visitor, having been told about the process prior to the visit, displays the code on their phone (or a piece of paper) in front of the camera. The system then scans it and either confirms who they are or rejects the code. If it accepts, it greets the visitor by name with an audio cue and asks them to follow it to the elevator. If it rejects, it prompts the visitor to call and schedule an appointment, or to call for assistance.
QR Code System

The versatility of the QR code allows unique codes to be generated easily. This project takes advantage of that and lets someone send out unique codes to all of their expected visitors. These unique codes not only allow personalization for visitors but also add a layer of security: while the same code would scan for anyone who presents it, a code that is scanned twice can easily raise a flag. The process of setting up a code is kept very simple so that the entire ordeal is no more difficult than sending out a normal email; if setting up the system takes too long, the time lost is no better than having someone greet the visitor in person, defeating the purpose of the project. First a code is created, whether randomly or by design, and assigned to someone's name, as shown in Table 2 below. The codes can normally be up to 160 characters long, but as shown in Figure 41, the longer the code, the more complex the image becomes. While some cameras can resolve the denser image, the RGB camera on the Xtion Pro cannot do so consistently, so shorter 10-character codes are used. The spreadsheet, which could be hosted on a website, in a private folder on a service such as Dropbox, or, for this proof of concept, directly on the laptop, holds these codes and names.

Table 2 - Database Example
Tyler Aaron      13689482
John Burchmore   2432158
Ian DeOrio       59824
Jeff Gao         209

Figure 41 - (a) 10 character code vs (b) 100 character code
Using the "xlsread()" function built into MATLAB, the program is able to take the spreadsheet and convert it into a usable matrix. Using a for loop, the program compares the value from the decoded image to the matrix from the spreadsheet. If a code matches one found in the matrix, the system considers the run a success, speaks the person's name aloud, and prompts the visitor to follow. In order for the system to read out a visitor's name, MATLAB must use Microsoft Windows' built-in text-to-speech libraries. Through a wrapper written by Siyi Deng © 2009 [20], MATLAB can access these libraries directly and use them to read out strings, i.e., the corresponding names from the matrix.

Code Decryption

Because QR codes are not a closed-source technology, many different programs and software packages are available for encoding and decoding them. One of the most popular open-source libraries is ZXing ("zebra crossing"), an open-source, multi-format 1D/2D barcode image processing library implemented in Java, with ports to other languages [21]. ZXing offers Android, Java, and web clients but, unfortunately, no simple way to incorporate the library into MATLAB. For MATLAB to use the ZXing libraries, it needs a wrapper; fortunately, Lior Shapira created and published an easy-to-use wrapper for MATLAB [22]. Shapira's wrapper consists of two simple MATLAB functions, "encode" and "decode", which call specific Java files to either encode or decode a code. The "encode" function takes a string as an input and uses ZXing's library to create a figure that displays the resulting QR code. However, the figure produced is not the most visually appealing, which is why a different online service is used to generate the codes emailed to visitors. The "decode" function takes an image as an input, in this case captured with the RGB sensor on the camera, and outputs the resulting message.
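A compact sketch of the lookup logic is shown below; the spreadsheet name, its column layout (names in the first column, numeric codes in the second), and the call to the decode wrapper are assumptions based on the description above, and the printed messages stand in for the text-to-speech output.

% Read the visitor database; xlsread returns numeric data (the codes) and
% text data (the names) separately.
[codes, names] = xlsread('visitors.xlsx');          % placeholder file name

% Decode the QR code in the current RGB frame using the ZXing wrapper
% (the exact wrapper function name may differ in practice).
scanned = decode(rgbFrame);

match = false;
for k = 1 : numel(codes)
    if str2double(scanned) == codes(k)
        match = true;
        fprintf('Welcome, %s. Please follow me to the elevator.\n', names{k});
        break;
    end
end

if ~match
    disp('Code not recognized. Please call to schedule an appointment.');
end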
Design Overview

Codes and Standards

The field of robotics is new and emerging, and standards are still being put in place. Two very relevant ISO standards are listed below.

"ISO/FDIS 13482: Safety requirements for personal care robots
ISO/AWI 18646-1: Performance criteria and related test methods for service robot -- Part 1: Wheeled mobile servant robot"

It is important to note that these standards are still in early stages of development and are constantly being updated and modified. Like the project itself, they are a work in progress, and new standards and codes will be adopted as they are instituted.
Model Concept Design

Figure 42a shows a Creo Parametric 3D model of the planned design for the robot, and Figure 42b shows the completed assembly. The goal of the design was to be simple yet secure. The model features a 1:1 scale iRobot Create¹ and Xtion Pro². The aluminum support bar³ can also be changed to a different size if need be. The assembled robot design changed slightly so that the height of the Xtion Pro could be adjusted if needed. A web camera was also placed on top of the aluminum support bar for QR code scanning and elevator detection.

Figure 42 – a) 3D Creo Model of Robot System b) Final Proof of Concept Design

The current housing, which can be seen in Figure 43, is designed to fit most small laptops/netbooks but can be modified if the laptop to be used changes. The housing is round not only to match the Create aesthetically, but also to minimize sharp corners poking out from the system. The open design was chosen because the reduction in materials keeps the weight at a minimum, and because fabrication of the system is much easier without having to create a round wall.

¹ Courtesy of Keith Frankie at http://www.uberthin.com/hosted/create/
² Courtesy of GrabCAD user Kai Franke at https://grabcad.com/library/asus-xtion-pro-live
³ Courtesy of 80/20 Inc. at http://www.3dcontentcentral.com
Each peg is screwed into the top and bottom plates, and four smaller pegs screw into the iRobot base and the bottom acrylic piece. The gap between the pegs is roughly six inches, not enough to slide a laptop through; because of this, the back peg can easily be removed so the laptop can be slid in, and then replaced.

Figure 43 – a) Creo Model of Housing b) Physical Model of Housing

The laptop housing is made of two 0.25"-thick clear acrylic pieces with a diameter of about 12". The top and bottom pieces are essentially the same, aside from the opening in the bottom piece for easy access to the iRobot Create's buttons. The six pegs are made from 2"-long threaded aluminum standoffs, which allows them to be screwed in between the acrylic pieces. On the top piece of the acrylic housing, the aluminum t-slotted extrusion extends about 36" and is attached using two 90-degree brackets. On the t-slotted frame, another 90-degree bracket is used to attach the 0.25"-thick acrylic Xtion platform, which allows the height of the Xtion camera to be varied. A close-up of the Xtion and its platform can be seen in Figure 44. The web camera on top of the aluminum extrusion is attached with a secured 90-degree bracket to allow for easy removal. The aluminum extrusion is also used for cord management.
Figure 44 – a) Close up View of Web Camera b) Close up View of Xtion Pro and platform

Between the iRobot Create and the bottom side of the laptop housing are four small nylon standoffs that provide some spacing for cords and modularity, which can be seen in Figure 45.

Figure 45 – a) Creo Model Close up of housing spacers b) Physical Model Close up of Housing Spacers

Physics

In order to calculate the center of gravity, a couple of assumptions need to be made. To start, the Asus Xtion Pro is assumed to be a uniform solid piece. In reality it is not completely solid, and its actual center of gravity could differ depending on how the internals are arranged. For ease of calculation, since its total weight and dimensions are known, the volume, and in turn the density, of the camera can be calculated for the "solid uniform" piece. A similar assumption is made for the iRobot Create: since the exact weights and positions of its internals are unknown, the robot is also assumed to be a uniform solid piece. Models of these two parts were created in PTC Creo and placed into the assembly in order to perform the overall center of gravity calculation. The known densities and calculated mass properties of all the materials used in the model can be seen in Table 3. PTC Creo uses these densities to calculate the volume and mass of each component. After running an analysis on the mass properties of the entire assembly, Creo outputs the x, y, and z coordinates of the center of mass of each component, as well as an overall center of gravity.
Table 3 - Mass Properties of Model Assembly

Part                Density (lb/in^3)   Mass (lb)   Center of Gravity X / Y / Z (in, w/ respect to Create_Robot_Axis)
iRobot Create       0.0353              8.8000       0.0000 /  1.0831 /  0.8225
Xtion Pro           0.0249              0.4784       0.0000 / 18.6422 /  0.0000
Aluminum Support    0.0979              1.3553       0.0000 / 11.4080 /  0.0002
Xtion Platform      0.0434              0.1049       0.0000 / 17.5326 /  0.0000
Laptop Housing      0.0434              2.4994       0.0000 /  4.1820 / -0.0074
Base Spacer 1       0.0434              0.0010       4.3884 /  2.6580 / -0.1022
Base Spacer 2       0.0434              0.0010       4.3884 /  2.6580 / -2.8522
Base Spacer 3       0.0434              0.0010      -4.3884 /  2.6580 / -0.1022
Base Spacer 4       0.0434              0.0010      -4.3884 /  2.6580 / -2.8522

The center of gravity coordinates listed above are with respect to the main axis of the iRobot Create model. That axis is located at (0, 0, 0), directly at the bottom center of the Create model, as seen in Figure 46.

Figure 46 - iRobot Create Coordinate System
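As a check on the Creo result quoted in the next paragraph, the overall center of gravity is simply the mass-weighted average of the component values in Table 3. The short script below, with the values transcribed from the table, reproduces it.

% Component masses (lb) and centers of gravity (in), from Table 3.
mass = [8.8000 0.4784 1.3553 0.1049 2.4994 0.0010 0.0010 0.0010 0.0010]';
cg   = [ 0.0000  1.0831  0.8225;    % iRobot Create
         0.0000 18.6422  0.0000;    % Xtion Pro
         0.0000 11.4080  0.0002;    % Aluminum Support
         0.0000 17.5326  0.0000;    % Xtion Platform
         0.0000  4.1820 -0.0074;    % Laptop Housing
         4.3884  2.6580 -0.1022;    % Base Spacer 1
         4.3884  2.6580 -2.8522;    % Base Spacer 2
        -4.3884  2.6580 -0.1022;    % Base Spacer 3
        -4.3884  2.6580 -2.8522];   % Base Spacer 4

% Mass-weighted average gives the assembly center of gravity,
% approximately (0.00, 3.49, 0.54), matching the Creo analysis.
assemblyCG = (mass' * cg) / sum(mass);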
According to the Creo analysis, the calculated center of gravity for the entire assembly is located at (X, Y, Z) = (0.0000, 3.4898, 0.5448). These numbers make sense when looking at the model. With the x-axis at the absolute center and all components centered on the model, the x-coordinate should be zero. The y-coordinate is expected to be just slightly above the Create because of the extra weight above it and because the majority of the weight is in the Create itself. The z-coordinate should be slightly biased toward the front of the robot, since more weight is located there.

Parts List

All of the major components and raw materials used in the design of the robot are readily available to consumers. The iRobot Create, Asus Xtion Pro, and Acer Ferrari are the three major components that are off-the-shelf products, and the Creative Live web camera is also readily available. The laptop housing and Xtion support materials required fabrication in the machine shop to the desired specifications; the acrylic pieces were all modeled in AutoCAD and laser cut to the required specifications. A complete list of all parts and materials can be seen in Table 4.

Table 4 - Final Parts List
Simulation / Experimental Validation Plan

Description of the Test Bed

Bearing Localization

The Xtion Pro takes a template photo about 80 cm from the wall in the SAS lab; this is the goal orientation of the system. For the template photo, the robot is aligned with the wall such that the plane of the Xtion camera is parallel to the wall. This original orientation of the robot corresponds to a heading of zero degrees. With the template taken, MATLAB is used to turn the robot to a random angle in the positive and/or negative direction; because the field of view of the camera is 58 degrees, that value bounds the random number. Using the function shown in the Localization-Bearing Template Method section, the robot then captures a new image to compare to the template image. Feature detection is used to determine any differences between the template and the new image. The displacement of matched features is then related to an angle, and the robot turns by that angle back toward the zero-degree heading. After the turn has been made, a new comparison image is taken and the comparison process is repeated. If the measured pixel displacements between features are small enough, as explained in the validation section of the report, the process ends; if the resulting displacement is too large, the displacement is again converted into an angle and the robot turned. The process is repeated until the system reaches the zero-degree heading, within a certain level of uncertainty.

Using the flat-wall assumption program described in the Localization-Bearing section of this report, another method of determining the pose of the robot is performed. The robot is set at a distance of 80 cm from the wall, oriented so that the plane of the Xtion camera is parallel to the wall. This validation only works well if the robot is facing a wall without any major protrusions. The robot is then commanded to turn a random number of degrees in the positive and/or negative direction. As described earlier, the program takes measurements at two different points, one on either side of the robot, and then uses trigonometry to calculate the angle to turn in order to make both measurements equal. The program runs in a loop until the measurements are determined to be equal, and the validation is over. The corrected angle is output in MATLAB, where it can be compared to the initial angle of offset.
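One plausible form of the flat-wall trigonometry is sketched below; the exact pixel locations sampled and the loop tolerance are assumptions, since the report does not list them.

% Estimate the robot's heading error relative to a flat wall from two depth
% samples, one on the left and one on the right side of the depth image.
% depthXYZ is the 640 x 480 x 3 matrix of XYZ values (mm); the sampled
% row and columns are illustrative choices.
row = 240;                                                  % middle row of the image
xL = depthXYZ(row, 100, 1);  zL = depthXYZ(row, 100, 3);    % left-side point
xR = depthXYZ(row, 540, 1);  zR = depthXYZ(row, 540, 3);    % right-side point

% For a flat wall, equal depths on both sides mean the camera plane is
% parallel to the wall; any difference corresponds to a yaw offset.
yawError = atan2d(zR - zL, xR - xL);

% Turn by -yawError (placeholder drive command) and repeat until the two
% depth readings agree within a small tolerance.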