  1. 1. Table of Contents:
     I. Detailed Proposal Information (p. 2)
     A. Innovative Claims (p. 2)
     B. Proposal Summary (p. 3)
     C. Research Objectives (p. 4)
     D. Technical Approach and Evaluation (p. 6)
     E. Statement of Work (p. 22)
     F. Schedule Graphic (p. 25)
     G. Teaming and Tasking (p. 26)
     H. Project Management and Interaction Plan (p. 27)
     I. Deliveries Description (p. 28)
     J. Technology Transition and Technology Transfer Targets Plans (p. 29)
     K. Personnel and Qualifications (p. 30)
     L. Facilities (p. 35)
     M. Cost Summaries (p. 36)
     N. Organizational Conflict of Interest Affirmations and Disclosure (p. 38)
     O. Intellectual Property (p. 38)
     P. Human Use (p. 38)
     Q. Movie and Slides (p. 38)
  2. 2. I. Detailed Proposal Information
     A. Innovative Claims
     Mobile wireless networks operating indoors face a bewildering, shifting radio environment to which they must adapt in order to provide robust communications. Heretofore, network choices have been limited to radio and network parameters. Small robots open the possibility of network nodes that proactively move and sense in the real environment to improve the radio environment. Robot motion is energy-intensive and so must be used parsimoniously in order to conserve energy for network communication. The fundamental questions become: 1) How do we operate in real space to optimize in radio space? and 2) How can we exploit networking, mobility, and sensing to maximize network longevity? Machine learning and distributed control techniques offer powerful tools to connect the actions, sensing, and goal space.
     Dynamically redeploying nodes to meet network needs offers a powerful new dimension to optimize network performance. In the LANdroids scenario, small autonomous robots operate in unknown indoor environments in order to self-configure a network that provides long-lived connectivity between mobile end users and a gateway. Our approach has two main elements: 1) combined learning and control strategies to manage network connectivity in new and complex environments; 2) identifying the critical energy tradeoffs at both design and operational stages.
     The proposal addresses four important innovations in controlled mobility for wireless networks:
     Network Radio Optimization: The strength of indoor radio links can be increased through relatively small movements. However, the LANdroid must relay, and so signals must be optimized as a network and not for individual links. Our approach to network radio optimization identifies spatial features, such as hallway corners, that yield robust improvements to network performance. These features will be identified through detailed radio surveys and analysis.
     Existing radio technologies (e.g. 802.11n MIMO and directional antennas) will be assessed for their system benefits.
     Model-Free Multivariable Extremum Seeking: Control theoretic approaches applied directly in signal space can optimize signals as a network. Our model-free approaches are robust to environmental changes, simple to apply, and provide basic optimization behaviors.
     Learning Spatial Features and Optimal Control Strategies: Learning provides a means to identify useful spatial features in new environments. We apply our experience in reinforcement learning to optimally combine feature-based and gradient-based optimization.
     System Energy Management Optimization: Network longevity is a function of energy spent on mobility, sensors, processing, transmission, and network overhead. We apply optimization over all network resources to exploit these tradeoffs in a unified framework. System design issues such as which sensors are most effective, where to do computing, and implicit sharing of battery energy are addressed. Special attention will be paid to resolving the role of robot vision.
     The University of Colorado (CU) has had a focused research effort over the past several years to address these problems. Our interdisciplinary team has produced theoretical foundations and practical algorithm implementations, outlined in numerous publications, that place us in a unique position to implement viable solutions through each of the above innovations.
  3. 3. B. Proposal Summary
     Main goals: The goal of LANdroids is to establish and maintain network connectivity while maximizing network longevity. Our proposal has four technical goals: to identify exploitable features of the radio environment (Page 8); to develop robust self-forming, self-healing, self-optimizing, and energy-aware network protocols (Page 9); to learn combined optimization techniques that operate in real and signal space (Page 10); and to understand the system architecture that will maximize performance at low cost (Page 16).
     Tangible benefits: In rich, complex radio environments, the radio structure provides an opportunity to improve connectivity and increase network longevity. This research will enable small robots to autonomously establish and maintain mesh coverage, extend network reach, dynamically adapt to network changes, and maximize network longevity (Page 6).
     Critical technical barriers: Current radio models are insufficient for predicting indoor radio performance. Radio analysis focuses on link connectivity rather than network connectivity. Battery-constrained devices cannot simply increase power indefinitely to maintain connectivity. Current mesh networks cannot exploit controlled node mobility. Research in controlled mobility optimizes in real space and requires expensive localization techniques such as GPS. Simple robot controllers do not work robustly in dynamic or cluttered environments. The LANdroid robot must be low cost.
     Main elements: We will experimentally evaluate the indoor radio environment to identify exploitable radio and key spatial features (Page 8). We will design gradient-based methods for connecting mobility in real space to network goals (Page 10). Robots will learn to find the key spatial features (Page 12). Learning will optimally combine gradient-based and feature-based control (Page 13). Robot vision will be examined for its overall benefit to the system (Page 14).
     Robust network coordination, resource management, and routing protocols will enable a system approach to sharing network energy resources (Page 9). System-level analysis will identify the benefits of specific architecture and hardware designs (Page 16).
     Summary of approach: Our network approach to radio analysis will yield better tools for predicting network performance. The control approach will draw on our extensive robotics control experience to combine goals defined in spaces other than real space. Unlike previous work, we select features and control strategies based on successful experience guided by human operators. The system approach to architecture and hardware design will better explore the performance tradeoffs within the robot cost constraints.
     Expected results: The results will be four-fold: 1) a better characterization of the exploitable features of the indoor environment, including collected measurement data sets; 2) a LANdroid system architecture that defines the most useful sensors, types of antennas, and processing model; 3) a modular software library that will implement the proposed learning and control algorithms; and 4) communication protocol implementations for the mesh networking and the interface between robot hardware and control software.
     Evaluation plan: The LANdroid program, as designed, has a rigorous performance-based evaluation done four times per year throughout the program. This is a uniquely objective and appropriate evaluation scheme in which we look forward to participating. This will be augmented by an internal CU evaluation test bed based on our extensive wireless network test bed and field experience.
     Cost: Year 1 (12 mo.): $999,749; Year 2 (12 mo.): $999,489; Year 3 (12 mo.): $999,979
  4. 4. C. Research Objectives
     C.1 Problem Description: The research objectives of this proposal are 1) to identify the system architecture, hardware, and strategies which can best exploit the indoor and urban radio environment and 2) to develop a robot learning and control framework that meets the network goals of connectivity and longevity in a complex dynamic environment.
     C.2 Research Goals
     Radio Environment Characterization: It is not well understood what radio optimizations are possible in a dynamic, cluttered indoor environment. The LANdroid antennas are electromagnetically shorted by their close proximity to the ground and are therefore inefficient. Small movements may be able to improve signal strength, but these improvements may be short-lived, hard to track, and better captured by other means such as diversity or MIMO. A key insight is that the primary purpose of the LANdroid is to relay. It is not enough for the signal strength to be better for a single LANdroid neighbor. Therefore, the LANdroid must make a joint optimization of the signal strengths of all its neighbors. We will perform a measurement survey in an indoor environment to evaluate the nature of signals near the ground and the stability of small-scale improvements. We will identify and characterize the spatial features which are likely points of good signal. We will study the role of different antenna types in exploiting the radio environment for the benefit of the LANdroid system.
     LANdroid System Hardware: The LANdroid system consists of the gateways, LANdroid robots, and edge nodes. The radios on the warfighter and the gateway have unique capabilities that can be exploited for improved system performance. For instance, the gateway is expected to be vehicle mounted and well powered, while the warfighter can add intelligence to LANdroid drop decisions and provide the necessary mobility. These radios are generally better positioned, at higher elevation, than the LANdroid on the ground.
     Even among LANdroids, some may have better positions or more information about the system state. The different node types can have a variety of radio, sensor, or processor hardware which can improve their performance. We will perform system analysis to assess the cost, performance, and energy tradeoffs for these components within the LANdroid system. Special attention will be paid to the role of antennas and video cameras since they may be critical to good performance. This analysis will provide guidance to the LANdroid robot development.
     System Energy Model: The control software requires a detailed power model in order to make rational decisions between different actions. The model needs to consider both power consumption and stored energy of individual nodes and the system as a whole. This model will be the basis of the analysis, planning, and optimization described below. This model will also enable specific analysis of hardware components and their role in the LANdroid system.
     Decentralized, Model-Free, Gradient-based, Motion Optimization in Signal Space: The robot needs basic behaviors to connect actions in real space to network goals in signal space. Control theoretic approaches applied directly in signal space can optimize signals as a network.
  5. 5. In a complex environment, fewer a priori assumptions are generally better. Our model-free approaches are robust to environmental changes, simple to apply, and provide basic optimization behaviors. The control techniques will be modified to operate directly on performance gradients since, ultimately, the network cares about performance measures such as throughput and latency rather than signal strength.
     Learning Spatial Features: To take advantage of spatial features (such as hallway corners), the robot must be able to efficiently identify them in the environment. However, these features are generally complex concepts and can appear in many forms. Human operators will present desirable spatial features to the robot so that it can learn to identify these features. Offline learning will be performed in the laboratory, while online learning will be used in the field to adapt the representation of the features to the local environment. Initially this identification will be on a single-robot basis, but distributed methods will be developed to enable enhanced identification at lower energy costs.
     Learning Optimal Control Strategies: The robot has a large number of options it can perform at any given time, ranging from sensing and communicating to moving. The CU team has already developed successful learning-based control strategies for the DARPA LAGR project. These will be adapted to the LANdroids environment. Initially the learning will be guided by a human operator. The goal will be to optimally combine gradient-based and spatial-feature-based control techniques.
     Vision-Based Cost Estimation (VCE): Video is potentially very useful in the LANdroid system. It will enable more energy-efficient path planning since it can avoid obstacles and dead ends. Video can identify spatial features at a distance without movement. Video provides a more stable reference frame. However, high frame rate and high resolution video is expensive in power.
     Therefore, we will modify existing algorithms so that they can use reduced-capability video. Video will also be used to enhance network situational awareness, whereby robots can visually estimate relative positions and track future paths by watching other robots or the warfighter.
     Network Protocols: Ad hoc networks enable the LANdroid system to provide extended connectivity. While many ad hoc protocols exist, none of them explicitly include controlled mobility as a network primitive. Further, the routing must consider other network resources such as energy at each node. The network must also efficiently and robustly enable the cooperation and sharing of network resources. A key problem will be protocols for finding other nodes when the network is disconnected. The research will initially focus on connectivity as the primary network goal. However, other quality of service measures will also be considered.
     C.3 Expected Impact
     This proposed project will improve our insights into how to tie radio space goals tightly to robot mobility. If successful, it will solve open problems in how to use mobility to enable network self-configuration, dynamic tethering, intelligent relaying around obstacles, self-healing, and self-optimization while extending network longevity. We will deliver a set of modular software libraries which implement the technology described above. These libraries will constitute the first integrated controlled mobility wireless network implementation.
  6. 6. D. Technical Approach and Evaluation
     Our approach is based on optimizing the overall LANdroid system. The LANdroid system consists of three major components which each contribute to the overall goals of network connectivity and longevity as shown below.
     |          | Gateway | LANdroid | Edge node (Warfighter) |
     |          | Comm end point and relay | Comm relay | Comm end point and relay |
     |          | Permanent asset | Reusable/disposable | Permanent asset |
     |          | Cost $10k | Cost $0.1k | Cost $1k |
     | Features | Energy rich | Energy constrained | Energy constrained |
     |          | Computing rich | Computing constrained | Computing constrained |
     |          | Mobility exogenous to comm | Comm controls mobility | Mobility exogenous to comm |
     |          | Can have multiple radios | Can have multiple radios | Can have multiple radios |
     |          | Can have multiple antennas | Can have multiple antennas | Can have multiple antennas |
     | Roles    | Participate in LANdroid system protocols | Participate in LANdroid system protocols | Participate in LANdroid system protocols |
     |          | Bridge to other networks | Can self-sacrifice for connectivity and network longevity | Initialize LANdroid |
     |          | Support network functions | | |
     The main evaluation tasks center on self-configuration, tethering, intelligent relaying, self-healing, self-optimization, and intelligent power management. We describe at a high level our vision for each of these and provide details in the following section.
     Self-configuration: There are three separate problems here. The first is, given a set of connected nodes, how best to connect edge nodes to the gateway. This is the basic problem of ad hoc networking. Standard MANET protocols will form the basis of our solution. However, these protocols will be augmented to include power, mobility, learning, and sensing resource management as noted below.
     The second problem is, given a disconnected set of nodes, how they find each other to form a connected network.
     We will investigate a number of strategies, such as high-power beacons and coding that will extend range and help nodes detect the presence of other nodes, as well as systematic search algorithms. Note that the gateway can support higher power radios and more sophisticated antennas that can potentially provide deeper (but only one-way) signal penetration. In addition to these radio solutions, the LANdroid will search and reason over likely physical locations that will be better connectivity locations (e.g. hallway intersections or doorways).
     The third problem is how to signal to the warfighter when to drop a LANdroid. Here a cooperative solution will form between the warfighter radio and the carried LANdroid. The warfighter radio will monitor its network connectivity to the gateway. As the connection degrades, an estimate will be made of the path back to recent points of good connectivity. When the signal has degraded sufficiently, the LANdroid will signal to be dropped. The dropped LANdroid is closer to the ground, has a worse signal, and may be disconnected. It proceeds back toward the last good connectivity point using the protocol for disconnected nodes but with a prior distribution on the best search path. Note that the drop decision is not simply a signal strength threshold. A tethered LANdroid following a warfighter may signal that it is turning a corner and about to solve the weak link, removing the need to drop. Or, a warfighter in a room-to-room search may have a pattern of fluctuating signals that warrants weighing signals over a longer period.
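The point that a drop decision should weigh signals over a longer period, rather than trip on a single weak reading, can be illustrated with a minimal sketch. This is not the proposal's drop protocol: the class name `DropAdvisor`, the exponential smoothing, and every numeric value (threshold, smoothing factor, hold count) are hypothetical choices made only for illustration.

```python
class DropAdvisor:
    """Hypothetical drop advisor: smooths signal readings before deciding.

    All parameter values are illustrative assumptions, not figures
    taken from the proposal.
    """

    def __init__(self, threshold_dbm=-80.0, alpha=0.2, hold_samples=5):
        self.threshold_dbm = threshold_dbm  # smoothed level regarded as "degraded"
        self.alpha = alpha                  # weight given to the newest reading
        self.hold_samples = hold_samples    # consecutive degraded samples required
        self.smoothed = None
        self.below_count = 0

    def update(self, rssi_dbm):
        """Feed one signal reading; return True when a drop is advised."""
        if self.smoothed is None:
            self.smoothed = rssi_dbm
        else:
            # Exponentially weighted moving average over past readings.
            self.smoothed = (self.alpha * rssi_dbm
                             + (1.0 - self.alpha) * self.smoothed)
        if self.smoothed < self.threshold_dbm:
            self.below_count += 1
        else:
            self.below_count = 0
        return self.below_count >= self.hold_samples
```

With these assumed settings, readings that alternate between -60 dBm and -90 dBm (as might occur in a room-to-room search) never advise a drop, whereas a link that falls to -95 dBm and stays there does after a short delay. A fuller version would also accept hints such as a tethered LANdroid announcing that it is about to turn a corner.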
  7. 7. Tethering: Tethering encompasses all of the network morphing and stretching needed to provide connectivity to edge nodes. We envision tethering that is both proactive and reactive. In reactive tethering, when an edge node signal starts to degrade, the network might first choose to increase transmitter power. If the signal continues to degrade, LANdroids will perform a gradient-based distributed optimization to improve the edge node connectivity.
     Proactive tethering considers the problem of multiple LANdroids in close proximity. Based on traffic, remaining energy, and other factors, the LANdroids will spread out in order to increase the physical area encompassed by the connected network. The spreading will be guided by signal gradients and the physical environment. This will help to connect disconnected nodes and to avoid network disruptions as warfighters move through an area.
     Intelligent relaying: Intelligent relaying requires decisions on how traffic is routed and how the LANdroid relays choose to position themselves. Traffic routing will use energy-aware algorithms that can load balance rather than use simple shortest path. Since LANdroid movement is relatively expensive in terms of battery energy compared to computing, communication, and sensing, the LANdroid must be parsimonious in its movement choices and consider carefully where it stops. We will use learning mechanisms to identify key spatial features, such as hallway corners, which are likely to serve as good relay points now and in the future. The same mechanisms will be used to predict whether the expected energy cost of a move now is likely to be rewarded with lower future energy cost or better connectivity.
     Self-healing: The network must adapt to changes that are external or internal to the network. Internal changes include new sources of traffic that cause congestion, or network elements that fail.
     External changes include new sources of interference or changes to the environment such as a door that closes. These problems are addressed first by adaptive networking protocols that route around points of congestion, interference, or failed nodes. Then learning techniques identify when such changes warrant repositioning of the network. The actual movement can be guided by gradient techniques. Some changes are predictable (e.g. a node running out of power), and if needed the network can proactively respond to the upcoming change.
     Self-optimization: The LANdroid network operates in an environment that varies over space and time. The problem is determining which optimizations over space are stable enough over time to warrant spending energy to seek out. Edge nodes are mobile and the signal strengths are highly dynamic. However, many LANdroid nodes will be stationary and so become stable points to optimize around. Cooperative protocols will share information about the environment. For instance, reinforced concrete has high penetration loss, while wood frame construction has significantly less. Nodes that can cooperatively identify the construction can adjust their gradient and node placement algorithms. In reinforced concrete, positioning at doorways and corners is critical, while in wood frame, distributing more uniformly is optimal. Antenna technologies may provide significant gains in this environment. For instance, a high-gain antenna can reduce multipath and increase effective signal strength. We will survey the current state of available antenna types from a system perspective to understand which are most useful for the LANdroids scenario.
     Intelligent power management: The LANdroid system has many opportunities to trade between sensing, communication, processing, and movement in order to conserve network energy. We will study these tradeoffs within the cost constraints of the gateway, LANdroid, and edge nodes.
     For instance, our initial hypothesis is that a video camera will be a key energy-saving component. It will enable more energy-efficient path planning since it can avoid obstacles and dead ends. Video can identify spatial features at a distance without movement. Video provides a
  8. 8. more stable reference frame. For instance, if a node suddenly becomes isolated, it can use video references of past locations to backtrack even when no radio signal is present or after dead reckoning has been lost because the robot was kicked. These benefits must be weighed against video’s cost and energy drain. Similarly, other radio, antenna, sensor, and processing tradeoffs will be studied.
     Energy will be managed as a system. LANdroids can altruistically spend power on communication, movement, processing, and sensing in order to preserve critical nodes’ energy. Routing can avoid using critical nodes. Nodes can move to make the critical node’s communication easier. Computing can be offloaded to other nodes or sent to the processing-rich and energy-rich gateway.
     D.1 Technical Approach
     The technical details outlined below reflect our experience in radio frequency environments, wireless network implementation, controlled mobility in networking, distributed cooperative control, and robotic navigation. These technical approaches represent years of research, implementation, and testing in real environments by the PIs. As such, our proposed solution does not represent fundamental research, but rather the application of sound and tested approaches to the LANdroid system.
     Radio Environment Characterization: It is not well understood what radio optimizations are possible in a dynamic, cluttered indoor environment. Radio signals are well known to vary by tens of dB over both small (~ one wavelength) and large distances. Small movements may be able to improve signal strength, but these improvements may be short-lived, hard to track, and better captured by other means such as diversity or MIMO. Large-scale movements are known to provide significant improvements. Some of these may be more critical to network connectivity (e.g. a relay point around a corner) while others may be more ephemeral (e.g. a better location within a room).
     This characterization is needed for reliable planning and optimization. A key insight is that the primary purpose of the LANdroid is to relay. It is not enough for the signal strength to be better for a single LANdroid neighbor. Therefore, the LANdroid must make a joint optimization of the signal strengths of all its neighbors.
     There are a number of challenges for a small robot located close to the ground. At 2.45 GHz, the wavelength is large, and the robot is electrically small as well as electrically close to the perfect or imperfect ground, resulting in a low-gain antenna with possibly low impedance. Unconventional antennas will need to be investigated from a system standpoint, along with ways to mitigate heavy multipath effects. An example is a combination of four corner-cube loaded monopole antennas (used often in the millimeter-wave region due to their simplicity, e.g., [Gro89]), with four beams of the radiation pattern pointing roughly 40 deg from the horizontal plane, and with a local common ground which makes the antenna relatively insensitive to the surface properties. The four elements can enable spatial and polarization diversity, and a combined monopole-loop or frame antenna feeding the corner-cube reflector can in addition enable field diversity [Jak94]. A relevant figure of merit is the level of independence of the different diversity levels, which will affect the diversity combining, as shown, e.g., in [PoP02, ZRP05]. Several such antennas which satisfy the robot space constraints will be investigated in terms of the influence of their combined radiation patterns on the network system optimization.
     To characterize the environment we will rely on empirical measurements combined with analysis. We will place dense grids of radio nodes in a space and simultaneously measure variations in the environment between different node pairs over time to capture the network value of different locations and their temporal stability. This data will feed measurement-based simulations of network performance.
  9. 9. One challenge with small robots is that they are close to the ground. Figure 1 shows that radios at one meter above the floor have significantly better reach and can maintain greater than 10 Mbps around one hallway corner and greater than 1 Mbps around two hallway corners. However, placing either or both on the floor causes the rate to drop below 1 Mbps at the first corner, and communication ceased around the second corner.
     [Figure 1: Throughput vs. locations along a hallway between an 802.11n MIMO AP and laptop. Axes: throughput (Mbps, 1 to 100) vs. location (1 to 7), with the hallway corners marked. Curves: both at 1 meter; one on the floor; both on the floor.]
     One goal will be to understand whether simple modifications to antennas will improve this situation. However, a more critical question is the role of the antenna in the larger network optimization problem:
     • Multiple directional antennas pointing in different directions indicate the angle of arrival. This aids navigation and gradient following. In the indoor multipath environment, following the strongest signal may lead to dead ends, as in the figure noted below. The second strongest direction can indicate alternative paths that would assist navigation.
     • Directional antennas improve connectivity without expending more power. They reduce interference and increase SNR, potentially avoiding the need to move the robot.
     • Small-scale radio effects can be explored without movement. Multiple antennas provide diversity against fading and react at electronic switching speeds rather than robot motion speeds.
     [Figure: Tethered LANdroid follows signal gradient into closet.]
     We emphasize that the goal of this work is to ensure that system considerations are incorporated into the antenna design. The antenna technologies discussed here are all COTS and are not in themselves a focus of the research.
     Along the same lines, we consider other radio enhancements that can be leveraged to save power.
     Performance can be improved by changing channels. Different channels will observe different multipath fading and can avoid interference and jamming. Simple RF measurement devices can be built into the robot that allow it to efficiently survey the spectrum and find channels that have less noise, interference, or jamming.
     Energy Aware Ad Hoc Routing: The energy cost to deliver a packet across a network depends on the route the packet follows and the power at which it is transmitted along the way. For typical IEEE 802.11 interfaces, the dissipated energy is not a strong function of transmit power and is in fact dominated by the large fraction of time that the interface spends idle awaiting reception. This suggests two directions for research. First, saving energy in the transmitter will best be achieved through shutting down the interface as often as possible. We will explore simple protocols such as shutting down interfaces when the channel is idle (say for 10 msec) but waking up at synchronized periodic intervals (say every 100 msec) to send, relay, and receive traffic as long as there is network activity. This will enable the delay target of less than 500 msec to be met while providing a significant potential for energy savings. Second, we observe that the transmit power can be greatly increased at little energy cost.
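The sleep/wake figures quoted above (10 msec idle timeout, 100 msec wake interval, 500 msec delay target) can be related with some back-of-the-envelope arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes a pessimistic model in which a packet advances at most one hop per wake interval and an idle node stays on only for the idle timeout in each interval.

```python
def radio_on_fraction(idle_timeout_ms=10.0, wake_interval_ms=100.0):
    """Fraction of time an otherwise idle interface stays powered.

    Assumes the node wakes once per interval and shuts down again
    after the idle timeout expires with no traffic.
    """
    return idle_timeout_ms / wake_interval_ms


def worst_case_delay_ms(hops, wake_interval_ms=100.0, per_hop_tx_ms=0.0):
    """Upper bound on end-to-end delay under the one-hop-per-wake assumption."""
    return hops * (wake_interval_ms + per_hop_tx_ms)


def max_hops_within(budget_ms=500.0, wake_interval_ms=100.0, per_hop_tx_ms=0.0):
    """Largest hop count whose worst-case delay bound does not exceed the budget."""
    return int(budget_ms // (wake_interval_ms + per_hop_tx_ms))
```

Under these assumptions an idle interface is powered about 10% of the time, and the delay bound stays within the 500 msec budget for routes of up to five hops when transmission time is neglected. Because wake-ups are synchronized, a packet may in practice cross several hops within one awake window, so the real delay can be much lower than this bound.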
  10. 10. This added power can facilitate range and connectivity. In our indoor experiments (using the same setup as in Figure 1) we compared many recent 802.11 COTS antenna technologies (e.g. MIMO, beam steering, etc.). The longest range was always achieved by an ordinary 802.11b card combined with a 1 W amplifier. In other words, raw transmit power is a useful dimension to explore. The goal here is to create a radio with variable transmit power that can go in steps from a few mW up to more than 1 W. The high end would be useful initially, since an unattached node could beacon at high power to find other nodes. This is orders of magnitude more efficient than the robot trying to find other nodes by moving. For instance, sending one 3 W beacon packet every second would require less than 10 mW power on average and is likely to find other nodes in seconds. Mobile searches would require much higher power to drive the motors and take much longer.
     High power transmission interferes over a large area, overloads the front end of nearby nodes, and quickly drains the battery. Thus the transmitters should send at lower power whenever possible. We have designed ad hoc routing protocols that include mechanisms for nodes to estimate the minimum power needed to close a link, and we will apply them to the LANdroid scenario [DBB02].
     Maximizing network lifetime can be facilitated through network-wide routing decisions. Simply using transmitter power as a metric and choosing minimum power paths does not maximize network lifetime. Traffic can get funneled through nodes with the best connectivity, causing their batteries to drain quickly. We use the concept of a maximum flow life curve [BrG01, BGZ01] to balance energy drains and effectively treat battery energy as a network resource. The accompanying figure shows the traffic flow carried in a network over time (averaged over 100 network instances).
The maximum flow life curve approach (MFLC) is able to increase network longevity by 50% (at 90% remaining flow) compared to minimizing the power cost of each route (MC).

Decentralized, Model-Free, Gradient-based, Motion Optimization in Signal Space: Decentralized model-free, gradient-based, motion optimization in signal space will be implemented using modifications of multivariable extremum seeking algorithms developed by the PIs for linear communication networks of unmanned aircraft [DiF07]. This approach brings a number of features that are ideally suited for LANdroid mobility control, including:
• Decentralization: Decentralized control schemes are characterized by local decision making in which a given agent selects its action based only on information it has gathered from its own sensors and data shared by its “neighbors”. The agent has no knowledge of the global state of the network or explicit global goals. However, through these local interactions group behavior emerges that achieves desirable global performance objectives. The main advantages of decentralized control schemes are their scalability and robustness to node or network failures.
• Model-Free: Multivariable extremum seeking (MES) [ArK02] controllers are adaptive, model free controllers designed to drive the set point of a dynamic system to an optimal, but unpredictable location defined by a performance function that is only known to have an extremum. Thus, the system can adapt to robot mobility limitations and the radio propagation environment without explicitly modeling them, and the mapping from signal space to physical space is implicitly considered.
• Gradient-Based: In order to find an extremum point in the unmodeled system, MES algorithms follow gradients in order to improve performance. This local control approach eliminates
  11. 11. the need to search large regions of the environment, reducing costly power loss due to mobility.
• Spatially Distributed: Network coverage tasks such as self-configuration, self-optimization, self-healing, and tethering can all be achieved through local control only. That is, local interaction rules can be designed that lead to optimal global behavior. This ability results from the spatially distributed structure of the problem, in which gradients of the global objective can be determined locally as functions of the state of an agent and its neighbors only (i.e. with no global information).

Figure 2. a) Self-configuration of a single LANdroid between source and destination. b) Self-optimization of LANdroid chain.

The MES algorithm developed by the PIs [DiF07] differs from standard MES algorithms in that the required external dither signal is provided by periodic motion of the robot about some center point, and in that we add an external ‘virtual plant’. It is a variation of the algorithm given in [ArK02], and therefore the stability and performance results from that work can be taken and applied. Recent work [KZA07] has extended the ES framework to nonlinear models that can capture the guidance level behavior of the LANdroid robots, reduce the required excitation (i.e. mobility), and eliminate the need for any positioning information.

A key strength of the MES framework is that a model of the environment and dynamical system is not needed. Thus, the approach can be applied directly in signal space in order to optimize the received signal strength.
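To make the dither-and-demodulate idea concrete, the following is a minimal sketch of a textbook perturbation-based extremum seeking loop in two dimensions. It is not the PIs' [DiF07] algorithm (there is no virtual plant and no robot dynamics); the signal map `signal_strength`, its peak location, and all gains and rates are hypothetical values chosen only so that the example converges.

```python
import math

def signal_strength(x, y):
    # Hypothetical smooth signal map with one peak at (3, -2); arbitrary units.
    # The controller below never looks inside this function, it only samples it.
    return -((x - 3.0) ** 2 + (y + 2.0) ** 2)

def extremum_seek(measure, start, steps=40000, dt=0.01,
                  dither_radius=0.2, dither_rate=5.0,
                  gain=0.02, filter_rate=1.0):
    """Drive a center point toward a local maximum of `measure` using only samples."""
    cx, cy = start
    avg = measure(cx, cy)  # slow running average of the measurement
    for n in range(steps):
        t = n * dt
        # The robot circles its current center point; this motion is the dither.
        dx, dy = math.cos(dither_rate * t), math.sin(dither_rate * t)
        j = measure(cx + dither_radius * dx, cy + dither_radius * dy)
        # High-pass filter: keep only the part of the signal caused by the dither.
        avg += filter_rate * dt * (j - avg)
        hp = j - avg
        # Demodulate: correlating with the dither direction estimates the gradient.
        gx = (2.0 / dither_radius) * hp * dx
        gy = (2.0 / dither_radius) * hp * dy
        # Integrate slowly (much slower than the dither) to move the center uphill.
        cx += gain * dt * gx
        cy += gain * dt * gy
    return cx, cy

cx, cy = extremum_seek(signal_strength, start=(0.0, 0.0))
print(round(cx, 2), round(cy, 2))
```

The center point is adapted far more slowly than the dither rotates, which is the usual time-scale separation such schemes rely on; with a faster gain the gradient estimate becomes noisy and the center wobbles.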
The PIs have developed this framework specifically to optimize capacity in linear communication networks of unmanned aircraft using received signal strength only [DiF07]. Figure 2 shows simulation results using the MES framework to self-optimize a linear relay network. Since the MES control law is adaptive and model free, the self-configuration, self-healing, self-optimization, and tethering task behaviors occur as needed. In fact, the network has no explicit knowledge of which of these tasks is being performed. The decentralized control laws continually seek to improve the network in response to the (unmodeled) environment. Figure 3 shows example data collected using a similar approach indoors.

Figure 3. Signal strength versus time using the gradient ascent approach indoors.

The MES approach developed by the PIs can be applied to any gradient-based decentralized control scheme for which a local function can be measured that has the same (local) gradient as the global objective. This includes the large body of work currently devoted to the synthesis of simple interaction rules that result in desired group-wide, global behaviors such as distributed macrosensors,
  12. 12. coverage control, and robot swarming. For example, robot swarming algorithms are based on potential energy functions of relative range, and the gradient of the potential leads to velocity control inputs. This gradient information is not available when the robots only have relative range sensors (e.g. can only measure signal strength). The MES framework estimates the gradient information while ascending (or descending) it.

In addition to applying the MES framework to existing coverage control and robot swarming algorithms, we will develop new variations that specifically address the LANdroid environment. For example, swarming and coverage control algorithms are designed to react instantaneously to changes in the network. This process uses considerable power as transient responses must settle out of the network. In the LANdroid scenario we need to consider the nature of the warfighter’s decision-making and movement processes. For example, a warfighter may briefly explore a new room before proceeding onward. A LANdroid network tethered to the warfighter should not enter that room only to vacate it moments later. Thus, adding nonlinear elements such as hysteresis to the virtual force fields that drive the LANdroid will improve the overall performance (e.g. power usage) of the system.

Finally, the model-free, gradient-based approach complements the feature-based approach to motion planning. In particular, the MES approach only finds local extrema and can get caught behind obstacles or dead-zones in the radio propagation environment. Key research questions pursued during this project will be the appropriate balance between the two approaches and how to recognize when a switch from one method to the other must occur.

Learning: Machine Learning and Statistical algorithms will play two roles in the proposed work.
First, they will be used to learn Spatial Features, both online as the LANdroids are deployed and offline during test deployments designed to emulate a real deployment. The online learning is required because no constructed test environment can account for all environment types an actual deployment of LANdroids will encounter. Second, they will be used to learn optimal control strategies in Sensor Space, Spatial Feature Space and Signal Gradient Space. Both of these roles of Machine Learning have their foundation in actual deployments under the DARPA LAGR program and the NSF “Human-to-Robot Skill Transfer” grant.

Learning Spatial Features: The Colorado Team has a significant history of using sensor data to learn such concepts as traversability and non-traversability in unstructured outdoor environments [GMO07, PMG07]. We propose to use these same techniques to learn relevant Spatial Features about the environment. These techniques are density-based classifiers that have the following properties: 1) No assumption is made on the number of classes (Spatial Features) that will need to be learned for successful wireless communication; 2) Learning data only becomes available in small subsets (i.e. the unrealistic assumption that all necessary learning data is available at once is not made); 3) The features used for each Spatial Class may differ, and a formal framework for feature selection is used [Str06]; finally, 4) The learned models can predict when they are applicable to a particular LANdroid deployment, and therefore should be used. This implies that the LANdroid robot will know when it doesn’t know, and can be directed to act appropriately to learn what is necessary. Learning these models involves the use of dot products, Singular Value Decomposition (in low dimensional state space), and histogram building, while the application of these models requires passing the results of dot products through histograms.
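As a rough illustration of "passing the results of dot products through histograms", the sketch below keeps one 1-D histogram per Spatial Feature and reports no class when every model assigns the input a low density. It is only an assumed, simplified stand-in for the classifiers of [GMO07, PMG07]: the projection direction is supplied by hand here (the text obtains it with SVD), and the class name `HistogramDensityModel`, the bin range, and the `min_density` threshold are all hypothetical.

```python
class HistogramDensityModel:
    """One Spatial Feature: a histogram over sensor vectors projected onto one direction."""

    def __init__(self, direction, lo, hi, bins=10):
        self.direction = direction      # assumed given; the source derives it via SVD
        self.lo, self.hi, self.bins = lo, hi, bins
        self.counts = [0] * bins
        self.total = 0

    def _project(self, x):
        # Dot product of the sensor vector with the model's direction.
        return sum(a * b for a, b in zip(self.direction, x))

    def _bin(self, value):
        if not (self.lo <= value < self.hi):
            return None                 # outside the range this model has ever covered
        index = int((value - self.lo) / (self.hi - self.lo) * self.bins)
        return min(index, self.bins - 1)

    def learn(self, x):
        # Incremental update: training data may arrive a few samples at a time.
        b = self._bin(self._project(x))
        if b is not None:
            self.counts[b] += 1
        self.total += 1

    def density(self, x):
        b = self._bin(self._project(x))
        if b is None or self.total == 0:
            return 0.0
        return self.counts[b] / self.total


def classify(models, x, min_density=0.05):
    """Return the best-matching feature name, or None when no model applies."""
    best_name, best_density = None, 0.0
    for name, model in models.items():
        d = model.density(x)
        if d > best_density:
            best_name, best_density = name, d
    return best_name if best_density >= min_density else None


models = {
    "wall": HistogramDensityModel(direction=(1.0, 0.0), lo=0.0, hi=10.0),
    "doorway": HistogramDensityModel(direction=(1.0, 0.0), lo=0.0, hi=10.0),
}
for sample in [(2.1, 0.0), (2.3, 0.0), (2.6, 0.0)]:
    models["wall"].learn(sample)
for sample in [(7.2, 0.0), (7.5, 0.0)]:
    models["doorway"].learn(sample)

print(classify(models, (2.4, 5.0)))
print(classify(models, (5.0, 0.0)))
```

A `None` result is the "knows when it doesn't know" case: the reading falls where none of the learned histograms has mass, which is the cue to gather more training data.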
The dot product, SVD, and histogram operations are computationally efficient, allowing online learning with limited CPU power and making the learning framework ideal for the LANdroid project.

For offline learning, the mapping of sensor readings to Spatial Features will be learned as follows. Examples of such Spatial Features as doorways, corners and walls will be “shown” to
  13. 13. the robot by placing it near them. A classifier (of the type discussed above) will then be built for each Spatial Feature. This learning will take place in environments that mimic those in which the LANdroids will eventually be deployed. Which Spatial Features will actually be useful, and therefore be learned, is an open question that this proposal is intended to address. In essence, we will learn only about the Spatial Features that help the robot find optimal locations for wireless transmission (see Learning Optimal Control Strategies, discussed below).

For online learning during an actual LANdroids deployment, the Spatial Features will be learned as necessary. For example, if a soldier runs through something that the robot thinks is a wall, the robot can take a sensor reading of the area and classify it as a doorway. Thus it learns the concept of doorways in the current deployment environment. Similarly, if the robot runs into a wall (or corner) when its models “believe” the path is clear, the robot can back up and build a classifier of the wall (or corner), allowing it to better optimize paths towards areas of better wireless conditions while using less battery life. Once again, the concept of wall or corner can be learned with respect to the current environment. Other relevant Spatial Feature concepts can be learned during a deployment in the same way. Finally, the Spatial Feature models are small (about 1 KB each), allowing them to be shared by all LANdroids during a deployment with minimal load on wireless communication.

Learning Optimal Control Strategies: Two types of learning paradigms will be explored to learn optimal control strategies from Sensor Space, Spatial Feature Space and Signal Gradient Space readings. The first is a Reinforcement Learning approach (based on the Markov Decision Process framework) that members of the Colorado Team have developed in the past [GKU03, GrU01a, GrU01b, GrU00, GrU04].
The second is a new framework for learning fast, intelligent motion planning using available sensors [ORG07]. This second framework, referred to as Cost Function Learning, combines domain specific learning of cost functions from available data with fast A* search [RBB07].

The Reinforcement Learning approach involves probabilistic reasoning on whether the robot should follow the Signal Gradient, choose paths based on Spatial Features, or combine both of these inputs. We will develop a set of standard behaviors such as “go towards wall”, “go towards room center”, “find room corner”, “follow soldier”, “move in the direction the soldier came from”, “follow signal gradient”, “backtrack current motion”, and “randomized motion”, as well as combinations of these behaviors. A policy gradient Reinforcement Learning framework will be used to switch between these behaviors based on sensor observations [GrU04]. The key to this approach is that learning is both fast and efficient, requiring relatively few test deployments to achieve locally optimal policies. In addition, we will investigate the possibility of making the state space sufficiently compact to allow the use of standard Value Function Reinforcement Learning and Dynamic Programming solutions [SuB98]. These algorithms all require a reinforcement signal from the environment to learn optimal behavior transition policies, and the choice of that signal can greatly influence the quality of the final control policies learned. We propose to investigate a variety of different reinforcement signals [GrU01b], including combinations of battery life and average wireless signal strength.

The Cost Function Learning approach involves learning to combine all sensor information into a single cost map that, when A* is used, produces optimal behavior with respect to maintaining wireless signal strength and prolonging battery life.
The cost function mappings will be learned in test settings by having a human operator demonstrate “optimal” control strategies [RBB07]. These strategies will be determined by experimentation, having the human operator
  14. 14. move the robot in various ways and learning only the cost function mappings that produce the best results. In this framework, the human becomes key in influencing which types of Sensor Space, Spatial Feature Space and Signal Gradient Space readings are relevant, not by choosing them directly, but by executing robot actions that are near optimal and allowing a learning algorithm to determine how to combine these readings into a cost function that will achieve near optimal robot behavior autonomously.

Vision-based Cost Estimation: We propose adding one or more camera devices to the robot platform to facilitate safe and efficient navigation through the environment. We argue that for the proposed robot configuration, sensing and the associated calculation draw much less on battery resources than robot motion. Using Computer Vision to identify the constraints of the physical environment and restrict planned motions to those which are safe and have high potential payoff makes the cost of added devices worthwhile. Sophisticated camera modules designed for the cell phone market provide high sensitivity and resolution and operate at 150-250 mW while capturing 30 fps. Since we expect to work at relatively low image resolutions, cameras which satisfy our requirements are available for $5-10 per module.

In the simplest case, adding one camera will allow the robot to choose motions which avoid obstacles and allow identification of environment features (doorways, corridor junctions) which potentially improve the LANdroid’s ability to relay signals. More cameras provide views in multiple directions, which could allow the robot to track the warfighter as s/he moves on after drop, as well as to analyze more of the environment without the need to rotate the robot. Depending on the mounting configuration, multiple cameras allow us to reconstruct the shape of the environment without any robot motion, using sparse or dense features.
As part of our work we plan to explore the space of camera configurations and Vision-based measurements to identify the costs and benefits of each. Essentially we want to identify which measurements of the space will best allow us to achieve the LANdroid objectives of maintaining coverage and preserving battery life.

Cameras will be mounted on the robot platform in a variety of configurations for evaluation of cost/performance tradeoffs. There will always be at least one forward facing camera; additional cameras may form stereo pairs or panoramic rigs. In all cases cameras will be strongly calibrated, providing both intrinsic parameters (focal length, principal point, skew) and the extrinsic transformation from camera to robot coordinate frame.

In general the goal for a LANdroid is to find its optimal pose with respect to signal strength and then carry on with the real mission of providing reliable communications. For the most part we expect the sensory processing to be in a sleep mode. The question then is when we need to activate the vision system. Naturally, at drop the LANdroid must seek out the local optimum for signal coverage, and if a neighboring node in the mesh is added or fails the robot must adjust to the changes in the signal profile. Any time robot motion may be required we anticipate sensing first.

Another possibility is for the vision system to provide some level of situation awareness by periodically capturing frames to identify changes or motion in the environment. Although not specifically part of the LANdroid mandate, the availability of distributed cameras with communication capacity could provide information about other warfighters in the area, fire, falling debris or other hazards.

The first task of the vision system is to provide information about the immediate environment to guide robot motion.
Work on localization for indoor navigation has frequently exploited depth calculation through stereo [MaS87, MuL00] or structure and motion calculations [BoP95] to identify free space and obstacles for mapping and navigation. For our evaluations, single cameras or panoramic configurations with limited overlap allow computing sparse or dense scene
  15. 15. structure through tracking scene motion as the robot moves. Stereo configurations can compute sparse or dense depth information through correspondence and triangulation, before and during robot motion. The purpose of reconstructing the shape of the environment is to provide information about the hard physical constraints present, so we can combine them with signal strength information to determine the optimal feasible pose for the robot.

We hypothesize several vision-based calculations which will provide information on robot poses that optimize signal strength, thus reducing the search space. The first is to track the warfighter after robot drop. Since we are propagating the signal to the warfighter, the direction of departure provides information about the probable location of the next LANdroid in the mesh. Another possibility is that certain features in the environment can be exploited with the expectation of improving mesh coverage. In particular, we want to locate doorways and corridor junctions along the trajectory of the warfighter where positioning the robot may improve transmission. Another potential improvement in signal strength can be obtained by moving to an interior corner.

We assume that the robot rapidly achieves a ready state when dropped. After drop we estimate the gross heading of the warfighter in order to approximate the direction of the next node/LANdroid in the chain. Observing the warfighter turn a corner or pass through a doorway gives information about the space as well as potential information for optimizing pose. Extensive work on identifying and tracking people exists in the literature [SBF03, LuT01], particularly in the context of identifying pedestrians for automated driving [Lom01]. In the simplest scenario, differencing techniques allow us to identify changing regions as the warfighter moves, and simple blob tracking should suffice to approximate his gross heading.
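The differencing-plus-blob-tracking idea can be sketched in a few lines. The example below thresholds the difference between consecutive grayscale frames, takes the centroid of the changed pixels, and reports the direction in which that centroid drifts. It assumes a single moving region, a static camera, and frames given as plain lists of pixel rows; the function names and the threshold value are hypothetical, and the result is a heading in the image plane only, not in the robot's coordinate frame.

```python
import math

def changed_pixels(prev, curr, threshold=30):
    # Pixels whose intensity changed by more than the threshold between frames.
    return [(r, c)
            for r, row in enumerate(curr)
            for c, value in enumerate(row)
            if abs(value - prev[r][c]) > threshold]

def centroid(pixels):
    if not pixels:
        return None
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)

def gross_heading_deg(frames, threshold=30):
    """Image-plane heading of the moving blob: 0 = right, 90 = up. None if unknown."""
    centroids = []
    for prev, curr in zip(frames, frames[1:]):
        c = centroid(changed_pixels(prev, curr, threshold))
        if c is not None:
            centroids.append(c)
    if len(centroids) < 2:
        return None  # the tracker has lost (or never found) the moving region
    d_row = centroids[-1][0] - centroids[0][0]
    d_col = centroids[-1][1] - centroids[0][1]
    return math.degrees(math.atan2(-d_row, d_col))  # image rows grow downward

def synthetic_frame(left_col, size=8):
    # Dark frame with a bright 2x2 block whose left edge is at left_col.
    frame = [[0] * size for _ in range(size)]
    for r in (3, 4):
        for c in (left_col, left_col + 1):
            frame[r][c] = 255
    return frame

frames = [synthetic_frame(col) for col in range(4)]  # block moves right
print(gross_heading_deg(frames))
```

A `None` return corresponds to the tracker failing, which the text proposes to treat as a cue that the person has passed through a doorway or turned a corner.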
Reasoning about where and when the person tracker fails will help identify important doorways and turns in corridors.

Historically, many indoor navigation systems have used doorways and corridor geometry to localize and map the environment [DeK02, KoP95, KrB88]. Typically these systems use edge-based representations where particular configurations of linked edge segments are identified as doorways based on viewing constraints. We will provide edge structures as well as color and texture information from the hypothesized region to learn a door detector. Identifying junctions in corridors is somewhat less well defined, but we will use the same process for learning a junction detector.

Identifying internal corners where walls meet requires examining dense depth information and determining plane intersections consistent with wall intersections. The structured nature of indoor environments also allows us to test a sparse edge-based approach which exploits classic blocks world edge configuration constraints [Rob65] along with edge-based structure and motion calculations to hypothesize corners. Again we pass our image-based evidence to the Machine Learning framework to generate a more robust room corner detector.

Finally, there are a number of techniques that could be used to help determine the relative position of LANdroids in the mesh. Cameras could be used to visually identify signals from other robots, for example a flashing LED, assuming the robots have lines of sight between them. If the robots share a common visual space they can locate and communicate common visual landmarks and compute their position based on the relative viewpoint, for example by exchanging SIFT features [SLL02] and computing relative pose based on triangulation.

Another space for evaluating cost tradeoffs for vision is the computational load, which depends on the resolution and complexity of the detection and extraction calculations performed.
In general we anticipate using low resolution (120x160 pixel) images to reduce computational load. Another possibility is to apply calculations to smaller regions of interest at higher resolution. For example, we might only reconstruct dense stereo for image regions on or near the ground plane rather than the
  16. 16. entire image. The calculations we anticipate include edge and corner detection, image differencing, blob finding and tracking, and image correspondence.

Energy Reduction Through Integrated System Design: One of the primary challenges facing the LANdroids project is making the most efficient use of the available energy while delivering reliable data forwarding. This entails conserving energy at every stage of movement, detection and data processing. During motion phases, the majority of the system power will be used to power the drive train, and the majority of the system energy can be saved by efficient planning and by suppressing robot motion that is not useful, as described in other portions of the proposal.

During non-motion phases, the majority of energy will be used for transmissions and relaying communication or, during idle periods, in determining when to actively engage in communication. Despite these different “phases”, an integrated design avoids micro-optimizing the power budget of a single component at the cost of overall performance or power budgets.

Reducing Energy During Planning Phases: During the planning phase, the LANdroid needs to make the most use of computation and sensors to reduce movement. Computation costs can be controlled by using dynamic voltage scaling (DVS) to reduce CPU costs. Depending on the particular component, voltage scaling can reduce the power needed by a processor such as the Intel PXA 255 by a factor of 6 [SRH05] by switching the CPU speed from 400 MHz to 100 MHz and reducing the voltage from 1.3 V to 1 V. Most work on DVS has demonstrated that it is difficult to extract energy savings from DVS. Our experience [GLF00] has been that automated voltage scaling methods perform poorly for applications considered in isolation; others [FlM02] have shown that capturing application interactions improves automated power control if deadlines can be inferred or provided.
However, sensing applications, such as trying to locate an adjacent access point, are more concerned with reducing power, since the mission time usually cannot be decreased by faster computation. In this application domain, we think the individual applications can provide sufficient information to allow automated control without attempting to estimate the computing demands of the system.

Sensors can reduce system energy by reducing the need to move. As described, vision based sensing can be used for semantic analysis and gradient control methods can follow localized energy. RF sensing can be improved by a combination of protocols, directional antennas and broad spectrum RF sensing. We plan on using our experience with the SoftMAC [NFD05] flexible MAC layer control to provide more accurate protocol layers for direct control of the MAC layer. For example, using SoftMAC, we can selectively enable enhanced forward error correction (FEC) to improve the channel condition; we can also use the SoftMAC drivers to retrieve partially corrupted frames, improving the channel measurement accuracy. The directional antennas provide gross angle of arrival (AOA) information, allowing gradient based methods to improve accuracy; more importantly, directional antennas also increase throughput and decrease communication costs. While “higher level” tools such as SoftMAC allow us to extract useful signal strength information from an existing 802.11 radio, those radios still face limitations: they are power hungry and have limited ability to find “idle” channels. Packets that have significant corruption (e.g. in the Physical Layer Convergence Procedure, or PLCP, header) cannot be interpreted by the radio; most existing radios have poor interfaces for signal measurements, limiting the ability of radios to assess the “true channel”.
We plan on using spectrum measurement systems to provide an alternate, lower power sensing method for the LANdroid platforms; an example of a COTS solution is the WiSPY2.4 device, which can scan the entire 2.4 GHz band in 383 kHz “steps” to measure noise and signal strength.
  17. 17. Figure 4 - Sample RF Energy From Inexpensive Signal Detector (waterfall plot; axes: freq, time)

For example, Figure 4 shows an experiment conducted in a concrete walled courtyard; a transmitter (blue square, upper left) was transmitting an 802.11g stream to the receiver (red box, bottom right). The diagram to the right shows a “waterfall” plot of the received signal strength at the receiver as it switches between four antenna states (performed by manually rotating the antenna). The spectrum analyzer is clearly able to provide information for gradient-based RF finding, and it can examine a broader band more rapidly than the 802.11g radio. However, it cannot identify what signals mean, i.e. which signals are actually packets from adjacent LANdroids. The 802.11g radio, in turn, is poor at rapidly scanning the network to find “idle channels”, so the radio and the spectrum analyzer are most useful when used cooperatively. We have made extensive use of combined spectrum sensing and radio use in developing cognitive radio control algorithms and mesh networking systems with directional antennas that have been evaluated using SoftMAC and the WiSPY devices [WSD07, SDH06, BAY07].

Reducing Energy During Communication Phases: The single largest energy consumer during communication phases is the use of the underlying radio and the MAC and PHY processing for that radio; considerable variation occurs between different RF chipsets, but many factors are independent of the chipset. The graph in Figure 5 shows the time needed to transmit a data frame of varying sizes at different data rates. This measured data shows two things: there is considerable overhead for small packets, and when a high quality link is in place, the amount of transmitted data is not important.

Figure 5 - Transmission Time For Different 802.11g Data Rates
The time to transmit any packet is approximately 1.49 ms at the lowest data rate, 1.19 ms at the highest 802.11b data rate and 0.227 ms at the 54 Mb/s data rate. During transmission, both the sender and receiver radios are operating, consuming considerable power. If we assume that the primary task of the LANdroid relays will be to relay voice traffic, we can see that system level decisions should guide power optimization: at 11 Mb/s or 54 Mb/s, the transmission time for a VOIP packet is largely independent of the underlying codec; at 1 Mb/s, transmission time can vary from 1.9 ms for an 8 kb/s G.729 codec to 3.0 ms for a 64 kb/s G.711 codec. At times, it may be useful to trade the costs of transcoding streams against increased radio use.

This data also shows the necessity of increasing the link quality using directional antennas,
  18. 18. which improve gain and reduce air time, or through mobility and transmission power increases.
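Several of the energy figures quoted in this section can be checked with back-of-envelope arithmetic. The sketch below does so under stated assumptions: the beacon is taken to be one short frame at the lowest data rate (using the 1.49 ms per-packet time quoted above) with amplifier efficiency ignored; the DVS saving uses the first-order CMOS model in which dynamic power scales with V²f and leakage is ignored; and each relay hop is assumed to wait at most one full 100 msec wake interval. None of these assumptions comes from the proposal itself, and the function names are hypothetical.

```python
def average_power_w(tx_power_w, airtime_s, period_s):
    """Mean power of a transmitter that is on for airtime_s out of every period_s."""
    return tx_power_w * airtime_s / period_s

def dvs_power_ratio(f_high_hz, v_high, f_low_hz, v_low):
    """Power reduction factor under the first-order model P ~ C * V^2 * f."""
    return (f_high_hz * v_high ** 2) / (f_low_hz * v_low ** 2)

def max_hops_within(delay_target_ms, wake_interval_ms):
    """Most hops whose worst-case wake-up waits stay strictly under the delay target."""
    return (delay_target_ms - 1) // wake_interval_ms

# One 3 W beacon per second, assumed to occupy the air for 1.49 ms.
beacon_w = average_power_w(tx_power_w=3.0, airtime_s=1.49e-3, period_s=1.0)

# 400 MHz at 1.3 V scaled down to 100 MHz at 1.0 V.
dvs = dvs_power_ratio(f_high_hz=400e6, v_high=1.3, f_low_hz=100e6, v_low=1.0)

# 500 msec delay target with wake-ups every 100 msec.
hops = max_hops_within(delay_target_ms=500, wake_interval_ms=100)

print(f"beacon average power: {beacon_w * 1e3:.2f} mW")
print(f"DVS power reduction:  {dvs:.2f}x")
print(f"hops within target:   {hops}")
```

Under these assumptions the beacon averages about 4.5 mW, consistent with the stated bound of less than 10 mW; the V²f model gives a reduction of about 6.8, in line with the quoted factor of 6; and a worst-case wait of one wake interval per hop leaves room for four hops inside the 500 msec target.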
  19. 19. D.2 Comparison with Current Technology

The decentralized model-free, gradient-based, motion optimization technique developed by the PIs overlaps with several state-of-the-art control techniques. Multivariable extremum seeking control has been applied to a variety of applications [ArK02], but never to teams of autonomous robots. Current work on MES has applied the approach to the control of a single nonholonomic vehicle with no position information [ZAG07]. The approach proposed by the PIs uses local position information obtained by odometry and enables coordination between vehicles. Likewise, the concepts of swarming, distributed sensing, and coverage control have been studied extensively in the context of physical space (see [IEE07] for an overview of recent results). These results all assume that relative position and models of the interaction environment are known. Our approach extends all of these results to cases when positions (in any space) and models are not known.

Controlling node mobility for communication has been considered by other researchers. The value of controlling mobility to increase connectivity was shown in [LiR00, BRS03, BaR04, CNS01]. These generally consider outdoor networks or idealized communication models that do not capture the tight interaction between robot navigation and network optimization. In [SBR04] a protocol was analyzed that guided end users to better locations that improve their communication quality. This requires significant end-user interaction that is not desired here. Using network information to update node locations was specifically considered in [GLM04]. The nodes move on long time scales (1-5 min) and are not sufficiently dynamic for LANdroids. All of these references fail to properly consider the energy cost of mobility relative to increasing transmitter power or other means of improving network performance.
They also ignore the mobility planning and assume (usually implicitly) some localization technique such as GPS.

Data ferrying for delay tolerant networks (DTN) was considered in [MAZ04, SRJ03, ZhA03]. Ferrying exploits or controls mobility in order to deliver messages in disconnected networks and is similar to epidemic routing and related models [VaB00, Win00] that rely on mobility to diffuse messages across nodes, eventually reaching the intended recipient. While this does not satisfy the LANdroid low-latency connectivity goal, several concepts are relevant. Throwboxes, described in [ZCA06], are network relays that are placed at important traffic intersections to act as store and forward devices that increase network throughput. The approach to identifying these intersections may be useful in positioning LANdroid relays. In [ZAZ04] the authors consider the broader problem of how DTN nodes signal their communication needs in a disconnected network. Such a protocol may inform the need to initially self-organize LANdroid nodes.

The Learning Spatial Features approach proposed here is related to concept drift, where the characteristics of a class change over time [WiK96, KoM05, HeL92]. The approach requires the use of multiple models, and the goal is to make future predictions by choosing between models based on the latest labeled data. However, these methods do not offer a formal approach to questions such as which models to apply to the unlabeled current sample, or when to add new models or discard old ones. Thrun’s work [Thr96] also directly addresses a similar problem, as well as online learning. However, none of these methods can directly predict when an unlabeled input is beyond the scope of the current model set and more learning must be done. Furthermore, a recent issue of the Journal of Field Robotics (co-edited by the Co-PIs) has addressed similar learning problems [MuG06].
Related Reinforcement Learning work can be found in [SBP04].
  20. 20. D.3 Evaluation/Experimentation Plans and Metrics
[Figure: University of Colorado's operator interfaces for the DARPA LAGR program (left, robotics) and for the ad hoc UAV and ground node network, AUGNet (right, WiFi networking).]
D.3.1 Operator Control System
Our past experience with test bed environments has shown that it is important to have a test management plane [JBD05, BDJ05]. The management plane provides the ability to issue commands (e.g. to start an experiment), monitor ongoing progress (e.g. node location and network activity summaries), and collect detailed network statistics. The figure above shows a sampling of interfaces that we have developed. We have been able to design "in-band" systems that share the network under test for network monitoring with minimal impact. For detailed monitoring, the data can be held at nodes and collected after a test is finished using monitoring protocols that we have developed. Our previous work used GPS to monitor node locations, but GPS cannot be used in the indoor environment. We will test both out-of-band video systems located in the ceiling and an existing UWB indoor localization system acquired under a previous contract. It is important to note that we use localization information only to determine when and why the developed systems run into problems; it is not used to guide the decisions of the LANdroid system.
D.3.2 Test Bed Environment
Our combined research groups have considerable experience with a variety of robots, as indicated in the Facilities section. We have an existing collection of 12 Roomba systems outfitted with a custom radio and control system that can be used in this project; we have also budgeted equipment for additional Roomba Create components and embedded computers. Robots will be instrumented to monitor power usage.
This will enable us to test the effect of different robot hardware and software strategies in short experiments without having to test until the entire battery is drained.
We will use various test bed environments. Day-to-day testing and evaluation will occur in research labs in the Department of Electrical and Computer Engineering and in Computer Science.
  21. 21. These large labs have already been used for robotics experiments and have a variety of construction types. In addition, we will use the Center for Innovation and Creativity (CINC); this large facility is a former manufacturing facility converted to University use. The floor plan is shown in the picture above. That diagram shows the location of existing directional phased-array antennas in the large building and the variety of room shapes and sizes available. Each of these facilities has existing 802.11 wireless networks, providing the benefit of background interference; we also have access to an interference-free location at a local warehouse.
D.3.3 Simulation Environment
For rapid algorithm development we will modify an existing network simulator that incorporates detailed radio models and controlled mobility. Open source programs such as ns2 only provide predefined mobility paths that cannot be modified during the simulation. Further, the standard propagation models are weak. We have modified OPNET to incorporate controlled mobility and more accurate propagation models. The radio performance will incorporate measurement data between location pairs to provide a simple but accurate radio model. The simulator will require further development to incorporate our system power models and sensor interactions. We believe that relatively simple power models will be sufficient for the majority of the work: the radio components have non-varying power levels, the power demands of the RAM and FLASH components are constant, and power models for statically scheduled microprocessors, such as the PXA 255, are accurate and have been well studied [GiGr00]. One challenge will be to provide a simplified or abstracted power model that can be used in on-line decision making; various learning algorithms will need to be able to assess the benefit of trading one resource (e.g. radio) against another (e.g. specific computation).
Again, the project team has considerable experience in these tasks.
D.3.4 Metrics
In addition to the five program-wide metrics (Coverage, Longevity, Throughput/Latency, Convergence Time, and Message Overhead) we also consider three additional metrics. The first is Dynamic Antenna Gain. This measures the ability of the robot to find positions that provide gain to a specific test point. It is measured as a dB gain in received signal strength relative to the local median signal strength in the vicinity of the robot. The second is Plan Energy. This measures the total energy (movement, processing, and communication) to achieve a specific goal. Goals can be to move to a specific position in a cluttered environment or to find a local minimum of signal gradient. This will be used to compare different hardware and software strategies for robot operation and is at a finer grain than Longevity. The third is Reach. This measures the maximum distance at which a test point can communicate with the gateway for a given number of robots within a given environment. This is designed to be a better measure of tethering performance.
Convergence Time appears to require further clarification. We propose that it is the time after an event until a minimum percentage of test points can communicate (as defined in Coverage) with the gateway. Since this may never happen, Convergence Time is augmented by Convergence Probability, which is the fraction of attempts in which the system converges to a minimum Coverage percentage.
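As a rough illustration of how the additional metrics defined above might be computed from test logs, the following Python sketch implements Dynamic Antenna Gain, Plan Energy, Convergence Time, and Convergence Probability. The function names, the log format (a list of time/coverage pairs per attempt), and the units are assumptions made for this example; they are not part of the proposal or of any program-defined tooling.

```python
from statistics import median
from typing import Optional, Sequence, Tuple

# Hypothetical log format: one (seconds_since_event, coverage_fraction) pair per sample.
CoverageTrace = Sequence[Tuple[float, float]]


def dynamic_antenna_gain_db(rss_at_position_dbm: float,
                            nearby_rss_dbm: Sequence[float]) -> float:
    """dB gain of the chosen position over the local median signal strength."""
    return rss_at_position_dbm - median(nearby_rss_dbm)


def plan_energy_joules(movement_j: float, processing_j: float,
                       communication_j: float) -> float:
    """Total energy spent on movement, processing, and communication for one goal."""
    return movement_j + processing_j + communication_j


def convergence_time(trace: CoverageTrace, min_coverage: float) -> Optional[float]:
    """Time after the event at which coverage first reaches min_coverage.

    Returns None if the attempt never converges.
    """
    for seconds, coverage in trace:
        if coverage >= min_coverage:
            return seconds
    return None


def convergence_probability(traces: Sequence[CoverageTrace],
                            min_coverage: float) -> float:
    """Fraction of attempts that reach the minimum Coverage percentage."""
    if not traces:
        raise ValueError("at least one attempt is required")
    converged = sum(1 for trace in traces
                    if convergence_time(trace, min_coverage) is not None)
    return converged / len(traces)


if __name__ == "__main__":
    attempts = [
        [(0.0, 0.2), (5.0, 0.6), (10.0, 0.9)],  # converges at t = 10 s
        [(0.0, 0.1), (5.0, 0.3), (10.0, 0.4)],  # never reaches 80 % coverage
    ]
    print(dynamic_antenna_gain_db(-60.0, [-70.0, -72.0, -68.0]))
    print(convergence_time(attempts[0], 0.8))
    print(convergence_probability(attempts, 0.8))
```

Returning `None` for a non-converging attempt keeps Convergence Time and Convergence Probability separate, mirroring the text: the time is only defined for attempts that converge, and the probability counts how often that happens.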
  22. 22. E. Statement of Work
The overall goals of the University of Colorado LANdroid Team have been outlined in Section C. Broadly, we plan to integrate multiple sensor, communication, and system measurements in order to control the network and robot so that connectivity and network longevity are maximized. We have divided the problem into nine components: RF Analysis; LANdroid System Hardware; System Energy Models; Decentralized, Model-Free, Gradient-based, Motion Optimization in Signal Space; Learning Spatial Features; Learning Optimal Control; Vision Based Cost Estimation; Network Protocols; and System Software Development and Evaluation. A detailed breakdown of these components into their constituent tasks is provided below. For each task, we indicate which PI (and their associated graduate students) is responsible for the task and in which phase the task will be completed. At a high level, the three phases of this work are to 1) identify the technologies that will best constitute the LANdroid system and provide design guidelines to the LANdroid hardware teams; 2) integrate these technologies into a coherent control system; and 3) integrate the control software with the hardware platform. The tasks are tied to the Milestones, which are described at the end of this section, and to the deliverables in Section I. However, the tasks will generally encompass the entire phase(s) indicated. Variations will be indicated in the schedule graphic.
RF Analysis (RFA): Each task will support Deliverable 1 (RF Analysis document).
Task 1: (P1, Grunwald) Survey signal strengths using dense measurement networks, comparing different signal strength measurement techniques (WiFi card vs. WiSpy).
Task 2: (P1, Popovic) Characterize the role of the antenna, including form-factor, directionality, and multiple antennas in near-to-ground indoor/urban environments.
Task 3: (P1, Brown) Identify and characterize spatial features that are desirable in signal space.
LANdroid System Hardware (LSH): Each task will support Deliverable 2 (LANdroid System Hardware Recommendation document).
Task 4: (P1, Grunwald) Analyze tradeoffs in costs, energy requirements, and system performance for sensors, 802.11 chipset, and processor on gateway, LANdroid, and edge node.
Task 5: (P1, Mulligan) Analyze tradeoffs in number and capability of video cameras.
Task 6: (P1, Popovic) Analyze tradeoffs in number and capability of antennas.
System Energy Models (SEM):
Task 7: (P1, Grunwald) Develop generic energy models for communication, sensor, robot, and processing subsystems. Supports Deliverable 2 and Milestone 6.
Task 8: (P1, Grunwald) Characterize the added network longevity vs. cost of different subsystem choices. Supports Deliverable 2.
Task 9: (P3, Grunwald) Integrate final hardware design into energy model. Supports Milestone 9.
Gradient-Based Motion Optimization in Signal Space (GMO):
Task 10: (P1, Frew) Determine most efficient signal dithering method for determining gradient. Provides input to Deliverable 1.
Task 11: (P1, Frew) Design and implement basic gradient-based robot deployment algorithm.
  23. 23. Supports Milestone 1.
Task 12: (P2, Frew) Design and implement gradient-based robot tethering algorithm. Supports Milestone 5.
Task 13: (P3, Frew) Design and implement gradient-based methods that search in performance space (e.g. power or throughput). Supports Milestone 8.
Learning Spatial Features (LSF):
Task 14: (P1, Grudic) Investigate many types of spatial features rated by humans for their network utility. The resulting features will be evaluated in an offline machine learning paradigm. Supports Milestone 4.
Task 15: (P2, P3, Grudic) Design and implement online machine learning to optimize the learned spatial feature concepts. Supports Milestone 7.
Task 16: (P3, Grudic) Develop distributed spatial feature detection between LANdroid nodes. Supports Milestone 7.
Learning Optimal Control (LOC):
Task 17: (P1, Grudic) Adapt LAGR path planning code to the LANdroid environment. Supports Milestone 2.
Task 18: (P2, Grudic) Investigate methods for the LANdroid to learn from control strategies demonstrated by a human operator. Supports Milestone 4.
Task 19: (P3, Grudic) Design and implement a reinforcement learning controller that dynamically determines standard control behaviors and combines signal and spatial optimization. Supports Milestone 7.
Vision-Based Cost Estimation (VCE):
Task 20: (P1, Mulligan) Implement and refine algorithms for compute- and power-limited vision to support path planning and spatial feature identification. Supports Milestone 2.
Task 21: (P2, Mulligan) Design and implement methods for estimating relative pose among LANdroids and situational awareness. Identify environmental precepts that facilitate LANdroid function. Supports Milestone 5.
Task 22: (P3, Mulligan) Design and implement Warfighter tracking algorithm. Supports Milestone 7.
Network Protocols (NP):
Task 23: (P1, Brown) Select and adapt MANET protocol implementation to LANdroid robot. Supports Milestone 1.
Task 24: (P1, Brown) Implement protocol for connecting disconnected nodes.
Supports Milestone 2.
Task 25: (P2, Grunwald) Implement resource management protocol that enables LANdroids to share resource state information, manage resources, and interface with their own resources. Supports Milestone 6.
Task 26: (P2, Brown) Implement energy aware routing protocol that considers the larger set of LANdroid resources and node states. Supports Milestone 6.
Task 27: (P3, Grunwald) Implement quality of service aware routing protocols for meeting throughput, latency, or longevity targets of end nodes. Supports Milestone 8.
  24. 24. System Software Development and Evaluation (SDE):
Task 28: (P1, Mulligan) Design and implement operator control system (OCS) for managing LANdroid testing. Supports Milestone 1.
Task 29: (P2, Brown) Develop evaluation test bed for understanding LANdroid system behavior in complex environments. Includes simulation, hardware-in-the-loop, and full hardware environments. Supports Milestone 4.
Task 30: (P2, Frew) Design control software to robot interface. Supports Milestone 9.
Task 31: (P3, Mulligan) Implement software modules on LANdroid robot hardware. Supports Deliverable 9.
Milestone 1: 10 LANdroids can self-configure when initially connected to provide connectivity to static test points in an open space on a single floor.
Milestone 2: 10 LANdroids can self-configure when initially disconnected to provide connectivity to static test points in an open space on a single floor. The network self-heals after node death.
Milestone 3: Operator control software can operate multiple LANdroid robots and collect situational awareness and monitoring data.
Milestone 4: 15 LANdroids can self-configure (whether initially connected or not) to provide connectivity to test points across two floors with static obstacles.
Milestone 5: The network maintains connectivity to mobile test points (tethering).
Milestone 6: Network energy is managed as a network resource.
Milestone 7: 50 LANdroids can self-configure (whether initially connected or not) to provide connectivity to heterogeneous test points across 3+ floors of dynamic obstacles and RF interference. The network maintains connectivity to mobile test points (tethering). Energy is managed as a network resource.
Milestone 8: Edge nodes can specify network objectives (e.g. longevity or throughput).
Milestone 9: All control software ported to LANdroid robot hardware.
  25. 25. F. Schedule Graphic
[Gantt chart: LANdroids schedule by month, December 2007 through November 2010, spanning Phase 1, Phase 2, and Phase 3. Chart rows:]
T1 RFA: RF survey
T2 RFA: Characterize Antenna Role
T3 RFA: Identify Spatial features
T4 LSH: Hardware tradeoffs
T5 LSH: Video tradeoffs
T6 LSH: Antenna tradeoffs
T7 SEM: Generic Energy Model
T8 SEM: Network Longevity vs. Cost Model
T9 SEM: LANdroid specific energy model
T10 GMO: Signal Dithering
T11 GMO: Signal Gradient Following
T12 GMO: Gradient-based tethering
T13 GMO: Performance Gradient Following
T14 LSF: Offline Spatial Feature Learning
T15 LSF: Online Spatial Feature Learning
T16 LSF: Distributed Spatial Feature Identification
T17 LOC: Adapt LAGR Code
T18 LOC: Learning Strategies from humans
T19 LOC: Reinforcement Learning Controller
T20 VCE: Power-limited robot vision
T21 VCE: Network pose and situation
T22 VCE: Warfighter tracking
T23 NP: MANET selection
T24 NP: Connecting disconnected nodes
T25 NP: Resource Management Protocol
T26 NP: Energy aware routing
T27 NP: Quality of service aware routing
T28 SDE: Operator Control System
T29 SDE: Evaluation Testbed
T30 SDE: Control Software to Robot interface Spec
T31 SDE: Software on LANdroid robot hardware
Evaluations
Milestones: M1 M2 M3 M4 M5 M6 M7 M8 M9
  26. 26. G. Teaming and Tasking
To be successful, the proposal requires a confluence of knowledge and capabilities in RF propagation, wireless protocols, robotics, and machine learning. The team has these capabilities along with key cross-disciplinary capabilities that ensure a smooth and productive interaction:
Tim Brown (PI): 18 years of experience in Wireless Systems, Machine Learning, and Networking.
Eric Frew: 11 years of experience in Design and Implementation of Control Systems for Autonomous Robotic Networks.
Greg Grudic: 13 years of experience in Learning Algorithm Development and Robotics Research.
Dirk Grunwald: 20 years of experience in computer systems design, evaluation and development.
Jane Mulligan: 12 years of experience in Implementation of Robotic Control Systems and Real-Time Stereo and Vision Algorithms.
Zoya Popovic: 22 years of experience in microwave circuit and antenna modeling, design and characterization.
In addition, the proposal will be supported by a dedicated System Integrator whose task will be to lead the LANdroid system evaluation. In summary, the team has a depth of capabilities in each of the key areas as shown below.
Team Capabilities
RF propagation    Robotics    Wireless protocols    Machine Learning
Popovic           Mulligan    Grunwald              Grudic
Brown             Frew        Brown                 Brown
Grunwald          Grudic
Each of the six faculty will be responsible for the tasks assigned to them in the Statement of Work section. In addition, the PI is the technical point-of-contact and is responsible for the overall coordination of the project, periodic reporting, and financial management.
This project will be placed in and managed through the Department of Electrical and Computer Engineering. The ECE Department is headed by Prof. Mike Lightner, who is a full-time faculty member. Therefore, the PI, Prof. Brown, will report to Prof. Lightner for program support, reporting, and oversight. The departments report to the Dean of the College of Engineering and Applied Sciences, Prof. Robert H. Davis.
  27. 27. H. Project Management and Interaction Plan
The entire team is collocated in the Engineering Center building at the University of Colorado. Weekly team meetings will track progress and provide forward planning. Meetings will include faculty and students who will discuss project management as well as research findings. We will develop an internal CU LANdroid Wiki to further facilitate technical discussion and to document progress. This Wiki will also be integrated with version control for all robot and network control software. All software, test bed monitoring, data measurements, technical reports, and papers will be archived on a server that will be backed up nightly. Monthly full-scale exercises will evaluate LANdroid progress and program needs.
  28. 28. I. Deliveries Description
There are no Proprietary Claims. Technical data and computer software will be furnished to the Government with Unlimited Rights.
Deliverable 1: RF Analysis document. Based on our measurements and analysis, this describes the potential for exploiting the RF environment and the cost-effective techniques that are best suited for this exploitation.
Deliverable 2: LANdroid System Hardware Recommendation document. Based on our systems level analysis of choices in sensors, radios, and computation, we will provide cost-benefit analysis of the potential hardware components in the LANdroid System.
Deliverable 3: Documentation of all software modules.
Deliverable 4: Documentation of all LANdroid system evaluations.
Deliverable 5: Documentation of the final complete software system.
Deliverable 6: Control software written for the final LANdroid hardware platform.
  29. 29. J. Technology Transition and Technology Transfer Targets Plans
The development of self-organizing robotic network nodes has immediate and obvious applications in a number of fields. These include:
1. Public Safety. A network that can dynamically adapt and optimize performance with minimal intervention would improve current police, fire, and medical communications in urban indoor environments.
2. LANcraft. In outdoor environments, wireless nodes mounted in unmanned aircraft can provide optimized mobile networks over large areas. LANcraft can operate independently or can interoperate with LANdroid networks, adding another dimension.
3. Consumer Deployments. Automatically identifying wireless access point locations to provide coverage can simplify deploying building networks. The converged locations of a LANdroid network would indicate where access points should be installed.
With no additional investment, the algorithms developed for this application could be applicable to many other uses for the DOD and public safety markets. The algorithms would have obvious potential for performing Future Combat Systems (FCS) or Joint Robotics Program (JRP) missions. The development of learning behaviors would directly benefit and potentially feed the FCS Autonomous Navigation System (ANS) program, which will be applicable to all unmanned FCS operations. The network optimization algorithms can also support the DARPA next generation (XG), wireless network after next (WNaN), wireless adaptable network node (WANN), and WNaN adaptive network development (WAND) programs, which focus on expanding low-cost radio capabilities. The team is working on several of these projects.
This research has potential spin-offs in local companies such as Cardinal Peak, Louisville CO, which is currently funded by the NSF to study mobile robot relays to optimize video backhaul for public safety.
Using existing sensors and added payloads, the LANdroid could assume additional roles such as distributed surveillance (audio, video, vibration) to support the urban indoor warfighter.
With some adaptation, the technology's logic can be inverted for communication disruption. Using many of the LANdroid techniques, mobile indoor electronic countermeasure networks could jam, eavesdrop, or localize emitters.
Results will be published in public forums in order to disseminate and promote the transfer of this technology.
  30. 30. K. Personnel and Qualifications
Dr. Timothy X Brown, Associate Professor
• 18 years of experience in Wireless Systems, Machine Learning, and Networking.
• Technical Program Committee for ACM International Symposium on Mobile Ad Hoc Networking (MobiHoc) 2004, 2007.
• Member, National Research Council Committee on Using Information Technology to Enhance Disaster Management, 2005–2007.
• Sub-contractor on Phase 3 of DARPA XG program.
• Ph.D., Electrical Engineering, California Institute of Technology.
• Dr. Brown's Ph.D. thesis topic was a neural network framework for solving switching network design problems. Between 1990 and 1992 he worked in an advanced computing architectures group at the Jet Propulsion Laboratory, where he developed novel neural network ASIC designs. Between 1992 and 1995 he was a member of technical staff at Bell Communications Research, where he developed machine learning techniques for network control. Since 1995 he has been a Professor in Electrical and Computer Engineering at the University of Colorado, Boulder. He has published research papers in machine learning, wireless systems, and networking. Dr. Brown's work in machine learning includes statistical function approximation of rare events, and adaptation in network and wireless communication systems using Markov Decision Process formulations solved with reinforcement learning. His work in wireless systems and networking includes wireless user mobility models; analysis of random cellular deployments; energy aware ad hoc routing protocols; delay tolerant network routing; and adaptive network resource allocation. To support the wireless research he has developed a large-scale outdoor wireless test bed that incorporates real-time monitoring and visualization of network performance down to the packet level. His current research is on controlled mobility in ad hoc networks, especially on small unmanned aircraft.
His published ad hoc networking protocols (including variants of DSR, AODV, and DTN) have all been implemented on handheld or small single board computers deployed in indoor, outdoor, vehicular, and/or aerial networks.
Dr. Eric W. Frew, Assistant Professor
• 11 years of experience in Design and Implementation of Control Systems for Autonomous Robotic Networks.
• PI of AFOSR Project titled "An Integrated Framework for Controlled Mobility in Ad Hoc Networks".
• Member of the Research and Engineering Center for Unmanned Vehicles (RECUV) at the University of Colorado.
• Ph.D., Department of Aeronautics and Astronautics, Stanford University.
• Dr. Frew's research efforts focus on the exploitation of controlled mobility for integrating communication into multi-objective control, optimal distributed sensing by teams of autonomous
  31. 31. vehicles, and self-directed collaborative navigation of unmanned aircraft. Prior to joining the CU Boulder faculty, he was a postdoctoral researcher at the UC Berkeley Center for Collaborative Control of Unmanned Vehicles (C3UV) from June 2003 through July 2004, where he oversaw the development and flight demonstrations of a fleet of three intelligent aerial platforms. Prior to that, he worked with unmanned ground vehicles and the Hummingbird autonomous helicopter at the Stanford University Aerospace Robotics Lab. Dr. Frew has been involved with successful Air Force STTR/SBIRs, and his current funding comes from the Air Force Office of Scientific Research, USAF Materiel Command, and Raytheon IIS.
Dr. Greg Grudic, Assistant Professor
• 13 years of experience in Learning Algorithm Development and Robotics Research.
• Co-organizer with Mulligan of the 2005 NIPS workshop on Machine Learning Based Robotics in Unstructured Environments (http://www.cs.colorado.edu/janem/NipsMLR.html).
• Co-editor with Mulligan of the 2006 Journal of Field Robotics Special Issue on Machine Learning Based Robotics (http://www.journalfieldrobotics.org/index.html).
• PI in Phase 2 of the DARPA LAGR program.
• Ph.D., Electrical and Computer Engineering, University of British Columbia.
• Dr. Grudic's Ph.D. thesis topic was nonparametric learning from examples in very high dimensional state spaces, which produced a machine learning framework for end-to-end learning of robot navigation tasks. Between 1998 and 2001 he was a Post Doctoral Fellow at the GRASP lab at the University of Pennsylvania. Since 2001 he has been an assistant professor in the Computer Science department at the University of Colorado at Boulder. He has published research papers in both Machine Learning and Robotics.
As part of his ongoing research in human-to-robot skill transfer and end-to-end learning of robot tasks, his current research focus is on probabilistic regression and classification, clustering, semi-supervised learning, outlier detection, and low dimensional nonlinear manifold representations of robot sensory space. Dr. Grudic's research in machine learning includes papers on classification, regression, semi-supervised classification, clustering, outlier detection and reinforcement learning. His research in robotics includes published papers on end-to-end learning of task driven mobile robot navigation, reinforcement learning for mobile robot navigation, and inverse kinematics for high degree of freedom robot manipulators.
Dr. Dirk Grunwald, Associate Professor
• More than 20 years of experience in computer systems design, evaluation and development.
• Contributor to DARPA-sponsored book on power aware computing.
• Participant in winning DARPA WANN hardware design team with M/A-COM.
• Broad experience in computer systems, including all aspects of the LANdroid platform.
• Ph.D., Computer Science, University of Illinois Urbana-Champaign.
• Dirk Grunwald is an Associate Professor at the University of Colorado, Boulder. Dr. Grun- 31