Successfully reported this slideshow.
Upcoming SlideShare
×

# Short-Term Conflict Resolution for Unmanned Aircraft Traffic Management

450 views

Published on

We propose a set of controllers to solve the stochastic short-term conflict avoidance problem, in which a traffic management system supervises a collection of aircraft. The controllers generate advisories for each aircraft, and are based on decomposing a Markov decision process and fusing solutions to its subproblems online.

Published in: Engineering
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

• Be the first to like this

### Short-Term Conflict Resolution for Unmanned Aircraft Traffic Management

1. 1. Short-Term Conﬂict Resolution for Unmanned Aircraft Traﬃc Management Hao Yi Ong and Mykel J. Kochenderfer Stanford University September 22, 2015
2. 2. Outline Introduction Mathematical formulation Approach Numerical experiments UTM-α: A distributed framework Conclusion and future work Introduction 2
3. 3. Unmanned aircraft traﬃc management automating conﬂict avoidance is critical to integrating small unmanned aerial systems (UAS) to civil airspace automation to augment controllers Introduction 3
4. 4. Conﬂict avoidance given potential conﬂicts between drones, ﬁnd the best set of advisories focus on horizontal or co-altitude conﬂict resolution – NASA’s proposed airspace for UAS is under 150 m (500 ft) – ﬂight altitudes need to avoid disruption and buildings conﬂict deﬁned as loss of minimum separation between aircraft Introduction 4
5. 5. Complications multiagent problem – system must coordinate between many aircraft – large search space for solution uncertainty in system – imperfect sensor measurements of current state – variable pilot response, vehicle performance, etc. aﬀect future path appropriate trade-oﬀ between safety and eﬃciency not obvious Introduction 5
6. 6. Goals robust and eﬃcient method tractable approach to complex stochastic problem – account for uncertainty in large problem with reasonable compute resources balances between airspace safety and eﬃciency – ensure safety and provide timely conﬂict alerts to aircraft arbitrary-scale optimization – real-time conﬂict resolution for large airspace Introduction 6
7. 7. Previous approaches mathematical programming (MIP, SCP) [SVFH05, ASD12] – can work well for simple vehicle networks distributed convex optimization [OG15] – hard to incorporate stochastic objectives/constraints Markov decision process formulation (ACAS X) [KC11, CK12] – assumes “white noise” accelerations for intruder aircraft and many more. . . [KY00] Introduction 7
8. 8. Overview idea: decomposition + coordination multi-agent pairwise joint policy decompose pairwise solutions + coordination = ac1: left ac2: right ac3: straight Introduction 8
9. 9. Conﬂict Resolution as a UTM service UTM server client side resolution server filtered sector status updates advisories advisories status updates Introduction 9
10. 10. Conﬂict Resolution as a Standalone service derivative UTM client service – subscribes to UTM server to track aircraft and deconﬂict routes – pub-sub system that operators subscribe to to receive advisories current implementation uses standalone model – still unclear about end architecture for resolution service – clean approach to get started with prototype system – implemented with scalability and modularity in mind Introduction 10
11. 11. Outline Introduction Mathematical formulation Approach Numerical experiments UTM-α: A distributed framework Conclusion and future work Mathematical formulation 11
12. 12. Markov decision process (MDP) deﬁned by the tuple (S, A, T, R) S and A are the sets of all possible states and actions, respectively T (s, a, s ) gives the probability of transitioning into state s by taking action a at the current state s R (s, a) gives the reward for taking action a at the current state s environment T(s,a,s’)! agent action a! state s! reward R(s,a)! Mathematical formulation 12
13. 13. Value iteration agent π⋆(s)" action a" state s" want to ﬁnd the optimal policy π (s) – gives action that maximizes the utility Q (s, a) from any given state π (s) = argmax a∈A Q (s, a) value iteration updates value function guess ˆQ until convergence – expensive one-oﬀ compute but cheap policy extraction ( ˆQ lookup) ˆQ (s, a) := R (s, a) + s ∈S T (s, a, s ) max a ∈A ˆQ (s , a ) Mathematical formulation 13
14. 14. Multiagent MDP extension of MDP to cooperative multiagent setting similar to case where centralized planner has access to system state also deﬁned by the tuple (S, A, T, R), except – S and A are all possible joint states and actions – T and R operate on elements of S and A Mathematical formulation 14
15. 15. Short-term conﬂict resolution horizontal conﬂict resolution between n aircraft – conﬂict deﬁned as loss of minimum separation distance, 500 m – aircraft at risk if it could experience a conﬂict within two minutes alert aircraft that need corrective maneuvers – but not to the extent that alerts become a nuisance Mathematical formulation 15
16. 16. Resolution advisories at each time step T = 5 s, system issues a joint advisory – joint advisory φ chosen out of a ﬁnite set of corrective bank angles for each aircraft action set A = {−20◦ , −10◦ , 0◦ , 10◦ , 20◦ , COC} n – positive and negative angles correspond to left and right banks – COC is a clear-of-conﬂict status advisory—nothing needs to be done higher resolution comes at the cost of computational complexity Mathematical formulation 16
17. 17. Dynamics ith aircraft state si = (xi, yi, ψi, vi, pi) – latitude and longitude xi and yi – heading ψi – groundspeed vi – responsiveness indicator pi aircraft follow Dubin’s kinematic model – simplicity avoids risk of overﬁtting complicated models ˙xi = vi cos ψi ˙yi = vi sin ψi ˙ψi = g tan φi vi ψi! bank: φi! (xi ,yi)! vi! Mathematical formulation 17
18. 18. Pilot model advisory response determined stochastically by Bernoulli process 5 s average response delay to ﬁrst corrective advisory as recommended by ICAO [ICA07] – when responding, the pilot executes the advisory for 5 s time step non-responsive: white noise path responsive: corrective action COC 5 s average response lag Mathematical formulation 18
19. 19. Reward function balance competing objectives maximize safety loss of separation cost separation closeness penalty minimize disruption cost advisoryCOC ±0° ±10° ±20° conflict penalty Mathematical formulation 19
20. 20. Outline Introduction Mathematical formulation Approach Numerical experiments UTM-α: A distributed framework Conclusion and future work Approach 20
21. 21. Curse of dimensionality in an MMDP, S and A are all possible joint states and actions problem: S and A scale exponentially with number of agents 3 drones: 1000 states 2 drones: 100 states 1 drone: 10 states Approach 21
22. 22. Decompose and coordinate multi-agent pairwise joint policy decompose pairwise solutions + coordination = ac1: left ac2: right ac3: straight Approach 22
23. 23. MMDP decomposition multi-agent pairwise joint policy decompose ac1: left ac2: right ac3: straight pairwise solutions + coordination = Approach 23
24. 24. Pairwise conﬂict resolution decompose full MMDP into O n2 pairwise encounters action set, pilot response model, and reward function unchanged use relative positions and bearing to reduce state space – arbitrarily set one aircraft as ownship, and the other as intruder – speeds remain in global reference frame s = xrel , yrel , ψrel , vownship , vintruder , pownship , pintruder (0,0)! vownship ! ψrel (xrel ,yrel)! vintruder Approach 24
25. 25. Discretization in general, no analytical solution to problem with continuous state space approximate value function ˆQ over discretized state space – discretize state space via multilinear interpolation – iteratively update ˆQ until convergence ˆQ (s, a) := R (s, a) + s ∈S T (s, a, s ) max a ∈A ˆQ (s , a ) Approach 25
26. 26. Approximating environment noise sigma-point sampling speed bank weights sigma-point sample Approach 26
27. 27. Implementation discretize continuous state space into 9.6 million states – expensive but one-oﬀ computation ˆQ converges in 7 hours (40 iterations over S × A) – solver: parallel implementation in Julia – machine: 20 2.3 GHz Intel Xeon cores with 125 GB RAM Variable Minimum Maximum Number of values xrel , yrel −3000 m 3000 m 51 ψrel 0◦ 360◦ 37 vownship , vintruder 10 m s−1 20 m s−1 5 pownship , pintruder - - 2 Approach 27
28. 28. Pairwise policy visualization passing intrusion (240◦ bearing) COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) y(m) Ownship action COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) Intruder action −20 −10 0 10 20 Approach 28
29. 29. Example advisory execution passing intrusion at (1000 m, 0 m) ownship: 10° left bank intruder: 10° left bank Approach 29
30. 30. Accounting for pilot response pilots responsive (top) and non-responsive (bottom) COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) y(m) Ownship action COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) Intruder action −20 −10 0 10 20 COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) y(m) Ownship action COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) Intruder action −20 −10 0 10 20 Approach 30
31. 31. Accounting for pilot response ownship responsive (top) and intruder responsive (bottom) COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) y(m) Ownship action COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) Intruder action −20 −10 0 10 20 COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) y(m) Ownship action COC −2,000−1,000 0 1,000 2,000 −2,000 −1,000 0 1,000 2,000 x (m) Intruder action −20 −10 0 10 20 Approach 31
32. 32. Multithreat coordination multi-agent pairwise joint policy decompose pairwise solutions + coordination = ac1: left ac2: right ac3: straight Approach 32
33. 33. Utility fusion idea: combine pairwise utilities Qij to form proxy for global utility Q Qij operates on the pairwise MDP state-action pair (sij, aij) for aircraft i and j fusion strategies π(s) = argmax a∈A Q(s, a) – max-min: “worst-case” Qmin (s, a) = min i<j {Uij(sij, aij)} – max-sum: “average” Qsum (s, a) = i<j Uij(sij, aij) problem: action space A scales exponentially with number of aircraft Approach 33
34. 34. Search heuristic: Alternating maximization maximize global utility by varying one action and ﬁxing the rest e.g., ﬁx drone 1 and 2’s actions (R and L) and vary drone 3’s action 1.1 1.3 0.9 2.1 1.1 0.9 1.3 2.3 0.8 2.3 1.4 0.3 0.9 1.7 0.8 0.8 2.2 1.4 2.1 1.1 0.3 argmax pair 1 pair 2 L R S L R S L R S L R S aircraft 3 action: L Q13(s,a)! Q23(s,a)! min Approach 34
35. 35. Distributed variant I: Parallel search greedy local maximization 0 L 0 0 0 L R 0 0 broadcast action R L L R L L R L L repeat until convergence Approach 35
36. 36. Distributed variant II: Consensus search parallel alt. maximization R R L R L L R L R broadcast joint action R R L R L L R L R greedy local maximization R L L R R L R R L incorporate msgs R ? L R R ? ? R L Approach 36
37. 37. Consensus search message parallel search could incur large communication cost messages reduce communication between compute nodes, but risk yielding unsynchronized joint action across aircraft incorporate messages by “averaging” what intruders think ownship should do to approximate consensus broadcast joint action R R L R L L R L R incorporate msgs R L? L R R 0? R? R L Approach 37
38. 38. Joint policy visualization triple aircraft encounter max-min (top) and max-sum policies (bottom) COC −2,000−1,000 0 1,000 2,000 −2,000 0 2,000 x (m) y(m) First aircraft action −2,000−1,000 0 1,000 2,000 −2,000 0 2,000 x (m) Second aircraft action −2,000−1,000 0 1,000 2,000 −2,000 0 2,000 x (m) Third aircraft action −20 −10 0 10 20 COC −2,000−1,000 0 1,000 2,000 −2,000 0 2,000 x (m) y(m) First aircraft action COC −2,000−1,000 0 1,000 2,000 −2,000 0 2,000 x (m) Second aircraft action COC −2,000−1,000 0 1,000 2,000 −2,000 0 2,000 x (m) Third aircraft action −20 −10 0 10 20 Approach 38
39. 39. Outline Introduction Mathematical formulation Approach Numerical experiments UTM-α: A distributed framework Conclusion and future work Numerical experiments 39
40. 40. Encounter model generate uniformly random starting aircraft states in annulus – speeds uniformly random – headings initialized to point to annulus center – resample if loss of separation Numerical experiments 40
41. 41. Aircraft model ODE solver for dynamics – continuous Dubin’s kinematic model – Gaussian noise in acceleration and banking corrective bank angles mapped to PID control policies Numerical experiments 41
42. 42. Baseline methods closest threat – multithreat scenario decomposed into pairwise encounters – aircraft execute solution to encounter with closest intruder uncoordinated – aircraft assumes all other aircraft are white-noise intruders – executes greedy local maximization to ﬁnd own action Numerical experiments 42
43. 43. Experiment >1 million simulations from 2 to 10 aircraft – code: Julia implementation – machine: 3.4 GHz Intel i7 processor with 32 GB RAM ∼10 ms decision times (vs. 5 s decision period) – serial solve times of <10 ms for up to 10 aircraft suﬃcient for server-based resolution system – unoptimized code that doesn’t account for communication costs for distributed variants Numerical experiments 43
44. 44. Safety performance max-min utility fusion 4 5 6 7 8 9 10 6 7 8 9 ×10−3 number of aircraft conﬂictprobability uncoord centralized parallel consensus max-sum utility fusion 4 5 6 7 8 9 10 3 4 5 6 ×10−2 number of aircraft conﬂictprobability uncoord centralized parallel consensus distributed variants perform as well as serial version coordinated methods >10 times better than closest threat heuristic coordinated methods ∼10% better than uncoordinated method Numerical experiments 44
45. 45. Safety performance takeaways no near mid-air collisions (NMACs) in simulations – NMAC threshold distance deﬁned at 30 m (100 ft) – due to penalty on smaller separation distances even in conﬂict simulations validate all algorithmic variants – distributed variants perform as well as serial version – coordinated methods signiﬁcantly better than baselines Numerical experiments 45
46. 46. Trade-oﬀ: Safety vs. alert rate vary relative penalty between loss of separation and disruption best performance at “knee” of Pareto frontier 0 2 4 6 ×10−2 1.8 1.9 2 2.1 2.2 ×10−2 conﬂict probability alertrate centralized parallel consensus Numerical experiments 46
47. 47. Outline Introduction Mathematical formulation Approach Numerical experiments UTM-α: A distributed framework Conclusion and future work UTM-α: A distributed framework 47
48. 48. Practical conﬂict avoidance recall: pub-sub system that UTM clients subscribe to for advisories – standalone system that subscribes to UTM server for ﬂight tracking goals: robustness, scalability, and modularity – cluster computing framework for large-scale data processing – streaming analytics for real-time conﬂict resolution – built-in system fault-tolerance UTM-α: A distributed framework 48
49. 49. System design UTM-α: A distributed framework 49
50. 50. System design why Kafka? – designed as uniﬁed, high-throughput, low-latency platform for handling real-time data feeds – open source with active industry use (originally developed by LinkedIn) why Spark? – easy, high-level API – ease of operations with minimal fuss over fault-tolerance – enormous, active community and industry deployments (CISCO, Netﬂix, Intel,. . . ) UTM-α: A distributed framework 50
51. 51. Driver-worker model driver-worker model for conﬂict resolution – driver loads up policy object that contains lookup table and algorithm for conﬂict resolution – workers receives policy object via broadcast and subsequently delegated tasks by driver formatted conﬂict data stored as an “inﬁnite” stream of resilient distributed datasets (RDDs) – distributed, partitioned collection of conﬂict objects (JSON) – automatically rebuilt on failure UTM-α: A distributed framework 51
52. 52. Pub-sub model ingestor publishes conﬂict objects to conflict Kafka topic – subscribes to and ingests UTM server stream and crudely identiﬁes potential conﬂicts by proximity – formats conﬂict data as JSON strings advisor publishes advisories to advisory Kafka topic – producers are worker nodes in cluster generating advisories – consumers are UTM operator clients ﬂying drones UTM-α: A distributed framework 52
53. 53. Working model Spark driver launches workers/executors and broadcasts policy object workers receive conﬂict data in parallel from conflict Kafka topic as RDD stream workers publish advisories to advisory Kafka topic for UTM operator clients UTM-α: A distributed framework 53
54. 54. Outline Introduction Mathematical formulation Approach Numerical experiments UTM-α: A distributed framework Conclusion and future work Conclusion and future work 54
55. 55. Summary three variants of coordination-based conﬂict resolution algorithm – 1 centralized/serial and 2 distributed variants millisecond solve times for real-time application for multiple aircraft robust distributed system for practical conﬂict avoidance Conclusion and future work 55
56. 56. Implementation policy generation on Julia scientiﬁc programming language https://github.com/sisl/ConflictAvoidanceDASC distributed system with Apache Kafka and Spark Streaming https://bitbucket.org/sisl/utm-alpha Conclusion and future work 56
57. 57. Further research tracking aircraft via belief states – use the UTM client server to track reported aircraft states – generate local set of belief states for all aircraft to better estimate state uncertainty in real-time best action for drone in case of communication loss – ﬁgure out the response state of the aircraft via belief state tracking – where to go in case of communication loss, what ﬂight proﬁle to take, and what UTM should assume about the drones ﬂight dealing with adversarial ﬂights Conclusion and future work 57
58. 58. Further software development standardize resolution advisory and advisory receipt formats with focus on minimizing message size quantify delay times for sending packets between advisory server and clients in order to model it into the problem Conclusion and future work 58
59. 59. References Federico Augugliaro, Angela P Schoellig, and Raﬀaello D’Andrea. Generation of collision-free trajectories for a quadrocopter ﬂeet: A sequential convex programming approach. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1917–1922, October 2012. J. P. Chryssanthacopoulos and M. J. Kochenderfer. Decomposition methods for optimized collision avoidance with multiple threats. Journal of Guidance, Control, and Dynamics, 35(2):398–405, 2012. International Civil Aviation Organization ICAO. Surveillance, radar and collision avoidance. International Standards and Recommended Practices, 4, 2007. 59
60. 60. References M.J. Kochenderfer and J.P. Chryssanthacopoulos. Robust airborne collision avoidance through dynamic programming. Project Report ATC-371, MIT Lincoln Laboratory, 2011. J. K. Kuchar and L. C. Yang. A review of conﬂict detection and resolution modeling methods. IEEE Trans. Intell. Transp. Syst., 1(4):179–189, 2000. H. Y. Ong and J. C. Gerdes. Cooperative collision avoidance via proximal message passing. In American Control Conference (ACC), July 2015. Tom Schouwenaars, Mario Valenti, Eric Feron, and Jonathan How. Implementation and ﬂight test results of MILP-based UAV guidance. In IEEE Aerospace Conference, pages 1–13. IEEE, 2005. 60
61. 61. Impact of search timeout conﬂict decreases with iteration limit but with diminishing returns 2 3 4 5 6 4.8 4.9 5 5.1 5.2 5.3 ×10−3 iteration limit conﬂictprobability 61
62. 62. Restriction messages for consensus search 2 4 6 8 10 0.5 1 1.5 2 2.5 3 3.5 ×10−3 number of uavs conﬂictprobability consensus (maxmin) restriction (maxmin) instead of broadcasting joint action, aircraft broadcast restriction messages that indicate what other aircraft shouldn’t do – e.g., reward straight path when it receives no left and no right messages at a decision period 62