# Automated design of multiphase space missions using hybrid optimal control


## I. Introduction

Many interesting problems in numerical optimization are hybrid optimal control (HOC) problems. HOC problems include both continuous-valued variables and categorical variables in the problem formulation. For the types of problems envisioned here, the categorical variables specify the structure or sequence of events that qualitatively describes a space trajectory or mission. For example, for an interplanetary spacecraft trajectory, a mission could be described by the sequence of categorical variables: Earth departure, low-thrust heliocentric arc, Mercury arrival. An equally valid and perhaps lower-cost sequence might be: Earth departure, low-thrust heliocentric arc, coast arc, low-thrust heliocentric arc, Mercury arrival. The identities of the mission events and their order in the sequence are the discrete or categorical variables of the HOC problem. The time histories of the spacecraft position and velocity and of the spacecraft control (thrust-pointing angle) are the continuous-time variables of the HOC problem. The cost associated with a particular event sequence is found from the solution of the corresponding continuous optimal control problem.

Finding the sequence of events with minimum cost requires searching a discrete space composed of the sequences resulting from the permutation of the mission events. Even with a catalog of only a few events, there may be several thousand possible mission designs. A simple approach to find the best mission structure is to perform a total enumeration of all the possible sequences. Although intuition on the part of the mission planner can reduce the size of the search space, total enumeration would still consume a significant amount of resources. A solver is needed for this type of problem, i.e., finding the best mission structure without performing total enumeration, and with little or no intuition supplied (or required) on the part of the mission planner.
One way to reduce the discrete search space is to apply pre-pruning with respect to some criteria; in this way, the remaining mission sequences are the only ones considered of interest [1]. Vasile et al. [1] also present an approach to automated mission planning, in which the mission is composed of elementary blocks that represent events such as Lambert maneuvers and flybys. Englander et al. [2] and Gad and Abdelkhalik [3] presented methods that find the flyby sequence and the optimal trajectory using genetic algorithms (GA). In their work, the governing dynamics do not change during the trajectory legs between flybys, i.e., such legs can be modeled with a single phase. Ceriotti and Vasile [4] propose the use of a modified ant colony optimization (ACO) algorithm for automated mission planning using flybys and impulsive maneuvers.

In this work, the problem of interest is the automated solution of HOC problems that consist of different events or phases. The problem consists of finding the sequence composed of an unspecified number of events (coast arcs, thrust arcs, and impulses) that minimizes propellant consumption for a space trajectory with given initial and boundary conditions, conducted in free or fixed final time. A recent approach to solving HOC problems is to use two nested loops: an outer-loop that handles the discrete dynamics and finds a solution sequence in terms of the categorical variables, and an inner-loop that performs the optimization of the continuous-time dynamical system and obtains the required control history. Ross and D’Souza [5] present a general framework for the description of HOC problems and the corresponding mathematical formalism. The nested-loop approach is qualitatively similar to the solution of a discrete optimization problem using GA [6], where the objective is usually some known, analytic function of the discrete parameters of the GA that can be evaluated directly.
In the HOC context, however, the cost cannot be found until an optimal control problem (with important event parameters supplied by the GA decision vector) is solved. This is much more difficult and time-consuming: an analytic function always returns a cost value, whereas the routine used to solve the inner-loop problem inherits the difficulties associated with solving optimal control problems. The inner-loop solver finds the optimal trajectories for the continuous-time dynamical systems associated with the event sequences generated by the outer-loop. Note that distinct event sequences can constitute quite different optimal control problems. For instance, the equations of motion during a coast arc event are not the same as the equations of motion during a continuous thrust arc event. Similarly, the control parameters change in type and number; an impulse is defined by magnitude and direction parameters, while a thrust arc needs a flight time parameter and a continuous thrust-pointing angle (or angles). Because the outer-loop requests the evaluation of different sequences during the search, this work presents a scheme that allows the inner-loop to solve trajectory optimization problems with variable structures without a priori knowledge or experience. This is a challenging aspect of the problem because the determination of just a single optimal space trajectory with (possibly) multiple coast arcs and thrust arcs has not been considered a simple or straightforward problem.

The optimal trajectory for a particular mission sequence is found using direct transcription with nonlinear programming (NLP) [7]. As a gradient-based optimization method, NLP requires an initial guess, which after several numerical iterations by the NLP problem solver should converge to a solution, i.e., satisfy specified tolerances on feasibility and optimality. Traditionally, the mission planner has had to resort to intuition and experience to generate such initial guesses.
For example, solutions to Lambert’s problem [8] can provide good initial guesses for impulsive trajectories. If the solution process (outer-loop + inner-loop) is to be automated, methods that generate high-quality guesses are needed to yield a robust behavior of the NLP problem solver. A new method was developed to generate the approximate low-thrust trajectories to be used as the initial guesses for the NLP solvers. The method, based on GA, directly approximates optimal control histories by incorporating boundary conditions explicitly using a “conditional penalty” (CP) function [9].

## II. Hybrid Optimal Control Problem

The solution to the HOC problem consists of finding values for a set of discrete variables that minimize a cost function resulting from the trajectory optimization of a continuous-time dynamical system. One example of a HOC problem is the motorized traveling salesman [10]. The salesman drives a car whose motion dynamics are described by a system of differential equations. He must visit a number of cities whose locations are specified in the problem. The objective is to find the ordered sequence of cities and the corresponding optimal trajectory that minimizes the travel time of the salesman. The identities of the cities and their order of visitation are the discrete or categorical variables of the HOC problem. The time histories of the car position and velocity and of the car controls (acceleration and turn rate) are the continuous-time variables of the HOC problem. This section presents the general formulation of a HOC problem using the formalism introduced by Ross and D’Souza [5].

### A. Discrete Dynamics

The discrete events represent the qualitative states or phases of a HOC problem. They can be grouped together into a categorical space $Q$ of finite cardinality $N_Q \in \mathbb{N}$. The task of the mission planner is to assemble a sequence $q$ of events $q \in Q$ that fulfills the mission objectives and minimizes a cost function defined in the inner-loop.
The construction of the sequence $q$ incorporates constraints that form the finite dynamics of the problem. Such constraints can be modeled with a finite state automaton in the form of a directed graph or digraph as shown in Fig. 1. The nodes and the directed edges constitute the events and the allowed transitions between them, respectively. From the digraph, the categorical state space is $Q = \{q_a, q_b, q_c\}$ with cardinality $N_Q = 3$. The subscript notation in $q_i$ identifies the state in the categorical space.

AIAA Early Edition / CHILAN AND CONWAY. Downloaded by UNIVERSITY OF LIVERPOOL on August 29, 2013 | http://arc.aiaa.org | DOI: 10.2514/1.58766

To encode the information contained in the digraph, let a switching set be defined as the transition between states at time $t_s \in \mathbb{R}$:

$$S(q, q') = \{(x, u, x', u', t_s)\} \tag{1}$$
where $x \in \mathbb{R}^{N_x}$ and $u \in \mathbb{R}^{N_u}$ are the continuous state and control vectors, respectively. If a transition is allowed from the event $q$ to the event $q'$, then the switching set $S(q, q') \neq \emptyset$; otherwise, the switching set is empty. All the possible switching sets can be encoded in the $N_Q \times N_Q$ adjacency matrix $A$, where

$$A_{ij} = \begin{cases} 1 & \text{if } S(q_i, q_j) \neq \emptyset \\ 0 & \text{if } S(q_i, q_j) = \emptyset \end{cases} \tag{2}$$

Feasible event sequences can be obtained from the digraph. One example of such a sequence is $q = (q_a, q_b, q_a, q_c, q_a)$, where the number of switches is $N_s = 4$. The digraph also shows transitions that are not allowed. For example, the automaton cannot move from the state $q_c$ to $q_b$ because there is no directed edge in that direction. Thus, the switching set $S(q_c, q_b)$ is empty. The number of events in a sequence, given by $N_{s+1} = N_s + 1$, is not fixed. In practice, however, an upper bound $N_{s,\max}$ on the length of the feasible sequences is given to constrain the problem to finite sequences and to reduce the computational complexity of the problem.

Let $Q$ be a $1 \times N_Q$ matrix whose components are the elements $q_i \in Q$. Let also the operation $\cdot$ be defined over the Cartesian product $Q \times \{0, 1\}$ in the following way:

$$q \cdot 0 = \emptyset \quad \text{and} \quad q \cdot 1 = q, \quad \forall\, q \in Q$$

Also, let $D$ be defined by

$$D^{n \times m} \equiv \left\{ \Delta \in \{0, 1\}^{n \times m} : \sum_{i=1}^{n} \Delta_{ij} \in \{0, 1\} \;\; \forall\, j = 1, \ldots, m \right\} \tag{3}$$

where the column-sum property implies that the columns are composed mostly of zeros, with at most a single component equal to 1. The framework proposed originally does not consider the case where an event must not appear more than once in the solution sequence [5]. This constraint can be applied by requiring that the members of $D$ have full column rank.
Finally, by including the following definitions:

$$q + \emptyset = q = \emptyset + q, \qquad \emptyset + \emptyset = \emptyset$$

the transition map $Q \to q$ can be characterized through the discrete controller matrix $\Delta$, with $\Delta \in D^{N_Q \times N_{s+1}}$, as a matrix operation that generates an event sequence of length $N_{s+1}$:

$$q = Q\,\Delta \tag{4}$$

In constructing the discrete controller $\Delta$, it is important to recall that not all transitions between the events of the categorical set are allowed. These constraints are encoded in the adjacency matrix $A$ as a model for the finite state automaton. Therefore, the controller $\Delta$ must satisfy

$$\Delta_{i,j} \in \{A_{ki}, 0\} \quad \text{for } \Delta_{k,j-1} = 1; \quad i = 1, \ldots, N_Q; \quad j = 1, \ldots, N_s \tag{5}$$

Let the superscript notation $q^{j-1}$ identify the event in the $j$th place in the sequence. Then, the condition in Eq. (5) indicates that if the current event $q^{j-1}$ is $q_k$, then the next event $q^j$ can be $q_i$ if $A_{ki}$ is equal to 1. If this were the case and the next event was chosen to be $q_i$, then $\Delta_{k,j-1} = 1$ and $\Delta_{i,j} = 1$, where the rest of the elements in the columns $j-1$ and $j$ of $\Delta$ are zero due to the column-sum property in Eq. (3).

The first and last columns of $\Delta$ are not directly constrained by the adjacency matrix $A$, but they must satisfy discrete boundary conditions. Let $Q_0 \subseteq Q$ be the set of all the allowed initial events, and $Q_f \subseteq Q$ be the set of all permissible final events. Then, the initial event $q^0$ of all feasible sequences belongs to $Q_0$. In the same way, the last event $q^{N_s}$ of all feasible sequences belongs to $Q_f$. Finally, let $U_D \subseteq D$ be the set of discrete controllers $\Delta$ that satisfy adjacency and boundary constraints. Then, the problem to be solved by the outer-loop is a feasible integer programming (FIP) problem that can be stated as follows:

$$\text{Find } \Delta,\, N_{s+1} \quad \text{subject to} \quad \Delta \in U_D \subseteq D^{N_Q \times N_{s+1}}; \quad N_{s+1} \le N_{s,\max} \in \mathbb{N} \tag{6}$$

Assuming that the inner-loop handling the continuous-time dynamics can find the optimal trajectory for any feasible sequence, each candidate $q$ will have an associated cost.
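The role of the discrete controller in Eqs. (3)–(5) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the text does not list the adjacency matrix of Fig. 1, so `FIG1_ADJACENCY_ASSUMED` is a hypothetical matrix chosen to agree with the two facts given above (the sequence $q_a, q_b, q_a, q_c, q_a$ is feasible, and $q_c \to q_b$ is not allowed). The function names are likewise illustrative and not taken from the paper.

```python
from typing import List, Optional, Sequence

Matrix = List[List[int]]

# Hypothetical adjacency matrix for the three-state digraph (rows/columns: qa, qb, qc).
# ASSUMPTION: only "qc -> qb is not allowed" and one feasible sequence are given in the text.
FIG1_ADJACENCY_ASSUMED: Matrix = [
    [0, 1, 1],  # qa -> qb, qa -> qc
    [1, 0, 0],  # qb -> qa
    [1, 0, 0],  # qc -> qa
]


def selected_rows(delta: Matrix) -> List[Optional[int]]:
    """Return, per column, the row index holding the single 1 (None if the column is all zeros)."""
    n_rows, n_cols = len(delta), len(delta[0])
    picks: List[Optional[int]] = []
    for j in range(n_cols):
        column = [delta[i][j] for i in range(n_rows)]
        if any(v not in (0, 1) for v in column) or sum(column) > 1:
            raise ValueError(f"column {j} violates the column-sum property of Eq. (3)")
        picks.append(column.index(1) if 1 in column else None)
    return picks


def decode_controller(events: Sequence[str], delta: Matrix) -> List[str]:
    """Mimic q = Q * Delta of Eq. (4): each one-hot column selects one event."""
    if len(events) != len(delta):
        raise ValueError("delta needs one row per categorical event")
    return [events[i] for i in selected_rows(delta) if i is not None]


def satisfies_adjacency(delta: Matrix, adjacency: Matrix) -> bool:
    """Check Eq. (5): consecutive columns must follow an allowed directed edge."""
    picks = selected_rows(delta)
    if None in picks:  # assumption: every column must select an event
        return False
    return all(adjacency[k][i] == 1 for k, i in zip(picks, picks[1:]))
```

With rows ordered as $(q_a, q_b, q_c)$, the controller whose columns select $q_a, q_b, q_a, q_c, q_a$ decodes to that sequence and passes the adjacency check, whereas a controller containing the column pair $q_c, q_b$ fails it.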
The objective of the outer-loop solver is finding the sequence of events that has the optimal cost among all the feasible sequences with length $N_{s+1} \le N_{s,\max}$. The complexity of the search space for the FIP is $N_Q^{N_{s+1}}$ because $\Delta \in D$. The fact that this type of problem is NP-complete [5,10] underscores the impracticality of total enumeration as a means to find the optimal feasible sequence. A more efficient approach known as branch-and-bound optimization has been used in the outer-loop to solve the FIP [10,11]. This work introduces the use of the GA for the same purpose [12]. The GA [6] method is a relatively new technique (in comparison with the calculus of variations or primer vector theory [13]) that has been successfully applied to trajectory optimization problems [14,15]. The GA requires a population of individuals; an individual is a set of values for optimization parameters that is encoded as a string of binary digits (a chromosome). In a HOC problem, a typical string might consist of binary representations of states or events in a sequence. GA methods have features that make them appealing for use in an automated mission planner. For instance, planning intuition in the form of an incumbent solution sequence is not required. Although GA does not ensure that the solution found represents a global optimum [16], the fact that the method is randomized allows the search to continue even when a local minimum is found, unlike gradient-based methods.

### B. Continuous-Time Dynamics

A modern space mission is usually composed of several events. Some of them have finite duration, such as thrust arcs, while others are instantaneous, such as impulsive maneuvers. The dynamics governing the system during any of the events $q \in Q$ is defined by a set of differential equations

$$\dot{x} = f(x, u, t, q) \tag{7}$$

where $x \in \mathbb{R}^{N_x}$ and $u \in \mathbb{R}^{N_u}$ are the continuous state and control vectors, respectively.
The dependence on $q$ means that the system dynamics may change throughout the mission, e.g., a model could require switching from a thrust arc to a coast arc.

Fig. 1 Digraph for a finite state automaton (nodes $q_a$, $q_b$, $q_c$).

Let $q = (q^0, q^1, \ldots, q^{N_s})$ be a finite sequence of events where $q^j \in Q$ for $j = 0, 1, \ldots, N_s$. Let $t = (t_0, t_1, \ldots, t_{N_s+1})$ be a
real-valued matrix associated with $q$. Then, the cost functional is given by

$$J[x(\cdot), u(\cdot), t, q, \Delta, N_s] = \sum_{j=0}^{N_s} \left[ \phi\big(x(t_{j+1}), t_{j+1}, q^j\big) + \int_{t_j}^{t_{j+1}} L\big(x(t), u(t), q^j\big)\, dt \right] \tag{8}$$

where $\phi$ and $L$ correspond to the Mayer and Lagrange costs of each phase. As with most optimal control problems of significant complexity, direct methods are preferred to avoid the difficulties associated with the derivation and solution of the two-point boundary value problem (TPBVP) resulting from the Euler–Lagrange conditions [17] of the calculus of variations (COV). There are two principal schemes that directly transform continuous-time optimal control problems into NLP problems: collocation and transcription. Both cause the equations of motion (EOM) to be satisfied by defining nonlinear constraint equations. Direct collocation describes such constraint equations with implicit rules, such as Hermite–Simpson [18] and Gauss–Lobatto [19], while direct transcription uses explicit rules, such as the fourth-order Runge–Kutta (RK) method [7]. Given that NLP is a gradient-based method, it requires an initial guess of the solution, i.e., the vector of solution parameters, which can be obtained by intuition or experience. The robustness and accuracy of the method depend on the selected resolution level of the time mesh, the collocation or transcription scheme, and the quality of the initial guess, that is, how closely the guess satisfies the constraints and the optimality conditions. For the implementation of the inner-loop solver, direct transcription with RK integration rules and a parallel-shooting scheme [7,12] is selected. The NLP parameters can then be arranged as a single vector $P^T$ that collects all the continuous variables.
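To make the idea of satisfying the EOM through nonlinear constraint equations concrete, the following Python sketch forms the defect constraints of a direct transcription with one explicit RK4 step per segment. It is a minimal sketch under stated assumptions rather than the solver used in this work: the mesh is uniform, the dynamics are time-invariant with a scalar control, the control is linearly interpolated inside each segment, and `double_integrator` is a hypothetical toy system used only for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Dynamics = Callable[[Sequence[float], float], Vector]


def rk4_step(f: Dynamics, x: Sequence[float], u0: float, u1: float, h: float) -> Vector:
    """One classical RK4 step of size h; control linearly interpolated (assumption)."""
    um = 0.5 * (u0 + u1)  # control at the segment midpoint
    k1 = f(x, u0)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)], um)
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)], um)
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)], u1)
    return [
        xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def defects(f: Dynamics, xs: Sequence[Vector], us: Sequence[float], t_f: float) -> List[Vector]:
    """Defect of each segment: state at node k+1 minus the RK4 propagation from node k.

    An NLP solver would drive every defect to zero (lower bound = upper bound = 0).
    """
    n_seg = len(xs) - 1
    h = t_f / n_seg  # uniform mesh (assumption)
    out: List[Vector] = []
    for k in range(n_seg):
        x_pred = rk4_step(f, xs[k], us[k], us[k + 1], h)
        out.append([a - b for a, b in zip(xs[k + 1], x_pred)])
    return out


def double_integrator(x: Sequence[float], u: float) -> Vector:
    """Hypothetical toy dynamics: x = [position, velocity], u = acceleration."""
    return [x[1], u]
```

If the node states are generated by propagating the first node with `rk4_step`, every defect vanishes; perturbing a single node produces nonzero defects in the segments that touch it, which is exactly the violation the NLP solver is asked to remove.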
For example, the parameter vector for a trajectory consisting of a single thrust arc becomes $P^T = [Z^T]$, where $Z^T = [x_1^T, u_1^T, x_2^T, u_2^T, \ldots, x_{N+1}^T, u_{N+1}^T, t_f]$; $x_i \in \mathbb{R}^{N_x}$ and $u_i \in \mathbb{R}^{N_u}$ for $i = 1, 2, \ldots, N+1$ are parameter vectors that represent the state and control variables at the nodes of the discrete time mesh, and $N$ is the number of segments of the mesh [7]. In the same manner, the nonlinear constraints can be collected into a vector $C^T$. The optimal control problem can then be restated as a NLP problem of the form:

$$\text{Minimize } J(P) \quad \text{subject to} \quad b_L \le \begin{Bmatrix} P \\ AP \\ C(P) \end{Bmatrix} \le b_U$$

where $AP$ is formed by all the linear constraints of the problem, and $b_L$ and $b_U$ are the lower and upper bounds of the parameters and constraints. The upper and lower bounds for the great majority of the nonlinear constraints $C(P)$ are usually set to zero, because this forces the solver to choose values for the parameters that satisfy the EOMs when they are integrated forward using the RK rule within each segment (there may be a small number of additional nonlinear constraints, e.g., boundary conditions). Once the NLP problem is clearly defined, it can be solved using dense or sparse solvers such as NPSOL and SNOPT [20]. SNOPT is selected because it can take advantage of the sparsity present in the constraint Jacobian [18].

## III. Multiphase Mission Design as a Hybrid Optimal Control Problem

The validity of the proposed GA + NLP approach for the implementation of hybrid optimizers was shown by solving two sample problems: the motorized traveling salesman [12] and the interception of multiple asteroids [21]. However, these problems contain several simplifications that make them qualitatively different from multiphase missions. For instance, the categorical variables represent targets such as cities and asteroids, and the length of the categorical sequence is constant.
This allowed a straightforward definition of the GA chromosome and a static NLP structure, i.e., the structure remained the same for any sequence, requiring only changing the value of the interception constraints in the inner-loop. In a multiphase mission, the categorical variables represent events such as coast arcs and thrust arcs, and the length of the mission sequence is variable. Therefore, approaches to accommodate variable-length sequences and to generate different NLP problem structures dynamically are needed. The problems of interest here are missions composed of an unspecified number of events with given initial and boundary conditions, and free or fixed final time. The discrete and continuous-time problems associated with the mission design will be described along with the proposed methods of solution.

### A. Discrete Dynamics

The discrete events in a space mission plan can be maneuvers such as impulses, thrust arcs, and coast arcs, which can be grouped together into a categorical space $Q$ of finite cardinality $N_Q \in \mathbb{N}$. Although it appears that only these three types of events need to be defined, considerations regarding the robustness of the trajectory optimization in the inner-loop, to be described in the following sections, warrant the definition of additional, more specifically defined events as shown in Table 1. Therefore, the categorical space becomes

$$Q = \{q_0, q_1, q_2, q_3, q_4\} = \{c, i, s, l, t\} \tag{9}$$

with cardinality $N_Q = 5$. The subscripts identify the event within the categorical space. The goal is to assemble a sequence $q$ of events $q \in Q$ that fulfills the mission objectives and minimizes a cost function defined in the inner-loop. The construction of the sequence $q$ incorporates constraints that are consistent with the discrete dynamics of the problem. For instance, to satisfy state boundary conditions, only a Lambert’s rendezvous $l$ or a boundary-specified thrust arc $t$ can be the final event of the sequence [9].
Also, a coast is placed at the beginning of the mission and between thrusting maneuvers [9]; for example, sequences of the form $(i, i)$ or $(s, s)$ are not allowed. These constraints help reduce the size of the discrete search space. The discrete constraints can be modeled with a finite state automaton in the form of a directed graph or digraph, as shown in Fig. 2. The nodes and the directed edges constitute the mission events and the allowed transitions between them, respectively. The directed edge that does not start from a node indicates that the sequences start with event $c$; the double-circled nodes specify that the sequences end with events $l$ or $t$.

Table 1 Categorical events for mission design

| Event (code) | Description |
| --- | --- |
| Coast arc ($c$) | A finite coasting arc |
| Impulse ($i$) | An instantaneous impulsive maneuver |
| Boundary-free thrust arc ($s$) | A thrust arc without given boundary conditions, subject only to feasibility with respect to the EOMs |
| Lambert’s rendezvous ($l$) | A rendezvous composed of an impulse, a coast arc, and another impulse that satisfies given boundary conditions |
| Boundary-specified thrust arc ($t$) | A thrust arc with given boundary conditions |

These constraints form the discrete boundary conditions

$$Q_0 = \{c\} \tag{10}$$
$$Q_f = \{l, t\} \tag{11}$$

Feasible event sequences can be obtained from the digraph. An example of a feasible sequence is

$$q = (q^0, q^1, q^2, q^3, q^4, q^5, q^6, q^7) = (c, s, c, i, c, s, c, t)$$

where the number of switches is $N_s = 7$. The superscripts specify the place of the event in the sequence. The digraph also shows transitions that are not allowed. For example, the automaton cannot move from the event $i$ to event $s$, because there is no directed edge in that direction. The information contained in the digraph can be encoded following the notation and analysis shown in Sec. II for the definition of the switching sets and the adjacency matrix, which becomes, for the digraph of Fig. 2,

$$A = \begin{bmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \tag{12}$$

The number of events in a sequence, given by $N_{s+1} = N_s + 1$, is not fixed. An upper bound $N_{s,\max}$ is given to constrain the problem to finite sequences and to make the problem tractable. Let $F$ be the set of all the feasible sequences, i.e., event sequences that satisfy the adjacency constraints and the discrete initial and boundary conditions. The problem to be solved by the outer-loop can then be stated as follows:

$$\min_{q}\; \phi(q) \quad \text{subject to} \quad q \in F; \quad N_{s+1} \le N_{s,\max} \in \mathbb{N}$$

The evaluation of the function $\phi$ is carried out by the inner-loop solver, which will provide the outer-loop with the optimal cost of the trajectory corresponding to the argument sequence.

### B. Continuous-Time Problem

The continuous-time problem of multiphase mission design is to find the optimal trajectory that minimizes the objective function, usually propellant consumption or time of flight, satisfies given initial and boundary conditions for the continuous-time state variables, and has a given mission structure described by an event sequence. The total mission time can be specified or free.
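As an aside on the discrete constraints of Sec. III.A, the adjacency matrix of Eq. (12) and the boundary sets of Eqs. (10) and (11) are enough to test whether an event sequence belongs to the feasible set $F$. The Python sketch below shows one way to do this and counts feasible sequences by brute force for short lengths; the names `is_feasible` and `count_feasible` are illustrative choices, and the brute-force count is meant only to show the size of $F$, not as a method used in this work.

```python
from itertools import product
from typing import Sequence

# Event order follows Eq. (9): (q0, q1, q2, q3, q4) = (c, i, s, l, t).
EVENTS = ("c", "i", "s", "l", "t")

# Adjacency matrix of Eq. (12); A[a][b] == 1 means event a may be followed by event b.
A = (
    (0, 1, 1, 1, 1),  # c -> i, s, l, t
    (1, 0, 0, 0, 0),  # i -> c
    (1, 0, 0, 0, 0),  # s -> c
    (0, 0, 0, 0, 0),  # l is terminal
    (0, 0, 0, 0, 0),  # t is terminal
)
Q0 = {"c"}       # allowed initial events, Eq. (10)
QF = {"l", "t"}  # allowed final events, Eq. (11)


def is_feasible(seq: Sequence[str]) -> bool:
    """True if seq satisfies the boundary conditions and every transition is an edge of A."""
    if not seq or seq[0] not in Q0 or seq[-1] not in QF:
        return False
    idx = [EVENTS.index(e) for e in seq]
    return all(A[a][b] == 1 for a, b in zip(idx, idx[1:]))


def count_feasible(n_events: int) -> int:
    """Brute-force count of feasible sequences with exactly n_events events."""
    return sum(is_feasible(s) for s in product(EVENTS, repeat=n_events))
```

For sequences of six events, for instance, this enumeration visits $5^6 = 15625$ candidates, of which only those alternating a coast with a thrusting event and ending in $l$ or $t$ pass the test.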
Because a multiphase trajectory can use an electric engine for providing a continuous thrust arc and a chemical rocket for impulses, parameters such as the initial thrust acceleration $\alpha_0$ and the exhaust velocities of the propulsion engines must be given. Using polar coordinates on a spacecraft-fixed basis, the equations of motion become [9]

$$\begin{aligned} \dot{r} &= v_r \\ \dot{\theta} &= \frac{v_\theta}{r} \\ \dot{v}_r &= \frac{v_\theta^2}{r} - \frac{1}{r^2} + f(q)\,\alpha \sin\beta \\ \dot{v}_\theta &= -\frac{v_\theta v_r}{r} + f(q)\,\alpha \cos\beta \\ \dot{\alpha} &= f(q)\,\frac{\alpha^2}{c_e} \end{aligned} \tag{13}$$

where

$$f(q) = \begin{cases} 0 & \text{if } q = c \\ 1 & \text{if } q \in \{s, t\} \end{cases}$$

$\alpha$ is the thrust acceleration, $c_e$ is the exhaust velocity of the electric engine, and $\beta$ is the control angle describing the direction of the thrust. Because the thrust magnitude provided by the electric engine is constant, and the rate of change of the thrust acceleration appears in the EOMs governing the finite events, it is convenient to use the thrust acceleration as a measure of the spacecraft’s mass throughout the trajectory. An impulsive maneuver is modeled as changing both mass and velocity instantaneously. An equation relating such change is given in [22]:

$$\frac{m^- - m^+}{m^-} = 1 - e^{-\Delta V_c / c_c} \tag{14}$$

where $c_c$ is the exhaust velocity of the chemical rocket, and the superscripts $-$ and $+$ refer to instants immediately before and immediately following the event. Then,

$$\frac{m^- - m^+}{m^-} = 1 - \frac{\alpha^-}{\alpha^+} \tag{15}$$

using Newton’s second law. Substituting Eq. (15) into Eq. (14), the states must satisfy the constraints [9]:

$$\begin{aligned} r^+ &= r^- \\ \theta^+ &= \theta^- \\ v_r^+ &= v_r^- + \Delta V_c \sin\beta \\ v_\theta^+ &= v_\theta^- + \Delta V_c \cos\beta \\ \alpha^+ &= \alpha^- e^{\Delta V_c / c_c} \end{aligned} \tag{16}$$

where $\Delta V_c$ is the impulse magnitude, and $\beta$ is the direction of the impulse in a spacecraft-fixed basis. An analytical expression for the thrust acceleration as a function of the thrusting time $t_e = t^+ - t^-$ can be obtained from Eq. (13):

$$\int_{\alpha^-}^{\alpha^+} \frac{da}{a^2} = \int_0^{t_e} \frac{d\tau}{c_e}$$

which yields

$$\alpha^+ = \frac{1}{\dfrac{1}{\alpha^-} - \dfrac{t_e}{c_e}} \tag{17}$$

Minimizing the sum of the thrusting times can optimize a mission consisting of multiple thrust arcs. However, this approach is not appropriate if electric engines with different exhaust velocities are used, because their difference in efficiency is not taken into account.
Fig. 2 Digraph of a finite state automaton for mission design (nodes $c$, $i$, $s$, $l$, $t$).

For instance, consider a trajectory with $n$ thrust arcs where $t_i$ and $c_i$ represent the thrusting time and the exhaust velocity of the engine used during thrust arc $i$. Then, according to Eq. (17), the final thrust acceleration due to propellant consumption is given by [9]
$$\alpha_f = \frac{1}{\dfrac{1}{\alpha_0} - \displaystyle\sum_{i=1}^{n} \dfrac{t_i}{c_i}} \tag{18}$$

Minimizing the sum of the normalized thrusting times $t_i / c_i$ as defined in Eq. (18) thus minimizes overall propellant consumption for the case when engines with different exhaust velocities are used. For impulsive maneuvers, the change in velocity $\Delta V$ has been used as a measure of the propellant consumed. Therefore, the total amount of propellant used in a trajectory that uses multiple impulses is described as the sum of the corresponding $\Delta V$s. As with the low-thrust case, summing $\Delta V$s provided by rockets with different exhaust velocities does not describe the total propellant used. Consider a trajectory with $n$ impulses where $\Delta V_i$ and $c_i$ represent the change in velocity of the spacecraft and the exhaust velocity of the rocket used during impulse $i$. According to Eq. (16), the total change in thrust acceleration due to propellant consumption is given by [9]

$$\alpha_f = \alpha_0 \exp\!\left( \sum_{i=1}^{n} \frac{\Delta V_i}{c_i} \right) \tag{19}$$

Thus, the sum of the normalized changes in velocity $\Delta V_i / c_i$ is a better measure of propellant use, because values corresponding to different rockets can be added to determine the total change in mass.

Optimizing a system that uses both low-thrust and chemical propulsion in the same trajectory requires a uniform metric to minimize. Equating the expressions for the thrust acceleration $\alpha^+$ in Eqs. (16) and (17) yields an expression that relates the normalized thrusting time $t_e / c_e$ of the electric engine to the normalized change in velocity $\Delta V_c / c_c$ provided by the chemical rocket:

$$\frac{t_e}{c_e} = \frac{1}{\alpha^-} \left( 1 - e^{-\Delta V_c / c_c} \right) \tag{20}$$

The mission planner can now choose which measure of propellant to minimize. In this work, the total normalized thrusting time is used as the objective. Thus, if impulsive maneuvers are used, their $\Delta V$s are converted to normalized thrusting time using Eq. (20).

Let $q = (q^0, q^1, \ldots, q^{N_s})$ be the finite sequence of events provided by the outer-loop, where $q^j \in Q$ for $j = 0, 1, \ldots, N_s$. Let $t = (t_0, t_1, \ldots, t_{N_s+1})$ be a real-valued time matrix associated with $q$.
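The equivalence expressed by Eq. (20) can be checked numerically. The Python sketch below implements the thrust-acceleration updates of Eqs. (16) and (17) and the conversion of Eq. (20); the function names are illustrative, and any numerical values passed to them are hypothetical (the text does not give units or sample values).

```python
import math


def accel_after_thrust_arc(alpha_before: float, t_e: float, c_e: float) -> float:
    """Eq. (17): thrust acceleration after thrusting for a time t_e with exhaust velocity c_e."""
    return 1.0 / (1.0 / alpha_before - t_e / c_e)


def accel_after_impulse(alpha_before: float, delta_v: float, c_c: float) -> float:
    """Last relation of Eq. (16): thrust acceleration after an impulse of magnitude delta_v."""
    return alpha_before * math.exp(delta_v / c_c)


def equivalent_normalized_thrust_time(alpha_before: float, delta_v: float, c_c: float) -> float:
    """Eq. (20): normalized thrusting time t_e / c_e that consumes the same mass as the impulse."""
    return (1.0 - math.exp(-delta_v / c_c)) / alpha_before
```

As a usage example with hypothetical numbers, take `alpha_before = 0.01`, `delta_v = 0.1`, `c_c = 0.4`, and `c_e = 3.0`: computing `tau = equivalent_normalized_thrust_time(0.01, 0.1, 0.4)` and then `accel_after_thrust_arc(0.01, tau * 3.0, 3.0)` should reproduce `accel_after_impulse(0.01, 0.1, 0.4)` to rounding error, which is the statement that both maneuvers leave the spacecraft with the same mass.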
Given $q$ and $t$, the optimal control problem consists of minimizing the cost functional given by

$$J[x(\cdot), u(\cdot), t, q, N_s] = \sum_{j=0}^{N_s} g(q^j) \tag{21}$$

where

$$g(q^j) = \begin{cases} 0 & \text{if } q^j = c \\[4pt] \dfrac{1}{\alpha_j^-} \left( 1 - e^{-\Delta V_{c,j} / c_c} \right) & \text{if } q^j = i \\[8pt] \dfrac{t_{j+1} - t_j}{c_e} & \text{if } q^j \in \{s, t\} \\[8pt] \displaystyle\sum_{k=1}^{2} \dfrac{1}{\alpha_{j,k}^-} \left( 1 - e^{-\Delta V_{c,j,k} / c_c} \right) & \text{if } q^j = l \end{cases}$$

subject to Eqs. (13) and (16) and

$$x(t_0) = x_0 \tag{22a}$$

$$x(t_{N_s+1}) = x_f(t_{N_s+1}) \tag{22b}$$

$$t_{j-1} \le t_j \le t_{j+1}, \quad \text{for } j = 1, \ldots, N_s \tag{22c}$$

$$t_0 = 0 \tag{22d}$$

$$t_{N_s+1} \;\; \text{free or fixed} \tag{22e}$$

where $x \in \mathbb{R}^5$ and $u \in \mathbb{R}$ are the continuous state and control vectors, respectively. In view of the fact that the inner-loop solver is expected to find the optimal state trajectory for any given event sequence $q$, if it exists, robustness is more important than high accuracy. With this in mind, the proposed method of solution uses direct transcription with NLP using RK integration rules and a parallel-shooting scheme [7]. To maximize robustness for a given number of segments, only one RK step is used on each segment. The method used in the inner-loop to generate a good initial guess and to solve the continuous optimal control problem automatically with robustness and accuracy will be described in Sec. V.

## IV. Transcription of the Mission Event Sequence

The discrete component of the HOC problem consists of finding the feasible categorical sequence that has minimum cost. In the mission design context, this means finding the sequence of events that achieves the mission objectives while minimizing propellant consumption. The successful solutions of the sample problems in [12] suggest the use of GA for the implementation of the outer-loop solver. In those problems, the discrete constraints were enforced by assigning a large constant cost to infeasible sequences. This approach is not practical for the multiphase problem because the discrete dynamics are more complex. A new model that transforms the constrained discrete optimization problem into an unconstrained problem [9] and searches for the solution only in the feasible discrete space is presented.
The proposed approach also deals with categorical sequences of variable length. It is known that optimization algorithms improve their performance if the search is carried out only in the feasible space defined by the given constraints [23]. In the mission design problem the potential for improvement is significant, because the specified discrete constraints define a feasible space that is much smaller than the total discrete space. Searching only in the feasible space requires the transformation of the constrained optimization problem into an unconstrained problem. The proposed GA model applies the specified constraints implicitly, allowing every individual to represent a feasible event sequence [9].

According to Fig. 2, only the thrusting events need to be described in a sequence, because coast arcs are always to be placed between them and at the beginning of the sequence. Midcourse events can be modeled using a binary digit, because there are only two allowed event types (0 for $i$, 1 for $s$). Similarly, a binary digit can also represent the last event in a sequence (0 for $l$, 1 for $t$). As a result, a feasible sequence can be modeled using a binary string, in which every bit represents a thrusting event. The remaining issue is to determine how to handle sequences with a variable number of events. A fixed-size binary string can represent sequences with variable length by describing the number of thrusting events $N_T$ as the location of the leading 1-bit in the chromosome [9]. The leading 1-bit does not represent an event itself; it is just a marker stating that only the bits that follow are to be taken into account. In addition to the sequence events, a categorical variable specific to mission design, to be included in the chromosome, is the number of revolutions to be performed by the spacecraft. If the maximum number of revolutions allowed is three, it is necessary to add two bits to the chromosome.
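The decoding convention just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `decode_chromosome` is hypothetical, and the two revolution bits are assumed to be the trailing bits of the string, following the layout of the sample chromosome in Fig. 3.

```python
from typing import List, Tuple

N_REV_BITS = 2  # two bits encode 0 to 3 revolutions; ASSUMED to be the trailing bits


def decode_chromosome(bits: str) -> Tuple[List[str], int]:
    """Decode a binary string into (event sequence, number of revolutions).

    Layout assumed: [zero padding][leading 1-bit marker][thrusting-event bits][revolution bits].
    """
    if len(bits) <= N_REV_BITS or set(bits) - {"0", "1"}:
        raise ValueError("chromosome must be a binary string longer than the revolution field")
    event_field, rev_field = bits[:-N_REV_BITS], bits[-N_REV_BITS:]

    marker = event_field.find("1")  # the leading 1-bit is only a length marker
    if marker == -1:
        raise ValueError("no leading 1-bit: sequence length is undefined")
    thrust_bits = event_field[marker + 1:]
    if not thrust_bits:
        raise ValueError("no thrusting event follows the length marker")

    sequence: List[str] = []
    for b in thrust_bits[:-1]:  # midcourse events: 0 -> impulse, 1 -> boundary-free thrust arc
        sequence += ["c", "i" if b == "0" else "s"]
    # last event: 0 -> Lambert's rendezvous, 1 -> boundary-specified thrust arc
    sequence += ["c", "l" if thrust_bits[-1] == "0" else "t"]
    return sequence, int(rev_field, 2)
```

A coast arc is inserted before every thrusting event, so every decoded sequence starts with $c$ and ends with $l$ or $t$; strings that lack the marker or carry no event bits are rejected here with a `ValueError` rather than repaired.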
Figure 3 shows a sample 10-bit chromosome that can accommodate sequences with a maximum number of thrusting events $N_{T,\max}$ equal to 7 (1 bit for the length marker, 7 bits for thrusting events, and 2 bits for the number of revolutions). According to the convention of the previous paragraphs, the string in Fig. 3 is to be decoded as $(i, s, t)$. By adding the coast arcs that are assumed to precede each thrusting event, the resulting event sequence becomes $(c, i, c, s, c, t)$ with two revolutions.

[Fig. 3 Sample chromosome for variable-length sequences: bit string 0 0 0 0 1 0 1 1 1 0, with fields labeled length marker, thrusting events, and number of revolutions.]

Not all binary strings generated by GA are ready for decoding. For instance, consider the following possible 8-bit strings: 00000000 and 00000001. The former does not have the leading 1-bit, meaning that its length is undefined, and the latter does have the leading 1-bit, but does not contain any thrusting event. This issue can be handled by preprocessing every GA-generated binary string through a binary addition to 1000 before the decoding step, so that there is at least one thrusting event, and the number of revolutions is defined. The sample binary strings would then become 00001000 and 00001001, which represent the sequence $(c, l)$ with 0 and 1 revolution, respectively. The outer-loop solver for the mission design problem will use this model in which every string generated by the GA represents a feasible sequence. The GA solver used is the MATLAB Global Optimization Toolbox [24].

V. Method for the Solution of the Continuous-Time Optimal Control Problem

The continuous-time component of the HOC problem consists of finding the optimal space trajectory that satisfies initial and final boundary conditions and a given mission structure. In [12], two sample problems were solved using GA and NLP for the implementation of the outer- and inner-loop solvers, respectively. In the motorized traveling salesman problem [12], the outer-loop searched for the optimal visitation sequence of three cities. During the search the outer-loop passed each candidate city sequence to the inner-loop, which solved the corresponding continuous-time optimal control problem and returned the cost to the outer-loop. That is, the inner-loop must solve every optimal control problem required by the outer-loop, or the hybrid optimizer may not find the optimal sequence. This motorized traveling salesman problem is, however, an unusually straightforward HOC problem, because even though the inner-loop had to solve optimization problems with different visitation sequences, the NLP problem structure of parameters, constraints, and system EOMs was static, i.e., it did not change.
The only required modification was setting the interception constraints with the locations of the cities corresponding to the given sequence.

The mission design problem is qualitatively different from these sample problems, as the discrete variables now represent events such as impulses, coast arcs, and thrust arcs that, as will be shown, change the structure of the NLP problem. In addition, the number of events in the categorical sequence is not fixed; for example, a mission might consist of a coast arc, an impulse, another coast arc, and a thrust arc. The same mission might be accomplished without the impulse, or with an additional coast arc/thrust arc combination. Conventional implementations of NLP solvers expect a static problem structure. For the dynamic assembly of events required for the NLP discretization of the mission design problem, a scheme that defines events as modules consisting of parameters and constraints is presented. The method assembles the respective events sequentially in time according to the given mission structure [9].

It was noted in the discussion of the solution process for the example problems [12,21] that the continuous-time problem, after transcription, requires an initial guess, i.e., an approximate solution to initialize the NLP problem solver. If the guess is not sufficiently good, the NLP solver will not converge and, as mentioned previously, a suboptimal solution may be found or the HOC solution search could stop. Because mission design problems combine possibly lengthy sequences of events (coasts, impulses, thrust arcs), reliably finding an approximate solution of good quality is difficult. A new method was developed that approximates optimal low-thrust trajectories [9] and addresses these issues. The method, based on GA, approximates optimal control histories by incorporating boundary conditions explicitly using a CP function.
The approximate solution from this method is given as an initial guess to an NLP problem solver to obtain an accurate optimal trajectory.

A. Methods for the Approximate Optimization of Multiphase Impulsive Trajectories

The proposed method to approximate an optimal solution is based on real GA. Real GA still uses binary representations and operations for the evolutionary processes but at a level that is transparent to the user. The approximate optimal trajectory is to be used as the initial guess for a more accurate optimization method, such as direct transcription with NLP. Because experience shows that penalty methods do not handle explicit constraints well in general, such constraints should be handled implicitly whenever possible, i.e., by posing the optimization problem in such a way that it appears to be unconstrained from the standpoint of the GA.

For instance, a compound event named "Lambert's rendezvous" has been defined for trajectories that must satisfy rendezvous conditions using impulsive maneuvers. Its implementation is based on algorithms for Lambert's problem, which consists of determining an orbit that connects two position vectors and has a specified transfer time. Battin [8] and Prussing [25] present algorithms for Lambert's problem using single and multirevolution trajectories, respectively. Such algorithms yield the semimajor axis and eccentricity of the transfer orbit, which is sufficient to allow the determination of the velocity vectors at the beginning and the end of the transfer. Although the original definition of Lambert's problem does not consider rendezvous maneuvers, it is possible to match required velocity vectors by computing the respective ΔVs using the resulting terminal velocity vectors. Because the interception or rendezvous constraints are handled by the Lambert's problem algorithm, the GA can find optimal values for the respective parameters for the states and transfer time without explicitly dealing with any constraint.
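The ΔV computation mentioned above can be illustrated with a short Python sketch. It assumes that a Lambert solver (not shown here, and not implemented in this sketch) has already returned the transfer-orbit velocities at departure and arrival. The function name and argument names are hypothetical.

```python
import math

def rendezvous_delta_vs(v_initial, v_transfer_start, v_transfer_end, v_target):
    """Return the two impulse magnitudes of a Lambert's rendezvous.

    v_transfer_start and v_transfer_end are the terminal velocities of the
    transfer orbit, assumed to come from a Lambert solver that is not shown.
    """
    # First impulse: leave the current orbit and enter the transfer orbit.
    dv_depart = math.dist(v_transfer_start, v_initial)
    # Second impulse: match the target velocity at the end of the transfer.
    dv_arrive = math.dist(v_target, v_transfer_end)
    return dv_depart, dv_arrive
```

Because both magnitudes follow directly from the solver output, the GA only has to choose the transfer time; no explicit rendezvous constraint appears in the search.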
For an entirely impulsive trajectory, the following events constitute the categorical space $Q$:
1) A coast arc $c$ is the most basic event to represent. It is characterized by only one GA parameter, the flight time. Its evaluation consists of integrating the EOMs in Eq. (13) from the initial state of the event for the duration of the flight time. The state at the end of the integration becomes the boundary state of the event.
2) An impulse $i$ is also straightforward; it consists of two GA parameters, the direction and the magnitude of the impulse. Its evaluation applies the vector operation in Eq. (16) to the initial state of the event. The resulting vector is the boundary state of the event.
3) The Lambert's rendezvous $l$ is a compound event consisting of an impulse, a coast arc, and another impulse that is placed at the end of the mission to satisfy given boundary conditions. Although this compound event is equivalent to the sequence $(i, c, i)$, the Lambert's rendezvous event was introduced because the coast arc $c$ and the impulse $i$ events would need explicit constraints to satisfy the given boundary conditions. The only GA parameter required is the transfer time, because the initial and target state vectors for the event are specified. The evaluation of the event yields the terminal velocities and hence the impulses required to perform the maneuver.

Assembling the respective events, in this case, $i$, $c$, and $l$, sequentially in time constitutes a multiphase mission. Given that a few parameters represent each event, a GA individual can characterize an entire mission by collecting the parameters of all the constitutive events. For example, the sequence $q = (q_0, q_1, q_2, q_3) = (c, i, c, l)$ can be represented by the chromosome

$$[\text{Coast Time}^0,\ \text{Direction}^1,\ \text{Magnitude}^1,\ \text{Coast Time}^2,\ \text{Transfer Time}^3]$$

where the superscripts identify which event in the sequence the respective parameter belongs to.
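The mapping between an event sequence and its flat chromosome can be sketched as follows. This is a minimal Python illustration using the parameter counts listed above (one for a coast arc, two for an impulse, one for a Lambert's rendezvous). The names `GA_PARAMS` and `split_chromosome` and the sample values are hypothetical.

```python
# GA parameters per event type, as listed in the text (names are illustrative).
GA_PARAMS = {
    "c": ("coast_time",),
    "i": ("direction", "magnitude"),
    "l": ("transfer_time",),
}

def split_chromosome(sequence, chromosome):
    """Pair each event of `sequence` with its slice of the flat GA chromosome."""
    expected = sum(len(GA_PARAMS[q]) for q in sequence)
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} parameters, got {len(chromosome)}")
    events, pos = [], 0
    for j, q in enumerate(sequence):
        names = GA_PARAMS[q]
        values = chromosome[pos:pos + len(names)]
        # The index j plays the role of the superscript in the chromosome notation.
        events.append((j, q, dict(zip(names, values))))
        pos += len(names)
    return events
```

For the sequence $(c, i, c, l)$ the chromosome has five entries, and the returned list preserves the chronological order in which the events are later evaluated.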
The cost determination of this individual starts by evaluating each of the component events from left to right, successively. Every event has a defined initial state at the moment of evaluation that corresponds to the boundary state of the previous event. The assessment of the last event concludes the evaluation of the individual and precedes the computation of the sequence cost. For the sample chromosome, it would be the sum of all the explicit impulse magnitudes, in this case just $\text{Magnitude}^1$, and the two impulse magnitudes resulting from solving Lambert's problem.

The addition of events that use low-thrust propulsion is more complex because they require continuous control histories. The next section presents a method to approximate optimal trajectories including these types of events.
B. Conditional Penalty Method for the Approximate Optimization of Multiphase Low-Thrust Trajectories

A GA-based approach, the CP method [9], has been developed for the approximate optimization of multiphase low-thrust trajectories. In principle, any evolutionary algorithm such as particle swarm algorithms or genetic algorithms can be used with the CP formulation. Instead of using traditional penalty methods to satisfy explicit boundary conditions, a conditionally valued fitness function is introduced to first find a feasible trajectory, and then refine it into a trajectory that is also optimal. The approximate solution can then be used as an initial guess for a direct solver that converts the continuous optimal control problem into an NLP problem. The method inherits the parallel scalability of GA for use in parallel computing systems [26]. An important feature of this method is that it is not constrained to a particular choice of coordinate system. Also, the method allows considering actual mission features, such as the use of constant thrust, and the thrust acceleration increase as propellant is consumed.

A space mission can be accomplished successfully by thrusting for the entire duration of the flight. However, primer vector theory shows that such a strategy is likely not optimal. Trajectories consisting of multiple low-thrust and coasting arcs, i.e., multiphase missions, can often satisfy similar boundary conditions using less propellant. The conditional penalty method described here optimizes this multiphase type of trajectory without a priori knowledge. It is illustrated by means of a sample minimum-propellant rendezvous using low thrust, whose state trajectory is shown in cartoon form in Fig. 4. Assuming that initial conditions are given, the state point A is specified. For a rendezvous, the state point D is also given or is determinable as a function of time.
Intermediate points B and C are not specified directly but can be obtained by integrating the system with a control history during the first thrust arc and by coasting, respectively. Assuming that the mission events can be evaluated in chronological order, some general observations can be made at this point:
1) Every arc has initial conditions before its evaluation.
2) The thrust arc AB does not have terminal boundary conditions.
3) The thrust arc CD has given boundary conditions.
4) The flight times of each arc, including the coast, are free.

A GA method can then optimize this mission by finding values for 1) control parameter vector $u_{AB}$, 2) control parameter vector $u_{CD}$, 3) flight time of thrust arc AB, 4) flight time of coast arc BC, and 5) flight time of thrust arc CD, that satisfy the rendezvous conditions while minimizing the total thrusting time. The control vectors consist of parameters that determine the thrust-pointing angle history during each thrust arc.

The integration of the EOMs in a thrust arc requires many integration steps, which in turn calls for a rather large number of parameters to represent the respective control history. Such a large number of parameters in the GA model of an individual would require the use of very large populations, reducing the effectiveness of the method. A reduction in the number of parameters can be accomplished by representing the control history at only a few points in time. Interpolating the few control parameters using polynomials such as the Hermite cubic or a Fourier transform [9,27,28] yields the higher time resolution of the control history required for an accurate integration. A design of a chromosome for GA individuals, for the example shown in Fig. 4, would require the following vector $P$ of parameters:

$$P = [t_{AB},\ u_{AB},\ t_{BC},\ t_{CD},\ u_{CD}]$$

The determination of the cost of each individual is accomplished by evaluating the mission phases successively using the parameter values.
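The reduced control parametrization can be sketched in Python. The function below rebuilds a thrust-pointing angle history from a few nodal values with a piecewise cubic Hermite interpolant. Estimating the nodal slopes by finite differences is an assumption of this sketch, as are the function name and the choice of node times; the source only states that a Hermite cubic is one possible interpolant.

```python
def hermite_control(t_nodes, u_nodes, t):
    """Piecewise cubic Hermite interpolation of a control history at time t.

    t_nodes must be strictly increasing. Node slopes are estimated by finite
    differences (an assumption of this sketch, not taken from the source).
    """
    n = len(t_nodes)
    if n < 2 or n != len(u_nodes):
        raise ValueError("need at least two nodes and matching lengths")
    slopes = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        slopes.append((u_nodes[hi] - u_nodes[lo]) / (t_nodes[hi] - t_nodes[lo]))
    # Clamp t to the arc and locate the segment [t_k, t_{k+1}] that contains it.
    t = min(max(t, t_nodes[0]), t_nodes[-1])
    k = 0
    while k < n - 2 and t > t_nodes[k + 1]:
        k += 1
    h = t_nodes[k + 1] - t_nodes[k]
    s = (t - t_nodes[k]) / h
    # Standard cubic Hermite basis functions on the unit interval.
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return (h00 * u_nodes[k] + h10 * h * slopes[k]
            + h01 * u_nodes[k + 1] + h11 * h * slopes[k + 1])
```

An integrator can call this function at every step, so the GA carries only the few nodal values of each control vector while the integration still sees a smooth control history.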
For instance, the first thrust arc has a defined initial point A that, with flight time $t_{AB}$ and control vector $u_{AB}$, is used to obtain point B. Then, point B is used with $t_{BC}$ to find point C at the end of the coast. Finally, the evaluation of the second thrust arc requires the integration of the system starting from C, using flight time $t_{CD}$ and control vector $u_{CD}$, to obtain point $x(t_D)$. Although this approach ensures every individual satisfies the EOMs throughout the trajectory, the specified terminal conditions may not be achieved. An approach to address this issue is to use penalty methods in the fitness function. For example, the constrained optimization problem

$$\text{minimize } J(P) \quad \text{subject to } h_i(P) = 0, \quad i = 1, 2, \ldots, n$$

would become

$$\text{minimize } J(P) + k \sum_{i=1}^{n} \Phi[h_i(P)]$$

where $\Phi$ is the penalty function, and $k$ is the penalty coefficient. Experience shows that using a linear combination to minimize the cost and constraint violations simultaneously is not an effective technique [29]. Because the inner-loop in the HOC problem solver should be able to optimize a trajectory with any structure, an alternate, more robust method is needed. Coello Coello [29] presents a method that handles the constraints based on evolutionary multiobjective optimization (MOO). The method explicitly ranks the GA individuals in the population according to prescribed rules on their feasibility and optimality. The CP method [9] used in this work has a simpler implementation, but it is effective in finding solutions to constrained optimization problems.

In the CP method, the constraint violations are mapped into a scalar distance $d$, i.e., the Euclidean distance between the boundary state $x(t_D)$ and the boundary conditions D. Figure 5 shows two boundary-specified thrust arcs identified by subscripts (1, 2) corresponding to the last thrust arcs of multiphase missions represented by different GA individuals. The last thrust arcs for individuals 1 and 2 terminate at $x_1(t_D)$ and $x_2(t_D)$, respectively.
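A minimal Python sketch of this distance mapping, and of the conditional cost that the CP method builds on it (Eq. 23), is given here. An individual whose distance exceeds the tolerance receives a large infeasibility constant plus the distance; otherwise it receives its propellant-based cost. The function name, the default tolerance, and the default value of the constant are hypothetical choices for illustration.

```python
import math

def cp_fitness(x_final, x_target, propellant_cost, tol=1e-2, big_k=1e3):
    """Conditional penalty fitness of one GA individual.

    x_final is the state reached at the end of the last thrust arc, x_target
    the required boundary state. tol and big_k are illustrative values only.
    """
    d = math.dist(x_final, x_target)  # Euclidean distance to the boundary conditions
    if d > tol:
        # Infeasible: the cost still decreases as the trajectory approaches the target.
        return big_k + d
    # Feasible: the cost is the original metric, independent of d.
    return propellant_cost
```

As long as `big_k` exceeds the cost of any feasible trajectory, every feasible individual ranks ahead of every infeasible one, while the `+ d` term keeps a gradient-like ordering among infeasible individuals.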
If the distance $d$ is greater than the specified tolerance, as is the case for individual 1, the trajectory is considered to be infeasible; the cost assigned to the individual is then the addition of a large infeasibility constant $K$ and the distance $d$, which provides the search with directionality information even from infeasible individuals. If the trajectory is feasible, as results from the last thrust arc of individual 2, the cost assigned to the individual is no longer related to the distance $d$, but instead it is based on the original cost metric, i.e., the amount of propellant used in the entire mission.

[Fig. 4 State trajectory of multiphase low-thrust mission: points A, B, C, D in the ($X_1$, $X_2$) plane, joined by a thrust arc, a coast arc, and a thrust arc.]

[Fig. 5 Feasibility determination of boundary-specified thrust arcs: arcs from $C_1$ and $C_2$ ending at $x_1(t_D)$, with $d_1 > \text{tol}$, and at $x_2(t_D)$, with $d_2$ inside the tolerance region around D.]

This method can be implemented using the following conditional fitness function:

$$J = \begin{cases} K + d & \text{if } d > \text{tol} \\ \displaystyle\sum_{j=0}^{N_s} g(q_j) & \text{if } d \le \text{tol} \end{cases} \tag{23}$$

The CP method requires the tuning of only two parameters. The infeasibility constant can be trivially set to a value that is higher than the expected cost of any feasible solution. In general, the tolerance should be set to low-accuracy values on the order of $10^{-1}$ to $10^{-2}$ for normalized problems, because evolutionary metaheuristics are less accurate than gradient-based deterministic methods such as NLP. High-accuracy tolerances are likely to cause overpenalization, i.e., convergence to a feasible solution with a cost that is significantly higher than the optimal cost.

C. Methods of Solution for the Optimization of a Multiphase Space Trajectory

Direct methods have become a popular choice for trajectory optimization problems because they avoid the difficulties related to the derivation and solution of the TPBVP resulting from the Euler–Lagrange equations of the COV. Unlike GA solutions, trajectories found using NLP satisfy optimality conditions to a specified tolerance, which yields accurate solutions. Schemes for the transformation include Hermite–Simpson collocation [18], Runge–Kutta transcription with parallel shooting [7], and Gauss–Lobatto collocation [19], which vary in their degree of accuracy and robustness. Regardless of the scheme chosen, the NLP transformation consists of discretizing time and modeling the continuous-time state and control variables at several points in time. The system dynamics and boundary conditions are satisfied by defining nonlinear constraints that relate nodal NLP parameters. Creating such an NLP structure of parameters and constraints is an involved process that the mission planner has to perform each time the optimization problem is formulated with a different structure.
It is clear that this approach is not practical for a mission "automaton," because the problem statement for the inner-loop calls for the optimization of trajectories with a variety of structures. Therefore, a modular scheme is needed for the automatic construction of NLP problems during runtime. The basic module in this work is the categorical event, which consists of a set of NLP parameters and constraints [9]. A property of every event is the presence of terminal nodes that allows representing initial and boundary states. This property, along with the definition of continuity constraints on the states during the event transitions, allows assembling the events successively in time regardless of their internal dynamics in a fashion similar to that described by von Stryk and Glocker [10] as shown in Fig. 6. The internal constitution of each event module depends on the type of event it represents [9].

The NLP representation of impulses is implemented as follows:
1) An impulse is represented by a knot, i.e., two nodes corresponding to the states immediately before and following the impulsive maneuver.
2) Two additional parameters are used for the direction and magnitude of the impulse.
3) Five nonlinear constraints ensure that the parameters satisfy the change in velocity and mass described in Eq. (16).

In a similar way, the NLP representation for coast and thrust arcs is the following:
1) A coast/thrust arc is represented by a standard mesh for parallel shooting using an RK integration rule. One RK step per segment has been selected for maximum robustness.
2) An additional parameter is added for the flight time.
3) Five nonlinear constraints on each mesh segment ensure that the parameters satisfy the EOMs in Eq. (13).
4) Additional constraints may be imposed depending on the desired maneuver, e.g., interception or rendezvous.
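The bookkeeping behind this modular assembly can be sketched in Python. The sketch only counts parameters and constraints; it does not build or solve an NLP problem. Several details are assumptions of the sketch rather than statements from the source: each node is taken to carry the five state components, a mesh with `n_segments` segments is taken to have `n_segments + 1` nodes, the number of control parameters of a thrust arc is a free argument, and each event transition is taken to contribute five continuity constraints. All names are hypothetical.

```python
from dataclasses import dataclass

N_STATES = 5  # x in R^5, as in the problem statement

@dataclass
class EventModule:
    kind: str
    n_params: int
    n_constraints: int

def impulse_module() -> EventModule:
    # Knot of two nodes plus direction and magnitude; five constraints for Eq. (16).
    return EventModule("i", 2 * N_STATES + 2, N_STATES)

def arc_module(kind: str, n_segments: int, n_controls: int = 0) -> EventModule:
    # Assumed mesh: n_segments + 1 state nodes, one flight time, optional controls.
    n_params = (n_segments + 1) * N_STATES + 1 + n_controls
    # Five constraints per mesh segment enforce the EOMs of Eq. (13).
    return EventModule(kind, n_params, n_segments * N_STATES)

def assemble(modules):
    """Return total (parameters, constraints) for events placed successively in time."""
    n_params = sum(m.n_params for m in modules)
    n_constraints = sum(m.n_constraints for m in modules)
    # Assumed: one continuity constraint per state component at each transition.
    n_constraints += (len(modules) - 1) * N_STATES
    return n_params, n_constraints
```

Because every module exposes the same interface, the problem size for any event sequence supplied at runtime follows from a single pass over the sequence, which is the property the modular scheme relies on.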
Note that there is no particular event definition for a Lambert's rendezvous, because the NLP problem does not employ any of the algorithms to solve Lambert's problem. When the inner-loop is assigned to optimize a trajectory containing this type of event, it assembles an impulse, a coast arc, and another impulse, and defines the nonlinear constraints required to satisfy the boundary conditions.

By parsing the events provided by the outer-loop, this modular approach allows the inner-loop to set the vector of NLP parameters with suitable bounds and to compute the appropriate nonlinear constraints. The modeler has the discretion to define the upper and lower bounds for the parameters in a way that is appropriate for the problem in consideration; the bounds for the GA and NLP events may or may not be similar as long as the GA search space is a subset of the NLP search space. This is a necessary condition for the GA solution to be a feasible initial guess for the NLP problem. Regarding the terminal constraints, numerical experimentation shows that the NLP solver finds the optimal solution more robustly if the interception constraints, i.e., position matching, are transformed to Cartesian coordinates instead of directly using polar coordinates.

The cost function is a measure of the propellant used throughout the trajectory. In this work, it has been constructed as the sum of the normalized thrusting times corresponding to the thrust arcs and those resulting from the transformation of the impulse magnitudes as defined in Eq. (20). It is important to note that a feasible trajectory may not exist for some values of the categorical variables, e.g., number of revolutions, and the constraints for the continuous variables, e.g., boundary conditions and flight time. For these cases, the inner-loop assigns a large constant cost to the event sequence provided by the outer-loop.

[Fig. 6 Assembly of events in a multiphase trajectory: timeline $t_0 = 0, t_1, t_2, \ldots, t_i, t_{i+1}, \ldots, t_{N_s+1} = t_f$ from $x(0)$ to $x(t_f)$, with event $q_i$ governed by $\dot{x} = f(x, u, q_i)$.]

D. Transformation of the Approximate GA Solution into the NLP Initial Guess

The approximate solution for the trajectory optimization problem found using the conditional penalty method [9] is a good initial guess for the NLP problem solver, because the EOMs are satisfied and the propellant consumption has been minimized heuristically. However, such a solution cannot be used directly by the NLP solver, because the GA and NLP parameter representations are different. For example, the GA model does not use parameters to represent state variables; they are computed during the evaluation of each event taking as initial state the boundary state of the previous event. Therefore, a procedure is required to generate guess values for the respective NLP parameters from the approximate GA solution.

For an impulse, the GA values for the initial state and the impulse direction and magnitude can be used directly as guess values for the NLP parameters corresponding to the state at the initial node and the direction and magnitude of the impulse. The vector operations in Eq. (16) describe the impulse dynamics and generate guess values for the state at the boundary node of the event.

For a coast arc, the GA values for the initial state and flight time can be used directly as guess values for the NLP parameters corresponding to the state at the initial node and the flight time. Performing an integration of the system in Eq. (13) yields guess values for the state parameters at the inner and boundary mesh nodes.

For a thrust arc, the GA values for the initial state and the flight time can again be used as guess values for the NLP parameters for the state at the initial node and the flight time. To obtain guess values for the states at the inner nodes, the few GA control parameters are