Online decision making and path planning framework for safe operation of unmanned aerial vehicles in urban scenarios

As the potential for deploying low-flying unmanned aerial vehicles (UAVs) in urban spaces increases, ensuring their safe operations is becoming a major concern. Given the uncertainties in their operational environments caused by wind gusts, degraded state of health, and probability of collision with static and dynamic objects, it becomes imperative to develop online decision-making schemes to ensure safe flights. In this paper, we propose an online decision-making framework that takes into account the state of health of the UAV, the environmental conditions, and the obstacle map to assess the probability of mission failure and re-plan accordingly. The online re-planning strategy considers two situations: (1) updating the current trajectory to reduce the probability of collision; and (2) defining a new trajectory to find a new safe landing spot, if continued flight would result in risk values above a pre-specified threshold. The re-planning routine uses the differential evolution optimization method and takes into account the dynamics of the UAV and its components as well as the environmental wind conditions. The new trajectory generation routine combines probabilistic road-maps with Bspline smoothing to ensure a dynamically feasible trajectory. We demonstrate the effectiveness of our approach by running UAV flight simulation experiments in urban scenarios.


INTRODUCTION
The use of Unmanned Aerial Vehicles (UAVs) is increasing at unprecedented rates across across a wide spectrum of applications that include surveillance, package delivery, photography, cartography, remote sensing, agriculture, military mis-Marcos Quiñones-Grueiro et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. sions, and more 1 . The FAA passed regulations in 2016 that authorized the commercial use of UAVs. As adoption and use of these vehicles increase, so does the risk of collisions and crashes that can result in a loss of money, time, productivity, and most important, human lives.
There is an abundance of research on the technical aspects of UAV systems: their design, implementation, stability, operation, and fault diagnostics (Moir & Seabridge, 2012). However, a more holistic approach to ensuring safe operations in a heterogeneous airspace is required to address this multifaceted problem that combines component and system diagnostics, system-level prognostics, airspace safety assurance, flight path risk assessment, as well as trajectory planning and replanning, just to name a few (Clothier & Walker, 2006).
Our overall goals are to support safe operations of package delivery UAVs operating in urban environments. A decisionmaking framework is developed to maintain system safety during a UAV mission, i.e., flight from a starting point to a destination going through a sequence of pre-determined way points. Our objectives are to minimize the overall risk of mission failure by simultaneously considering: (1) a number of risk factors (e.g., risk of collisions), (2) uncertainties in the environment (e.g., wind gusts), and (3) the operating state of the vehicle (e.g., degradation in the UAV components). In this paper, we develop a Risk analysis framework to compute and update risks associated with projected UAV flight paths by considering two potential hazards:(1) collision with static obstacles, and (2) depletion of battery charge below a pre-specified safe threshold. In addition, we also take into account the effects of degradation of system components on overall UAV flight performance. In general, multiple components of a system may degrade at the same time, therefore, we apply methodologies for a system-level prognostics that we have developed in past work (Khorasgani et al., 2016).
Ensuring safe UAV flight operations requires careful flight planning and trajectory generation based on a series of fourdimensional waypoints (latitude, longitude, altitude, and arrival time), while also satisfying a set of constraints that assure safety during flight (e.g., distance from nearest obstacle 2m, battery charge 10%, risk threshold  15%), and optimizing a set of performance parameters (e.g., flight time, power consumption). We assume our package-carrying UAV is a low flying craft that typically flies over city streets between large buildings. Our path planning algorithm assumes the existence of a map of possible routes that UAVs can fly within this urban environment. Trajectory generation requires deriving a stable trajectory, defined by a set of waypoints, with the assumption that the UAV, if undisturbed will fly a straight line path between two consecutive way points. The UAV trajectory is smoothed using B-splines at corners, ensuring that the UAV does not have to make sharp turns when it negotiates these corners.
In situations where there are no ambient disturbances, such as wind direction changes or wind gusts, we assume the UAV chooses the minimum distance path between its start and destination points because that consumes the least battery power (energy), which, therefore, involves the least risk. However, wind gusts or change in environmental conditions in the vicinity of where the UAV is flying, may result in a change in flight risk causing the risk values to exceed specified thresholds. In extreme circumstances, the UAV may decide that it is too risky to proceed. In such situations, the UAV executes emergency landing procedures to prevent crashes. We do not deal with contingency management situations in this paper.
The rest of this paper is organized as follows. Section 2 reviews the literature on UAV safety analysis. Section 3 outlines the proposed framework of our work. Section 4 provides a description of our octocopter UAV system model and the test bed for our experimental studies. Section 5 presents our online decision making and path planning framework. Section 6 presents the optimization methods used to solve the trajectory re-planning problems. The results of our experimental case studies run on the octocopter system under different scenarios are presented in Section 7. Finally, Section 8 presents the conclusions of our work and directions for future research.

BACKGROUND: SAFETY ANALYSIS, HEALTH MAN-AGEMENT AND ONLINE PLANNING FOR UAVS
Past work on UAV safety analysis has focused on collision avoidance in the civil airspace for commercial aircraft with recommendations for developing and advancing Traffic Collision and Alert Systems (TCAS) that can detect and minimize these collisions (Kuchar, 2005). Clothier & Walker (2006) contend that overall safety requirements for UAV systems should be the same as that for human-piloted aviation. They developed a simple simple fatality model to illustrate the impact of different safety objectives on the design and operation of UAV systems. They used comparative examples to highlight the importance of the nature of risk exposure to the type of operation being performed.
Traditionally, contributions from the field of robotics have evaluated risk without considering the state of the vehicle. For instance, data driven approaches to risk analysis have taken into account uncertain weather conditions (Rubio-Hervas et al., 2018) and measurement uncertainty (De Filippis et al., 2011) together with the vehicles kinematics, but they have ignored the fact that a UAV's operating conditions can change during flight. On the other hand, contributions from the field of control systems incorporate trajectory planners that consider the dynamic model of the vehicle (Brown & Rogers, 2016), but do not take into account its state of health for risk analysis. However, in urban scenarios, it is important to consider the interactions between the system's state and the varying environmental conditions (Coutinho et al., 2018).
Computing risk and safety is another area of research that is a key component of ensuring safe UAV operation. Lin & Shao (2020) compute the expected level of safety (ELS) of a path as a function of mean time between failures, the area of exposure in square meters (assuming a ground impact), the population density, and accident severity. Ancel et al. (2017) and Ancel et al. (2019) present a real-time risk assessment framework that quantifies the risks to bystanders for operations in populated areas. They consider three separate modules: • a Probabilistic Graphical Model based on Bayesian belief networks (BBNs) that estimates the likelihood of predetermined mishaps using the aircraft health systems status provided as part of the aircraft telemetry downlink. The BBNs are derived from a FMEA of the system and they are parameterized based on the dynamic aircraft health data; • an Off-Nominal Trajectory and Impact Point Prediction module that provides an expected impact point for the modeled mishap with an associated uncertainty bound; and • a Casualty Estimation module that employs the impact point and associated uncertainty as well as population distribution data to evaluate injury or fatality to humans on the ground.
This approach, however, does not formally address the dependence between the probability of collision with obstacles and the state of health of the UAV. Primatesta et al. (2019) proposed an off-line and on-line path planning strategy that considers a dynamic risk map based on the probability of casualties caused by the loss of control of the vehicle. Their path planning strategy only considers the kinematic constraints associated with the vehicle. Schacht-Rodríguez et al. (2018) present a path-planning strategy that takes into account the available battery power, while also considering the degradation of the battery. Krishnan & Manimala (2020) consider both the minimal path length and the minimal risk of collision for designing a path-planning algorithm suitable for real-time applications. They, however, do not consider the state of health of the UAV in the optimization problem. Benders et al. (2020) propose an adaptive path planning strategy that takes into account performance uncertainties associated with the parameters of the kinematic model of the UAV.
Real-time path planning for UAVs in the context of obstacles as dynamic traffic and geofences is explored in Chatterjee et al (2019 (Aggarwal & Kumar, 2020). Those approaches allow considering the kinematic constraints associated with the UAV in the path planning method. However, they do not consider the dynamic constraints associated, for example, with faults or the degraded behavior of one or more components of the UAV.

PROPOSED FRAMEWORK
Earlier state-of-the art methods discussed do not take into account the dynamic model of the UAV, the degradation of its components, the obstacle map, and the environmental conditions, and how all of these together may affect the flight performance. The proposed novel approach presented in this work, integrates multiple adverse effects to define the likelihood of mission failure, and, when needed, executes path replanning as a function of all the above-mentioned elements. Probability of mission failure is calculated based on both the remaining useful life (depending on component-level performance measures, such as the battery state of charge), and the probability of collision (which also depends on the health of the components and the environmental conditions, i.e. the trajectory tracking error will worsen as the motors degrade and/or under increased wind conditions). An analytic frame-work is implemented, that tracks the degradation rate of individual components, such as the battery and motors from mission to mission, to update the dynamic model of the UAV. Based on that model, we continuously update the predicted remaining trajectory of the UAV to decide when re-planning is required. We make the following assumptions • We can estimate the state variables and parameters of the UAV and its components within an error bound; • An online fault detection and diagnosis module is available; • The obstacle map information is available, this includes a database with semantic geo-referenced data, such as terrain and land-use maps, obstacles, and airspace restrictions; • The environmental conditions do not change dramatically during flight, i.e., wind gusts might be present but their intensity does not change during flight; and • There are no communication issues observed between the UAV and a primary cloud-based station.
In future work, we will relax some of these assumptions to consider conditions and a framework that includes more realistic scenarios. WE should also clarify that the objective of the proposed approach is to assess the probability of mission failure, and when that falls below a pre-specified threshold, to re-plan and derive safer trajectories. Therefore, we define the risk of mission failure as: the product of the likelihood of mission failure and the consequences of mission failure. With this proposed framework, we can seamlessly integrate with modules for impact point prediction and casualty estimation developed by (Ancel et al., 2017(Ancel et al., , 2019 and (Primatesta et al., 2019).

UAV SYSTEM DESCRIPTION
In this section, we described the modeling of an octocopter UAV with its components as well as the implemented control system.

UAV Model
The octocopter airframe dynamics model is based on Newton-Euler equations of motion for a rigid body (Powers et al., 2015). The octocopter is composed of a central hub with 8 arms extending radially and a motor at the end. The derived body forces considered are where F b 2 < 3 is the resulting force acting on the body frame, F M 2 < 3 is the resultant force generated by the motors, F D 2 < 3 is the drag force resulting due to the movement of a UAV through the air, m is the mass of the UAV, g is the gravity acceleration, R IB 2 < 3⇥3 is the rotation matrix from the inertial frame to the body frame, and e z = [0 0 1] T .
The rotation matrix is calculated based on the Euler angles [ , ✓, ] (Mahony et al., 2012). The order of rotations from the inertial frame to the body frame considered in this work is yaw (Z)-pitch (Y)-roll (X) assuming the UAV is moving forward in the positive X direction.

Motor forces and torques
The eight rotating motors including propellers are used to generate motor forces F M and torques M M . The disposition and rotation of the motors is shown in Figure 1. For each motor i, the force and torque generated are given by where c T is the coefficient of thrust and c Q is the torque constant. Both parameters can be easily estimated from static thrust tests (Mahony et al., 2012). The net force applied to the airframe is the summation of the forces generated by the motors. The net torque acting on the octocopter arises from the aerodynamics (the combination of the produced rotor forces and air resistances) applied to the N-rotor vehicle.

Drag force
Form drag is the most common and easily modeled aerodynamic effect. Form drag arises due to the movement of a reference area through a fluid. The general expression of the form drag force is where ⇢ is the air density, C D is the drag coefficient, A the reference area that is perpendicular to the velocity of the object (v b ). The torque generated due to drag is considered negligible in this work.

Wind effect
Wind is a major disturbance source for light aerial vehicles, such as octocopters. The ability to remain stable and minimize wind disturbances is a requirement for the use of UAVs in urban scenarios. The wind velocity vector is defined by where v w,nom is the nominal component and v w,rand is the stochastic component. A valid representation of the latter term can be generated by using Dryden's turbulence model. The wind affects the translation of the octocopter by modifying the drag force (Gougeon et al., 2018). The change in the moment dynamics caused by the wind acting on the propellers of the motors is assumed to be negligible in this work.

Propulsion system
The propulsion system of the octocopter is formed by a set of BrushLess DC Motors (BLDCM), an Electronic Speed Controller (ESC), and a Lithium Polymer battery. Each motor is powered by the battery through the ESC, which controls the angular speed through a PWM signal. The dynamics of the ESC are neglected in this work. We assume, therefore, that the voltage supply of the battery and the current consumed by the motors are averaged with respect to the duty cycle value of the control signal.

Motor model
A Brushless DC motor is a type of permanent magnet synchronous motor driven by a DC supply voltage. The simplified mathematical model can be described by the following equations (Schacht-Rodríguez et al., 2018) where R e q = 2 3 P 3 j=1 R j is the equivalent electric resistance of the coils, K e is the back electromotive force constant, ! i is the angular velocity of the ith BLDCM, T f is the static friction torque, D f is the viscous damping coefficient that allows to estimate the dynamic friction torque (D f !), J m is the inertia of the BLDCM, v DC is the input voltage control signal, i c is the current demanded from the battery pack, and T load represents the torque load generated by the propellers. The relationship between the torque load and the angular velocity is non-linear and can be defined from experimental data according to the dimensions and material of the propellers.

Battery model
The dynamics of a battery cell in the electrical domain can be modeled using an equivalent circuit representation (Plett, 2015). The dynamic equations that characterize the battery behavior are given by: where Q represents the total capacity of the battery, i c the input current, R d and C d are the diffusion resistance and capacitance, i d the current going through the diffusion resistance, R 0 represents the internal resistance, V ocv is the open circuit voltage, and V out is the output voltage of the battery. The open circuit voltage depends non-linearly on the state of charge of the battery. The function that characterizes such relationship can be experimentally estimated and approximated as a polynomial by using charge and discharge data.

Navigation system
The standard navigation system of a UAV carries an inertial measurement unit (IMU) and a global positioning system (GPS), with additional augmentation provided by augmented by imaging systems that provide additional information, especially when the GPS signal is lost in urban environments (Conte & Doherty, 2008). A typical IMU includes a threeaxis rate gyro, a three-axis accelerometer, and a three-axis magnetometer.

Control scheme
The octocopter's hierarchical control scheme is shown in Figure 2. This control approach is commonly used for multicopters since it allows for stabilization of the position and orientation of the octocopter with respect to a trajectory. The low level controls the vehicle attitude, and the top level controls the position along a trajectory, together forming nested feedback loops.  The reference trajectory for the octocopter can be defined for each time step in terms of position and yaw angle [x t , y t , z t , z t , t ].
Altitude is controlled through a Proportional Integral Derivative (PID) controller based on the reference input. This controller generates the required force in the z direction. The position controller block estimates, based on the current position of the vehicle and yaw angle, the reference for the pitch (✓) and roll ( ) angles. The attitude controller block is formed by three PID controllers for pitch, roll, and yaw. This block generates the required torque in each direction. The control allocation block transforms the required torques and force into a reference voltage for each BLDC motor of the octocopter based on the maximum voltage that the battery can provide. Finally, the motor block generates the angular velocities according to the motor's dynamics.

Degradation of Components
The state of health (SOH) of a component describes its actual physical condition in relation to its nominal working condition. Typically, the SOH is described by one or more parameters that are linked to the behavior of the system. A function that aggregates the deviation of these parameter values from their nominal values defines the SOH of the component. Typically, the degradation of the components of the octocopter increases monotonically from mission to mission. The SOH of the components of the octocopter affect its overall performance.
BLDC motors are susceptible to mechanical degradation in the form of bearing wear, and less common electrical degradation in the form of contact corrosion and insulation deterioration (Abramov et al., 2014). An increase in winding resistance results in increased power consumption under load (and decreased consumption under no-load conditions), as depicted in the FMECA chart provided in Kulkarni & Corbetta (2019). The winding resistance change is associated with the electric resistance of the coils (R e q). Mechanical degradation is associated with the static and dynamic friction parameters (T f and D f ) of the motor model presented above. The SOH of the battery decreases with time due to aging mechanisms caused by its charge/discharge cycles, and damage due to deep discharges. The aging of the battery causes an energy loss leading to capacity loss (a reduction of battery capacity), and the increase of the battery impedance (power fade). Capacity loss is evidenced in the battery model by the reduction of the total capacity parameter Q, and the power fade by the increase of the internal resistance R 0 (Daigle & Kulkarni, 2016;Sierra et al., 2019).

DECISION MAKING AND PATH PLANNING APPROACH
As discussed earlier, we develop a risk-based framework for maintaining system safety during a mission by considering a number of risk factors along with uncertainties in the environmental and operational state of the vehicle (UAV). In this paper, We define mission failure, i.e., the loss of the vehicle due collisions with other objects (primarily static objects, such as buildings or towers or moving objects, such as other UAVs) or due to a crash to the ground. The primary causes of mission failure are: (1) Figure 3. Online re-planning framework system faults that degrade system performance. Faults may be associated with different components of the UAV, such as the battery, the motors, and the Inertial Measurement Unit.
We assume that the UAV system has a mission plan, defined by a set number of way-points = {c 1 , c 2 , ..., c n } and a desired cruise speed at the start of its mission.The risk associated with the current flight trajectory, given by the current plan is continually computed taking into account the effect of current conditions, which may include adverse situations, such as a wind gusts (weather-related condition) or component degradation and failures (faults). When the risk of the current trajectory increases above a pre-specified threshold, the online re-planning algorithm depicted in Figure 3 is initiated. The algorithm includes four processing modules: (1) state estimation, (2) prediction, (3) mission failure assessment, and (4) trajectory replanning. The state estimation module determines the system state and parameters and the model uncertainty. The prediction module considers the current state, system parameters, model uncertainty, and information about the environmental variables as initial conditions, and performs stochastic simulations of the remaining trajectory for the current mission plan. The mission failure assessment module computes the risk of mission failure for the remaining trajectory, and initiates re-planning if the predicted risk exceeds the pre-specified threshold. Finally, the trajectory re-planning module, if invoked, generates a new trajectory for the remaining mission, so that the risk of mission failure is reduced to an acceptable value. If such a trajectory cannot be found, the re-planning algorithm will attempt to find a trajectory that results in an emergency landing to avoid a crash.
Risk computation by stochastic simulation and the replanning algorithms are computationally expensive, and typically, there is not enough processing power on a small octocopter to compute them in a timely manner, Therefore, we assume a two part computational architecture for our UAV safety analysis system. First, an on-board processor performs the monitoring, fault detection and isolation, and state estimation tasks.
The computational intensive predictions of mission failure and replanning are performed off-board using a cloud-based infrastructure that communicates continually with the UAV system. The rest of this section discusses the four modules in greater detail.

State estimation
In this work, our state estimation primarily focuses on the state variables associated with the vehicle flight trajectory, i.e., the height, attitude, angular velocity, linear velocity and position of the octocopter system in 3-D space. In addition, we have designed state estimators for the primary components of the UAV that need to be monitored for system health, for example, the state of charge and the output voltage of the battery.
We estimate the attitude, velocity, and position of the UAV using complementary filtering techniques (Mahony et al., 2012), or using extensions of Kalman filtering methods (Burri et al., 2015;Yang et al., 2017;Al-mashhadani, 2019). Inertial and visual sensor fusion algorithms can be employed if imaging sensors are available (Corke et al., 2007). State estimation of the motors can be performed using extended Kalman filters. A number of methods, such as unscented Kalman filters and particle filters have been proposed for state and parameter estimation of the battery (Plett, 2015;Sierra et al., 2019). In this work, we consider the use of an Sigma-Point Kalman filter (Merwe et al., 2004).

Sigma-Point Kalman filter
The Sigma-Point Kalman filter (SPKF) is a second-order nonlinear state estimation method based on probabilistic inference theory. SPKF approximates the mean and covariance of the posterior distribution of the state variables by using a set of sigma-points. Each Sigma point represents a random state variable formed by concatenating the original state and noise variables as follows: x a n = [x T n w x T n w y T n ] T . The sigma points n 1 are obtained in a deterministic fashion using n = {x a n x a n + p P n x a n p P n }, where n represents the set of Sigma points, P n is the error covariance matrix, and is a parameter that determines the spread of the Sigma points. P n is formed by three subdiagonal blocks based on the state error covariance matrix P x , the measurement error covariance matrix R, and the process noise covariance matrix Q. Both R and Q are parameterized offline based on sensor error analysis Gross et al. (2010).
The Sigma points are propagated to estimate the state variables, which are later updated according to the measurements obtained from the system inputs and outputs. Full implementation details can be found in Merwe et al. (2004). We use the unscented Kalman filter as our Sigma-point selection scheme, and describe the equations used to estimate the position and the battery state of charge next.

Inertial navigation equations
The navigation state vector is formed by the local Cartesian position and velocity as well as the roll, pitch, and yaw angles. The IMU measures the three axis accelerations and angular rates. Since x = [x, y, z, v x , v y , v z , , ✓, ], the navigation update equations in continuous time are written as follows 2 4ẋ ẏ where R be ( , ✓, ) is the rotation matrix from the body frame to the inertial/world frame known as Direction Cosine Matrix. The IMU measures the acceleration and angular rates [a x , a y , a z , p, q, r] which are considered the inputs of the filter, and the GPS measures position and velocities [x, y, z, v x , v y , v z ], which are considered the output measurements Gross et al. (2010).

Battery model equations
The mathematical model of the battery is given by the set of equations (7) (9). The relationship between the battery state of charge (SoC) and the open circuit voltage (OCV) for a cell is shown in Figure 4. The measured current is considered the input of the filter and the measured output voltage is considered the output.

Prediction
The prediction module computes the states of the UAV for the remaining of the reference trajectory at each time step. Uncertainty propagation is performed by sampling from the distribution of the state variables as the trajectory is completed.
Considering the state space model of the UAV y n = h(x n , u n , # n , w y ), where x n represents the set of variables that characterize the dynamics of the UAV, u n is the inputs of the system, # n is the set of degrading parameters, w x and w y capture the system The distributions of w x and w y are required to propagate the state of the UAV along the remaining trajectory. The distribution of w y is estimated offline based on the precision of sensors. The distribution of w x is estimated online based on the state estimators designed for the UAV. More specifically, the process noise associated with the position of the UAV is estimated. This allows us to compute the probability of collision with a given obstacle. The SPKF, discussed previously, has been used in previous work to estimate the position error of the UAV (Merwe et al., 2004). Similarly, the SPKF has been used to estimate the state of charge and voltage of battery packs (Plett, 2006). We, therefore, use the SPKF for process noise estimation of the states of the UAV position and attitude, as well as for the battery. Another component that affects the system dynamics is the wind vector. We assume this information is provided by an external meteorological (i.e., weather forecasting) center. This provides the parameterization of equation 4.

Mission failure assessment
Risk of mission failure is performed for a predicted trajectory by taking into account the uncertainties associated with the state of the UAV and the measurements. This computation can be performed for the current trajectory, starting from the current time point forward, or a new trajectory generated by the replanning algorithm from the current time point forward till the end of the mission. In this paper, we compute the risk of mission failure as a product of two factors: (1) the risk of collision and (2) the risk of running out of battery power. The risk of mission failure is defined as follows: where f mf ail ( ) is the risk of mission failure, F c ( ) is the maximum risk of collision with an obstacle, and F rul ( ) is the maximum risk of crashing because the UAV runs out of battery power along the trajectory . We simplify the solution process for a multi-objective optimization problem, which may not be tractable, replacing it with the worst case scenario risk calculation for the two factors of mission failure. The maximum risk of mission failure is used to determine whether re-planning is required based on a pre-defined failure threshold ⌧ fail 2 [0, 1]. We describe how each term of equation 16 is computed next.

Risk of collision with an obstacle
The total risk of collision of a mission is accumulated along a specific trajectory that has a set of known K static obstacles in the environment in the vicinity of the trajectory. The average risk can be calculated according to: where f ( , obst k ) is a function that calculates the probability of collision of the UAV with obstacle k when it follows trajectory . For convenience, we convert the trajectory of the UAV into a discrete form by considering a sampling time. Thus the total risk can be expressed as: where (n) represents the position of the UAV at time point n, and f ( (n), obst k ) takes the following form where D n obst k represents the distance between the UAV and the obstacle k at time n of the trajectory, and dsaf e k represents a minimum distance separation that must be kept between the UAV and the obstacle to avoid an imminent collision. We assume the position of the UAV is a stochastic variable whose distribution is estimated through the state estimator described in Section 5.1. , D obst k is also a stochastic variable normally distributed whose standard deviation can be inferred from the position variance (estimated through the SPKF). Equation (19) evaluates the inverse cumulative distribution function of D obst k at dsaf e k .
The position of the UAV is represented by the coordinates of its center of mass. Assuming a two dimensional map, we consider a set of static obstacles that can be represented by rectangles defined in terms of their four corner coordinates on a 2D map of the UAV spatial environment, i.e., obst k = (x i , y i ), i = 1, 2. For a realization of the predicted future trajectory, the distance is computed as d obst k = |⇠ ⌫ k |, where ⇠ is a vector of the coordinates of the UAV and ⌫ k is the closest vertex of obstacle k. The distance threshold dsaf e k depends on a number of factors that we discuss next.
The total risk of collision provides the basis for offline path planning. We reiterate that re-planning is triggered because we predict that the minimum separation distance required between the UAV and the closest obstacle will be violated for a segment (s) of the remaining trajectory with a computed risk value that exceeds a threshold.This triggers an evaluation of multiple alternate trajectories, and selection of the one that is considered to be the safest in terms of risk and does not violate the minimum separation distance. Therefore, the objective of re-planning is to update the expected risk of the chosen trajectory, as a result of which, we also reduce that probability of collision to an acceptable value.
The maximum risk (and, therefore, probability of collision) for a trajectory is computed with respect to the closest obstacle using the equation (18): As a summary, in this work, F max c is equated to Risk max , and determines the objective to be minimized by replanning.

Defining a distance threshold
The distance threshold dsaf e k can be established in multiple ways taking into account different factors like the dimensions and orientation of the UAV, errors introduced in the discretization of the obstacle map, the velocity of the UAV as well as communication times. Static operational safety bounds and dynamic operational safety bounds are two concepts implemented for UAV operations. The key difference in implementation of a static operational safety bound is it does not need any specific knowledge about the UAV operating in the airspace. Dynamic operational safety bounds use the velocity information for the UAV and wind data for safety bound calculation (D'Souza et al., 2016;Zhu & Wei, 2016).
Recently, Hu et al. (2020) proposed a dynamic operational safety bound method that also includes more information about the UAV (such as performance data and system updating information taking into account communication standards). A warning zone for collisions form the proposed dynamic operational safety bound. The collision zone is defined by considering the minimum stopping distance required for the UAV to decelerate until it can hover. The warning zone takes into account the sampling time to expand the collision zone. Moreover, the bound considers the vectorized shape of a UAV with different speed components that are parallel and perpendicular to the heading direction. This shows how the isotropic design reduces the capacity of the airspace. It is also interesting to see that their proposed dynamic operational safety bound is generally longer in the direction parallel to the heading direction. This can be attributed to the large speed component along the heading direction. he DA bound proposed by (Hu et al., 2020) can be integrated into the decision-making and path-planning framework pre-sented in this paper. This will take into account more realistic scenarios. However, to show the feasibility of the proposed framework, a static bound was used for the experiments reported in this paper.

Risk of crashing due to loss of battery power
The risk of running out of battery power is computed for an existing trajectory or a new trajectory by considering the uncertainties associated with the state of charge of each battery cell in the battery pack. This is then translated to the remaining useful life (RUL) of the UAV as a function of the state of charge of the battery.
The RUL estimation encompasses two distinct but related problems:(1) estimating the current system state and the degradation rates of individual components; and (2) predicting future system performance to derive a system RUL function. To compute the RUL, we follow an analytic framework that tracks the degradation rate of individual components, such as the battery and motors from mission to mission. Based on the degradation parameters estimated at the beginning of a mission, we use methods to compute and predict the state of charge over a specified trajectory. The system-level prediction to compute the system RU L n distribution function at time instant n defined by: where t is the sampling time of the system, n 2 N represents the sampling step, and EOL n represents the end of life of the system defined by: where T (} z ) is the following system performance threshold function R(} z ) denotes a set of performance constraint functions that map the performance of the system to a Boolean domain [0, 1]. A value of 1 implies that the constraint is satisfied, and a value of 0 implies that it is not. This set of constraints is defined based on the system degradation model but a single constraint is defined in our case based on the state of charge of the battery.
The risk of not completing the mission is calculated at each time step n as: where t end = N ⇤ t represents the time planned for the UAV to reach the final way-point. We calculate F max rul as the maximum of F rul .
Note that the probability distribution of RU L n , as well as the accumulated and maximum risk (Risk tot n , Risk max n ) associated with the current trajectory are calculated considering the distribution of the system states, nominal and degrading parameters, p(x n , # n , ↵ n |y n ), the distribution of future inputs to the system, and the distribution of the system model uncertainties.

Trajectory re-planning
In this work, our proposed re-planning strategy does not address the UAV take-off and landing phases. Our re-planning strategy is only applied to in-flight way-point adjustments. We assume the UAV flies at a constant altitude (constant z value), therefore, the flight coordinates of interest for optimization are the two continuous variables that define the x-y motion of the UAV. Once the UAV arrives at the landing spot way-point in the x-y plane, we consider that it enters into a different operating mode, and a different planning strategy is required to implement safe landing. This may involve other sensors, such as visual sensors (Lusk et al., 2019).
The trajectory re-planning module is activated when the mission failure assessment module determines that the risk of mission failure is higher than a pre-specified threshold ⌧ fail 2 [0, 1]. Trajectory re-planning is solved as an optimization problem that is constrained by the time to failure along the trajectory. The solution of the re-planning problem depends on whether the risk of mission failure is caused by the collision with an obstacle or the UAV running out of battery power. We formalize the solution to each problem next.

Trajectory re-planning to reduce the risk of collision with an obstacle
The trajectory re-planning routine is activated when the risk threshold ⌧ fail is violated for a segment of the remaining trajectory (F max c ( ) > ⌧ fail ). The re-planning optimization problem can be formalized as follows: where ij represents a trajectory between way-point c i and c j , ij represents the set of possible trajectories between the two way-points, the function f max c estimates the maximum probability of mission failure along ij . A sufficient stopping condition for this optimization problem is f max c ( ij ) < ⌧ saf e with ⌧ saf e 2 [0, 1] representing a threshold for the maximum allowed probability of collision. In the context of dynamic systems, the optimization problem above is subject to the following constraints: • Initial state of the system: wherex,#, andŵ x represent the estimated state, degradation parameters, and model uncertainties at n = 0.
This constraint defines the initial condition of the system to solve the optimization problem.
• Dynamics of the system: This constraint means that the system dynamics must be satisfied between any two way-points i and j to be flown.
• Time constraint: This constraint means that a solution must be found in the time frame defined by T avail . We discuss the selection of T avail in the experimental section of this paper.
Note that the latter constraint is defined by the time to collision in the trajectory, which is estimated using the prediction module. Therefore, a trajectory that reduces the risk of collision below an acceptable threshold level must be found in the available time before the collision occurs. Moreover, the expected communication delay has to be considered with the estimated time to collision. This dictates the parameterization of the optimization algorithm to satisfy computational time requirements. Alternatively, if the estimated computation time exceeds the allowable computation time, an alternate landing trajectory must be generated. In other words, if the optimization problem cannot be solved in the given time, the UAV must perform an emergency landing at the closest available landing spot. To accommodate these constraints, we propose a Differential Evolution optimization algorithm Storn & Price (1995) to formalize the path replanning problem as an anytime optimization algorithm. Details of our proposed approach are presented in the next Section.

Trajectory re-planning to reduce the risk of running out of battery charge
The trajectory re-planning routine is activated when the risk of running out of battery power is higher than a pre-specified failure threshold (F max c > ⌧ fail ). Re-planning, in this case, involves finding the nearest available emergency landing spot from the current position of the UAV, and deriving a dynamically feasible trajectory to it. For emergency landing, a set of available landing spots and their coordinates should be known beforehand.
The mission time to failure, in this case, depends on the available battery power. Therefore, T avail is approximated by the prediction module. The re-planning is performed as a twostep procedure. In the first step, the shortest trajectory between the current position of the UAV and a chosen landing spot is computed by using the probabilistic roadmap path planning algorithm. In the second step, this trajectory is transformed into a smooth and dynamically feasible trajectory by using the B-spline curve approximation method. Additional details on these two methods are presented in the next section.

OPTIMIZATION METHODS
We propose to use different methods to solve the two replanning tasks presented in the previous section. First, Differential Evolution to reduce the risk of collision with obstacles, and second, a combination of Probabilistic Roadmap path planning with the generation of polynomial trajectories using B-splines to reduce the risk of running out of battery power. We explain the algorithms and implementation details in the following sections.

Re-planning to reduce the risk of collision
We solve the re-planning optimization task that was formalized as equations (25) (28) as an inverse problem using Differential Evolution (DE). DE is an optimization algorithm that is applicable to "non-differentiable continuous space functions", (Storn & Price (1995)). A number of studies have compared meta-heuristic algorithms like DE with classic path planning methods, or convex optimization approaches (Aggarwal & Kumar, 2020). The main advantage of this algorithm is that the evaluation of the candidate solutions can be parallelized, and after the first iteration of a single member of the population, we can measure approximately the time it takes to complete an epoch and adjust the maximum number of iterations allowed (based on the time available for replanning).
The goal of this re-planning task is to update the planned trajectory by modifying the reference coordinates that the UAV must follow. Therefore, the decision variables for the DE method are C intermediate reference way-points ij = {c i+1 , c i+2 , ..., c i=j that define the trajectory between the current location of the UAV and the final way-point. In this work, we assume the UAV flies at a pre-defined altitude, therefore, the flight coordinates of interest are the two continuous variables c i = x i , y i . The general idea behind DE is to generate new solutions by varying the properties of the members of the population by repetition of three operators: (1) mutation, (2) crossing, and (3) selection till convergence to an optimal solution (Das et al. (2016)). The solution model is usually expressed with the notation DE/p/q/r (p: type of the mutation vector, q: number of pairs of solutions to be used in order to vary the current solution, and r: distribution function that will be used during the crossover operation). We used the configuration DE/rand/1/bin in this paper, where a random vector (p = rand) and the difference of a vector pair (q = 1) were applied to generate a mutation vector. The binomial crossover model (r = bin) is used for generating a crossover between two solutions. Besides this configuration, the parameters that are adjusted for the DE method are: (1) a matrix that holds the upper and lower bounds for the values that each decision variable can take; (2) 2 < 2C⇥2 , the population size that defines the number of candidate solutions to the optimization problem ⌘; (3) the mutation factor µ; (4) the crossover rate ; and (5) the number of iterations (epochs) for the algorithm ↵.
We set a crossover rate of 0.9 to ensure a low influence of the current population solution on the newly generated ones, and a mutation factor of 0.6 to allow for the exploration of new solutions. The stopping criteria established are based on the optimization function threshold ⌧ saf e and the maximum number of iterations. If the risk threshold is satisfied the algorithm stops. The maximum number of iterations on the other hand is defined by a percentage of the time available to find a new solution T avail . One iteration of the algorithm must be run to define the approximate time it takes the evolution of one generation, such that the maximum number of iterations can be defined properly. Another advantage of this algorithm is that the evaluation of the candidate solutions (which is the operation that demands most computational power) can be run in parallel. The population of individuals is set to 20.
The evaluation of the candidate solutions is implemented as a dynamic simulation of the UAV for its remaining trajectory to derive f max c ( ij ). The larger the trajectory to be re-planned the longer it will take the simulation to converge to "good" solutions. Since we are implementing an online re-planning problem, it is much more efficient to re-plan the segments of the trajectory associated with high risk of collision¿ in other words, we find a new trajectory from the current point to the next "safe" way point. If the risk of collision is high for multiple segments of the trajectory the re-planning can be applied sequentially to each segment starting from the one closest to the current position of the UAV. In this way, the trajectory re-planning can be dynamically generated while the UAV is flying, but the overall solution, while safe, may not be globally optimal.
6.2. Re-planning to reduce the risk of running out of battery power The trajectory re-planning routine to reduce the risk of running out of battery power is a three-step procedure. First, a dynamically feasible path between a future reference coordinate in the remaining trajectory and an emergency landing spot must be computed. To accomplish this task, we first generate the shortest collision-free trajectory using Probabilistic Roadmap path planning (PRM) algorithm (Kavraki et al. (1996)) and transform that trajectory into a dynamically smmoth trajectory using B-splines (Farin (2014)). Second, we evaluate each possible trajectory by performing a stochastic simulation considering the current UAV and environmental conditions. We select the shortest trajectory that meets the condition F max c < ⌧ fail . A probabilistic roadmap is a graph structure, where the nodes represent collision-free way-points and the edges represent a realizable path between these way-points (Kavraki et al. (1996)). The PRM graph generates random samples from the configuration space and tests them for whether they represent feasible paths. The graph is created by connecting the set of random samples typically using the k-nearest neighbors method. Only the connections smaller than a predetermined distance in the free space are kept. This is the first phase of the algorithm, which is executed off-line based on the navigation map. In the query phase, the start and goal coordinates are connected to the closest nodes of the graph, and the path is obtained by using Dijkstra's shortest path algorithm.
The PRM algorithm extends the capability of similar algorithms, such as rapidly-exploring random trees because it does not generate a new graph when a new node is discovered. One advantage of this algorithm is that it is provably probabilistically complete (as the number of samples increases without bound, the probability that the algorithm will find a path if one exists approaches 1). A thorough evaluation of this algorithm is provided in Hsu et al. (2006). B-spline curves are used for generating a smooth trajectory based on the way-points derived by the PRM method for a desired cruise speed. A B-spline curve can be defined as a piece-wise curve, where each component is of degree k. This method has been widely used in robotic applications because of its desirable properties of convex hull and maintaining continuity up to the k + 1 derivative for a curve of degree k. Specifically, we considered cubic B-splines (Farin (2014)) with a desired cruise speed of 1 m/s.

EXPERIMENTAL STUDY
We demonstrate the feasibility of the proposed online re-planning framework for two different re-planning scenarios. For these experiments, we operated the GPS and IMU systems at sampling rates of 1 Hz and 100 Hz, respectively. The corresponding accuracy of the position and velocity of these devices were 3 m and 0.1 m/s, respectively. An example of the accuracy of the SPKF state estimator for the estimation of the UAV position is shown in Figure 5. The standard deviation of the state error covariance matrix corresponding to the position estimates is used to define the distribution of D n obst k when calculating the risk of collision with obstacles.
Because of the Euler angles formulation, singularity problems might occur in the rotation matrix. This phenomenon is known as gimbal lock (Diebel, 2006). Gimbal lock occurs when two rotation axes align, thus losing one degree of freedom. Under normal operating conditions, the octocopter should not experience such a large pitch. However, to ensure that this phenomenon does not occur in our simulations we saturate the output of the position controller (in the cascade control scheme) to guarantee a smooth pitch reference. This requires smooth reference trajectories and we ensure that by setting the desired reference speed of the vehicle to 1 m/s when running the B-splines smoothing method.If more aggressive trajectories were to be simulated, we would switch to the quaternion-based representation (Kehlenbeck, 2014). Figure 6 shows the initial trajectory between the start and goal points chosen by the UAV under nominal wind conditions (indicated by the red ad yellow dots in the figure). Figure 7 plots the computed risk of collision for each time step. We considered a dsaf e k = 1m considering the octocopter dimensions and set the maximum allowed risk of mission failure ⌧ fail = 0.8. Therefore, re-planning is not required given that the maximum risk does not go over 0.4. The differential evolution algorithm is applied sequentially to the reference trajectory coordinates to compute a new trajectory. Several parts of the remaining trajectory are updated as shown in Figure 10, which plots the old and the new trajectory. The result shows that the trajectory is modified in the opposite direction of the obstacles. Figure 11 shows the trajectory followed by the UAV considering the new reference coordinates. The risk associated with this trajectory is shown in Figure 12. As shown, the risk of collision goes below the safety threshold, ⌧ saf e = 0.6. The goal of the re-planning strategy is to reduce the risk to acceptable levels such that the probability of collision stays below a certain threshold. Although re-planning should not increase risk (as it occurs between time step 80 and 100), a compromise has to be established when the UAV flights in a degraded condition because In the second example, trajectory re-planning occurs when the risk of mission failure exceeds the pre-specified threshold. This increased risk is attribute to low battery charge. An example of the accuracy of the SPKF state estimator for the estimation of the battery voltage and state of charge through a complete discharge cycle is shown in Figure 13. The prediction module predicts the RUL depending on the battery SoC by propagating the future inputs and considering the uncertainty estimates at each time step.
The re-planning routine is triggered when the time-to-failure is close to the time required to complete the mission, and the corresponding risk of failure (i.e., crashing) exceeds the predefined threshold. This situation occurs after the UAV has covered part of the mission trajectory shown in Figure 14 (blue trajectory).To avoid a crash, the UAV has to land at the closest emergency landing site. We assume hovering is pos- sible, and calculate a new trajectory using the PRM planner (which computes off-board) from the current UAV location to the nearest emergency landing spot. A time-stamped smooth trajectory is obtained by using cubic B-splines on the reference coordinates. The realization of the replanned trajectory is shown in Figure 14 (green trajectory).

Discussion
The goal of the proposed re-planning approach is to reduce the likelihood of collisions to an acceptable level. If this is not achievable then an emergency landing spot has to be identified, and a new trajectory is planned. The allowed computation time T avail depends on the communication latency and the time-to-collision (approximate flying time from the current position of the UAV to the set of continuous way-points during which the probability of collision is not acceptable. Therefore, re-planning for updating the current trajectory and completing the mission will be possible if the segment to be corrected is far enough. Moreover, if the allowed computation time is feasible, achieving the desired minimum probability of collision will depend on the hardware on which DE is running. The primary computation effort of the optimization method can be attributed to the evaluation of the cost function, where the dynamics of the UAV are simulated. This evaluation (for the DE algorithm) can be parallelized using a vector population, where the stochastic perturbation of the population vectors happens independently. If there are segments (between pairs of waypoints) in the future trajectory for which the probability of collision is not acceptable, each segment will be re-planned sequentially starting with the one closest to the current position of the UAV. The larger the segment the larger the computational cost. In this work, the feasibility of the proposed approach is demonstrated, but no validation steps are conducted. This would require real-time experiments. We will conduct them in future work.

CONCLUSIONS AND FUTURE WORK
A decision-making and path planning framework to support safe low-flying urban operations of small load-bearing unmanned aerial vehicles (UAV) flights has been proposed in this work. The core of the proposed approach includes four modules: (1) state estimation, (2) prediction, (3) mission failure assessment, and (4) trajectory replanning. The first module comprises a set of estimators for the state variables that characterize the behavior of the UAV. We specifically estimate the pose of the UAV and state of charge of the battery because these are important variables that determine the success of a mission. The prediction module propagates the es-timated states, considering the uncertainty bounds to determine the behavior of the UAV along the remaining trajectory at each time step. The mission failure assessment module perform risk analysis based on the predicted behavior and activates the trajectory replanning routine if the risk of flight exceeds a pre-defined threshold. The trajectory replanning module triggers one of two possible routines depending on the cause, i.e., (1) a high risk of collision or (2) insufficient battery charge to complete the mission.
We tested the proposed framework using a realistic octocopter model and incorporated environmental (primarily wind) conditions in our analyses. The proposed approach allows us to compute new trajectories that reduce the risk of mission failure to acceptable values. In the future, we plan to investigate more sophisticated planning approaches based on Markov Decision Processes (MDPs). We also plan to integrate a system-level prognosis module Khorasgani et al. (2016) that allows to track the degradation of individual components on the UAV, and determines the RUL of the UAV system based on performance parameters. The system-level prognostics algorithms will also allow us to track UAV state of health across multiple missions. Figure 14. Trajectory re-planning scenario when the risk of running out of battery is high