Prognostics As-A-Service: A Scalable Cloud Architecture for Prognostics


 
 
Comprehensive aircraft system health-state awareness is critical for maintaining safe, efficient growth in global operations, enabling higher levels of autonomy, and facilitating new forms of aviation. Maintainers, vehicle operators, air traffic controllers, dispatchers, pilots, autonomous systems, and other decision-makers must have reliable real-time knowledge of the vehicle health, the health of its critical composite systems, predictions of how health changes with time, and forecasts of how its capabilities change with health degradation to preserve safety and efficiency. Providing this information in a reliable manner in computationally constrained environments and across a wide range of vehicles and systems continues to be a challenge. This challenge can be partially resolved through cloud computing, where the execution of prognostic and diagnostic algorithms is performed on a network of remote servers hosted on the internet. NASA is developing a cloud computing service, Prognostics As-A-Service (PaaS), that explores the feasibility and challenges of cloud-enhanced prognostics. Though such a system has broad applicability, this research effort is focused on aviation applications.
 
 



INTRODUCTION
Comprehensive aircraft system health-state awareness is critical for maintaining safe, efficient growth in global operations as well as enabling higher levels of autonomy and new forms of aviation. Maintainers, operators, controllers, dispatchers, pilots, autonomous systems, and other decision makers must have reliable real-time knowledge of the vehicle health, the health of its critical composite systems, predictions of how health changes with time, and predictions of how its capabilities change with health degradation in order to preserve safety and efficiency. Providing this information reliably in computationally constrained environments and across the wide range of vehicles and systems continues to be a challenge. This challenge can be partially resolved by leveraging cloud computing. Leveraging external resources could give aircraft with computationally constrained systems improved efficiency and reduced lifecycle costs through resource sharing, and enables new algorithms that use the large quantity of data aggregated from many users to provide better services to all. NASA is developing Prognostics As-A-Service (PaaS), a cloud computing service for diagnostics and prognostics. While cloud computing architectures have many advantages, some challenges will need to be addressed, as described below:
1. Generality: PaaS must be capable of providing services across a wide spectrum of vehicle types and configurations. For prognostics, this requires flexible, configurable, generalized system models that describe the system of interest. Such generalized models often require extensive system characterization to derive model parameters. The challenges of generalization and parameter identification for generalized models are major technological barriers.
2. Communications: Many aviation systems rely on complex and often limited communications systems. Users will need to rely on prognostic predictions from PaaS even in the presence of communication constraints (latency, bandwidth) or dropouts from communication failures.
3. Utility: A PaaS architecture must be capable of predicting with the precision, timeliness, and accuracy required for decision makers to take action to protect the safety and efficiency of the aircraft and others. These prediction attributes make up the Quality of Service (QoS) requirements for a PaaS. Decision makers could be operating in real time, such as UAS operators, pilots, autonomous pilots, and air traffic controllers, or they could be operating in a strategic manner, such as maintainers.
4. Security: A PaaS architecture requires end users to send information about the operation of their systems over a network. Protecting the confidentiality, integrity, and availability of that information and of the prognostic estimates, to the degree appropriate, is a real challenge.
5. Environmental Complexity: Future load prediction and system degradation prediction can both be a function of the environment in which the system operates. An inaccurate or incomplete understanding of the environment can lead to imprecise or inaccurate predictions.
6. Trust: Predictions must be trusted in order to be used. This means that they must be trustworthy and that the end user must be convinced of their trustworthiness.
The Prognostics As-A-Service effort at NASA explores the feasibility and challenges of cloud-enhanced prognostics. Though such a system has wide applicability, this research effort is focused on aviation applications. The effort is exploring and demonstrating the ability to address the six major challenges of a PaaS architecture described above. This paper details the PaaS architecture and describes its use in NASA projects.

SIMILAR ARCHITECTURES
Cloud computing is a topic at this year's IEEE International Conference on Prognostics and Health Management, demonstrating the elevated interest in cloud prognostics architectures. A number of cloud-based prognostics or health management architectures are proposed in the literature (Lee, 2013; Deb, 2013; Ning, Huang, Shen, & Di, 2013).
Vitria is a company that sells an Internet of Things (IoT) Analytics Platform-as-a-Service that enables customers to use the Vitria analytics engine to monitor, detect, diagnose, analyze, and predict based on IoT data and configurations. The platform allows the customer to prioritize and classify incidents and to suggest automated actions to remediate and resolve anomalies and incidents. Customers can deploy their own algorithms and system models in this service platform.
The service covered by this paper includes facilities to register and manage platforms, systems, and components (defined in detail below), configure those entities, send data to the service, and retrieve events from the service. The registration and configuration steps only need to be done once for a given configuration. Once a complete platform is configured, the user can request that a session be started. Starting a session triggers the creation of the configured prognosers on the service backend. Once a session is active, the user can send data and periodically check for results.
The service does not include any kind of client-side user interface. It is our expectation that many end users will prefer integration of the data provided by the service into their existing user interfaces over the addition of another separate interface to already crowded UAS ground station environments. We are also exploring possible graphical user interface designs as part of the System Wide Safety project.
As part of its operation, the service stores most of the data it receives and calculates in a database. In addition to supporting the basic performance of prognostics, this enables research by giving easy access to a large quantity of uniform data that can be used to tune and enhance prognostic algorithms.

SOFTWARE ARCHITECTURE DESCRIPTION
The Prognostics As-A-Service architecture builds on the Generic Software Architecture for Prognostics (GSAP) library (Teubert et al., 2017). PaaS provides the infrastructure to store and manage data and configuration information associated with systems being prognosed, and to efficiently pass data to prognosers and results back to the user.
The PaaS prototype was developed over the last two years as part of the Convergent Aeronautic Solutions (CAS) Project's concept incubation process. The prototype system consists of a Prognostics Application Driver written in C++ that wraps GSAP in a thin communication framework that talks to the front-end process. The front-end process consists of a web server written in Java that exposes a RESTful API to end users. The application also uses a PostgreSQL database to store prognostics data, configuration information, and application state.

Entities
PaaS models discrete prognostic problems as a hierarchy of entities that describe a complete set of things to be prognosed.

Platform
Each prognostic session is tied to a single platform. The platform represents a collection of prognostic targets to be analyzed together. The most common manifestation of a platform in our work is an unmanned aerial system (UAS). Each platform has one or more systems associated with it.

System
Each system represents a discrete object that can be analyzed by a single GSAP prognoser. The system is usually one half of the total representation of this object. It stores configuration information about the object that depends on the platform. The other half of the object is represented by the Component entity.

Component
Components represent pluggable objects that are used in a system. As an example, a UAS battery has a system that represents the requirement that the UAS have a battery attached during operation, while the specific battery used in any given flight is represented by a component. The component stores configuration information that is specific to a particular physical device.

Data Point
Each data point represents a single input value, such as a battery voltage or vehicle latitude. Data points are published individually to allow prognostic components (primarily observers and predictors) to aggregate only the data they require.

Event
Events represent pieces of information that may be of interest to the user. The primary events of interest are prognostic events that contain the results of a prediction. Additional events may be generated that do not relate to a specific prognostic result. These include status events that notify the user of the status of particular parts of the system.
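The entity hierarchy described above can be sketched with a few plain data classes. This is only an illustration of the Platform/System/Component split: the class fields, the configuration keys, and the rule that component settings override system settings are assumptions made for the example, not the PaaS database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Component:
    """A specific physical device, e.g. one serial-numbered battery."""
    serial_number: str
    config: Dict[str, float] = field(default_factory=dict)  # device-specific


@dataclass
class System:
    """A slot on a platform analyzed by one prognoser, e.g. 'battery'."""
    name: str
    config: Dict[str, float] = field(default_factory=dict)  # platform-dependent
    component: Optional[Component] = None  # assigned before a session starts

    def prognoser_config(self) -> Dict[str, float]:
        """Merge both halves; component values win on a key clash (assumption)."""
        if self.component is None:
            raise ValueError(f"system {self.name!r} has no component assigned")
        return {**self.config, **self.component.config}


@dataclass
class Platform:
    """A collection of systems analyzed together, e.g. one UAS."""
    name: str
    systems: List[System] = field(default_factory=list)


# Hypothetical configuration keys, chosen only for illustration.
uas = Platform("uas-1", [System("battery", {"cutoff_voltage": 3.2})])
uas.systems[0].component = Component("BATT-0007", {"capacity_ah": 5.0})
merged = uas.systems[0].prognoser_config()
```

Keeping the platform-dependent and device-specific settings in separate objects is what lets a different battery be swapped in between flights without touching the system definition.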

Service Structure
PaaS is organized into three distinct layers that each interact with the adjacent layer or layers. The lowest layer is the Prognostics Application Driver (PAD), which creates and manages the lifetime of GSAP prognosers. The PAD is also responsible for one half of the inter-process communication link that connects the C++ process running GSAP to the Java process running the REST server. The service layer is the core of the Java server. It handles the other half of the inter-process communication and all communication with the database. The service layer receives sensor data from the REST API layer and passes that data to the PAD, and also receives prognostic results from the PAD that it stores in the database for retrieval via the REST API. Finally, the REST API layer is a thin wrapper over the service layer that exposes the service architecture to the world as a set of HTTP endpoints to which requests can be made.
Data is passed between parts of the application using a lightweight publish/subscribe system. Publishing data in granular pieces allows prognostic components to aggregate exactly the data they need to perform their calculations. This approach also enables a very simple model for asynchronous execution.
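A publish/subscribe bus of this kind can be illustrated in a few lines. The sketch below is a generic, synchronous bus keyed by data-point name; the `MessageBus` class and its method names are hypothetical and are not taken from GSAP or PaaS.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Handler = Callable[[str, float, float], None]  # (name, timestamp, value)


class MessageBus:
    """Synchronous publish/subscribe bus keyed by data-point name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, name: str, handler: Handler) -> None:
        self._subscribers[name].append(handler)

    def publish(self, name: str, timestamp: float, value: float) -> None:
        # Deliver only to handlers that asked for this data-point name.
        for handler in self._subscribers.get(name, []):
            handler(name, timestamp, value)


bus = MessageBus()
battery_inputs: List[Tuple[str, float, float]] = []


def collect(name: str, timestamp: float, value: float) -> None:
    """Stand-in for a battery prognoser gathering its inputs."""
    battery_inputs.append((name, timestamp, value))


bus.subscribe("voltage", collect)
bus.subscribe("current", collect)

bus.publish("voltage", 0.0, 12.1)
bus.publish("latitude", 0.0, 37.4)  # no subscriber, so it is not delivered
bus.publish("current", 0.0, 3.5)
```

Because each value is published under its own name, the battery handler never sees the latitude reading, which is the granular aggregation the text describes.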

Prognostics Application Driver (PAD)
Within the main Java application, the execution layer consists of two sets of components. The first is a set of repositories that encapsulate database operations; these are primarily standard Spring JPA repositories. The second is a simple IPC messaging protocol and the infrastructure for maintaining the prognostics processes associated with each active session.
The current version of the GSAP framework implements a lightweight message bus architecture to enable the efficient routing of data between various parts of the application. This message bus architecture is also replicated in the service layer described below, and messages of interest are identified and serialized between the two buses as necessary.
On the other end of the IPC pipe, a C++ application provides a thin wrapper around the GSAP library that processes incoming messages, configures prognosers based on those messages, and places incoming data on a message bus. It also monitors the message bus for prognostic events and passes those events back to the Java server application.
During initialization, the PAD creates a Router object that subscribes to the messages that need to be transferred to the front end. After initialization, the PAD's main function enters an infinite loop that reads messages from stdin until an "exit" message is received.
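The PAD's read loop over stdin can be sketched as follows. The actual PAD is written in C++ and its wire format is not described here, so this Python version assumes one JSON object per line with a `type` field; that format, the function name `run_driver`, and the handler table are illustrative assumptions only.

```python
import io
import json
from typing import Callable, Dict, List, TextIO


def run_driver(inbox: TextIO, outbox: TextIO,
               handlers: Dict[str, Callable[[dict], List[dict]]]) -> None:
    """Read one JSON message per line until an 'exit' message arrives."""
    for line in inbox:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message["type"] == "exit":
            break
        handler = handlers.get(message["type"])
        if handler is None:
            continue  # unknown message types are ignored in this sketch
        for reply in handler(message):
            outbox.write(json.dumps(reply) + "\n")
            outbox.flush()


def on_data(message: dict) -> List[dict]:
    """Stand-in for handing a value to a prognoser; emits a status event."""
    return [{"type": "event", "source": message["name"], "status": "received"}]


# In-memory streams stand in for the process's stdin and stdout.
inbox = io.StringIO(
    '{"type": "data", "name": "voltage", "value": 12.1}\n'
    '{"type": "exit"}\n'
    '{"type": "data", "name": "voltage", "value": 12.0}\n'
)
outbox = io.StringIO()
run_driver(inbox, outbox, {"data": on_data})
replies = [json.loads(text) for text in outbox.getvalue().splitlines()]
```

The message after `exit` is never processed, so `replies` holds a single event. A real driver would pass `sys.stdin` and `sys.stdout` instead of the in-memory streams.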

Service Layer
The service layer contains components that execute the main "business logic" of the system. This includes input validation and the storage and retrieval of data in the database. The service layer also routes the session data needed by the prognostics component to that component in the execution layer.

REST API Layer
Representational State Transfer (REST) is a web architecture that allows for both querying and updating of resources using structured HTTP requests. The REST style provides a uniform stateless architecture that fits well into today's web-centric world. The API is not without limitations, however. In particular, the REST style (and HTTP in general) does not provide any mechanism for push-based notifications. Due to this limitation, the current API requires that clients poll periodically to receive new events.
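Periodic polling is simple to implement on the client side as long as the client remembers which events it has already seen. The sketch below assumes that each event carries a monotonically increasing `id` and that the fetch call accepts a "return events after this id" cursor; both are assumptions made for illustration, and `fake_fetch` stands in for the HTTP GET a real client would issue.

```python
import time
from typing import Any, Callable, Dict, Iterator, List

Event = Dict[str, Any]


def poll_events(fetch: Callable[[int], List[Event]], polls: int,
                interval_s: float = 1.0) -> Iterator[Event]:
    """Poll `polls` times, yielding each event once, in id order."""
    last_id = 0
    for _ in range(polls):
        for event in sorted(fetch(last_id), key=lambda e: e["id"]):
            if event["id"] > last_id:  # skip anything already delivered
                last_id = event["id"]
                yield event
        time.sleep(interval_s)


# Hypothetical server-side event store for the demonstration.
SERVER_EVENTS: List[Event] = [
    {"id": 1, "kind": "status"},
    {"id": 2, "kind": "prediction"},
]


def fake_fetch(after_id: int) -> List[Event]:
    """Stand-in for an HTTP GET against an events endpoint."""
    return [e for e in SERVER_EVENTS if e["id"] > after_id]


seen = list(poll_events(fake_fetch, polls=3, interval_s=0.0))
```

Even though the client polls three times, each event is delivered once, because the cursor advances as events are consumed.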

Data Flow
A typical user's workflow breaks down into three categories of operations. The user must perform certain one-time operations to create a workable set of components, must perform certain operations "pre-flight" to ensure that the pieces are configured correctly, and finally performs "in-flight" operations of sending data and receiving results.
When a user first registers, they must set up platforms, systems, and components to tell the service about the things on which they wish to perform prognostics. At a minimum, the user must create a single platform (if they only wish to do vehicle-level prognostics), but most users will also create subsystems (e.g., batteries) for each platform. Finally, the user creates components, which represent the specific serial-numbered parts that can be assigned to a system. These operations can be performed at any time, but only need to be performed once. Once a platform exists in the system, it remains stored in the database indefinitely.
Once initial configuration is complete, the user can start a session at any time. To start a session, the user must make sure that all systems have a component assigned and then make a request to start a session. Once the session is started, the user can send sensor data to the service and receive events back, either by polling an endpoint in the RESTful interface or via direct pushes in the MAVLink interface. Once the session is concluded, the user sends an end-session request to the service to indicate that the service can release the resources associated with the session.
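The pre-flight and in-flight rules above amount to a small state machine: a session may start only when every system has a component assigned, and data is accepted only while a session is active. The following in-memory sketch illustrates those two checks; the class and method names are hypothetical and do not correspond to actual PaaS endpoints.

```python
from typing import Dict, List, Optional


class SessionError(Exception):
    """Raised when a request is made in the wrong phase of the workflow."""


class PlatformSession:
    """Tracks component assignment and session state for one platform."""

    def __init__(self, systems: List[str]) -> None:
        self.assignments: Dict[str, Optional[str]] = {s: None for s in systems}
        self.active = False
        self.data: List[dict] = []

    def assign(self, system: str, component: str) -> None:
        if system not in self.assignments:
            raise SessionError(f"unknown system {system!r}")
        self.assignments[system] = component

    def start(self) -> None:
        # Pre-flight check: every system needs a component.
        missing = [s for s, c in self.assignments.items() if c is None]
        if missing:
            raise SessionError("no component assigned to: " + ", ".join(missing))
        self.active = True

    def send(self, data_point: dict) -> None:
        if not self.active:
            raise SessionError("no active session")
        self.data.append(data_point)

    def end(self) -> None:
        self.active = False  # a real service would release resources here


session = PlatformSession(["battery"])
try:
    session.start()  # fails: the battery system has no component yet
    started_early = True
except SessionError:
    started_early = False

session.assign("battery", "BATT-0007")
session.start()
session.send({"name": "voltage", "value": 12.1})
session.end()
```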

CASAS
The CASAS project was a NASA effort to demonstrate the integration of multiple intelligent systems technologies developed at NASA. As part of the project, PaaS was used as an operational battery health-monitoring tool. A Python client was developed to interface with the CASAS ground station software to collect battery voltage, current, and temperature data in real time and publish that data to PaaS via an HTTP POST every second. The client simultaneously made an HTTP GET request each second to retrieve prognostic results from the server.
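The two requests issued by such a client each second can be sketched with the Python standard library. The base URL, the endpoint paths, and the JSON field names below are placeholders invented for this example and are not the actual PaaS API; the requests are only constructed, not sent.

```python
import json
import urllib.request

BASE_URL = "https://paas.example.invalid/api"  # placeholder, not a real host


def build_data_request(session_id: str, timestamp: float, voltage_v: float,
                       current_a: float,
                       temperature_c: float) -> urllib.request.Request:
    """Build the POST that publishes one battery sample (hypothetical schema)."""
    body = json.dumps({
        "timestamp": timestamp,
        "voltage": voltage_v,
        "current": current_a,
        "temperature": temperature_c,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/sessions/{session_id}/data",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_events_request(session_id: str) -> urllib.request.Request:
    """Build the GET that retrieves prognostic results (hypothetical path)."""
    return urllib.request.Request(
        url=f"{BASE_URL}/sessions/{session_id}/events",
        method="GET",
    )


post = build_data_request("demo", 10.0, 12.1, 3.5, 25.0)
get = build_events_request("demo")
```

A running client would pass both requests to `urllib.request.urlopen` once per second and parse the JSON returned by the GET.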
PaaS was integrated as one source of decision making in the project. To accomplish this, one of the UAS platforms flown by the project was set up in PaaS by creating a Platform representing the UAS and a System representing the battery powering the UAS's motors. A Component was also created for each of the flight batteries used by the project. During each test flight, the battery for the flight was assigned to the battery system, and the vehicle ground station started sending battery sensor data to PaaS as soon as the vehicle was powered on.
The project successfully demonstrated the application of cloud-based battery prognostics, but execution was not without issue. We encountered significant safety challenges in deploying the service. The primary issue was a security concern raised by NASA's flight safety review board about connecting the UAS ground station to the network. To work around this, we had to slave a second computer to the active ground station and connect that machine to the network so that it could stream data to PaaS.

System Wide Safety (SWS)
The goal of the System Wide Safety (SWS) project is to provide system-wide, model-based predictive capabilities in the UAS domain to ensure overall system safety. This would potentially include areas such as weather, navigation or communication performance, population density models, vehicle system health, and more. Much of the work performed in the initial year of the project was a reimplementation of existing technologies that had been developed for the Real-Time System Monitoring (RTSM) project in order to provide similar assurances for commercial aircraft in the national airspace.
Because the initial RTSM implementation was developed as a demonstration of the technology, it did not provide a scalable or accessible solution for multiple aircraft and did not provide the functionality needed to collect aircraft telemetry from multiple data sources. The PaaS architecture was adopted to solve both of these shortcomings and to provide the general framework needed to port over the battery prognostics that were planned as one of the initial safety metrics for SWS.
An important ongoing deliverable of the SWS project is supporting flight tests to investigate and demonstrate advancement of its goals under the technical challenge for In-Time Safety Nets for Emerging Operations. Specifically, these tests would demonstrate automated in-time risk identification and mitigation for small UAVs over the duration of the flight. To that end, the SWS project has been working towards integration of its technologies into the UAS Traffic Management (UTM) ecosystem as a Supplemental Data Service Provider (SDSP). The intent of the SDSP is to respond to queries from UAS operators in order to increase system awareness and overall safety in the airspace. In its current implementation, the SWS SDSP has focused on battery health monitoring and obstacle proximity information as the initial safety metrics being calculated. Using the RESTful web service provided by PaaS, the SWS project participated in UTM "Sprint 3" and "Sprint 4" integration activities using simulated flight data. Telemetry items such as latitude, longitude, altitude, speed, battery voltage, current, and temperature were sent to the SDSP via the PaaS RESTful interface, and obstacle awareness (nearby buildings and trees) and battery health monitoring data were returned.
A second demonstration effort was also performed in concert with standalone flights at Langley Research Center in Virginia. This earlier, initial implementation specified MAVLink protocol messages to pack and unpack the required data being sent between the UAS ground station and the SWS SDSP. Rather than using the RESTful interface, this data was sent via a TCP port and relied on intermediate front-end Java code to translate between the MAVLink data messages and the internal PaaS API for storing data and performing stateful session management.
Integrating the battery monitoring and obstacle awareness metrics into PaaS each presented different challenges. The battery health monitoring effort had already begun and served as the primary metric to prove out the PaaS/GSAP model. This made it a natural fit for the SWS SDSP, and in general there were few issues with the integration. Although this integration went smoothly overall, we did find that the API documentation did not sufficiently specify the expected units for various telemetry items. As a result, we encountered several instances where the client would initially send the right data but in the wrong units (e.g., 100 mV rather than 0.1 V). Another shortcoming in the use of PaaS is a relatively low ceiling for high-rate data. For all testing so far, the SWS project has specified a rate of 1 Hz for all incoming telemetry. Higher data rates are desirable for some prognostic applications, but the nature of HTTP, especially over a secure connection, may make this impractical.
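One way a client could guard against the unit mismatch described above is to tag every value with its unit and convert explicitly before sending. The sketch below is illustrative only: the conversion table and the choice of volts and amperes as the canonical units are assumptions, since the units PaaS actually expects are not stated here.

```python
from typing import Dict, Tuple

# Scale factors to the unit the service is assumed to expect (V and A).
TO_CANONICAL: Dict[Tuple[str, str], float] = {
    ("voltage", "V"): 1.0,
    ("voltage", "mV"): 1e-3,
    ("current", "A"): 1.0,
    ("current", "mA"): 1e-3,
}


def normalize(quantity: str, value: float, unit: str) -> float:
    """Convert a reading to the canonical unit, rejecting unknown units."""
    try:
        factor = TO_CANONICAL[(quantity, unit)]
    except KeyError:
        raise ValueError(f"unsupported unit {unit!r} for {quantity!r}") from None
    return value * factor


volts = normalize("voltage", 100.0, "mV")  # 100 mV expressed in volts
```

Rejecting unknown units outright, rather than passing the raw number through, turns a silent scaling error into an immediate failure on the client.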
The integration of obstacle awareness using PaaS proved to be slightly more difficult, or at least unwieldy in its implementation. Part of the reason for this is the PaaS treatment of "components" as the underlying basis for its system or platform health prediction. Because PaaS views the system as a collection of components, it does not easily support data inputs that describe the behavior of the system as a whole or its relationship to other nearby systems or objects. While these deficiencies did not prevent PaaS from being a useful tool, they sometimes added overhead. A more comprehensive schema or the development of optional domain-specific libraries would be useful in solving this. Presently the GSAP component of PaaS is not being used for obstacle awareness.
Despite these shortcomings, leveraging the health management framework of PaaS proved to be extremely useful in providing the generalized prognostic tools as well as the front-end API and back-end database. Adoption of PaaS greatly simplified the process of getting the SWS server running and collecting user data from the customer in a short timeframe. It is hoped that the SWS project will be able to attract more customers and allow us to test the PaaS framework with hundreds of simultaneous users, potentially tracking thousands of objects. Scaling up to this size in a robust and reliable way would be the gold standard for proving PaaS in the field.

CONCLUSIONS/FUTURE WORK
Prognostics can inform decisions to protect the safety of a vehicle and reduce lifetime costs through predictive maintenance. The PaaS architecture has the potential to bring precision prognostics capabilities to vehicles that otherwise would not have access to such information, or to supplement existing onboard capabilities.
The work on PaaS so far has partially addressed the six challenges outlined in the introduction:
1. Generality: PaaS is built on the GSAP library and therefore inherits its generic prognostics architecture, including configurable and generic prognostics models. Additionally, the interface has been carefully designed to accept a wide range of data and to produce its output in a general and extensible format.
2. Communications: The PaaS architecture is built on RESTful principles, which are in turn built on the HTTP protocol. The stateless nature of HTTP requests gives the service an inherent resilience in the face of temporary communication loss. This is balanced against a certain amount of overhead involved in using JSON (a text-based format) for all data. In the future, we may need to develop binary data formats to maximize data throughput.
3. Utility: The PaaS architecture has been tested in the two use cases described above. The architecture has evolved, and continues to evolve, in response to those tests.
4. Security: The development of a security infrastructure is largely unaddressed in this paper; however, much of the security will come from adherence to software industry standards. REST APIs of the kind described here are increasingly common in the software world, and there is a wide set of best practices to draw from. The service is available only over TLS, which provides encryption of data in flight, and we are developing authentication and authorization controls that adhere to both industry and NASA standards.
5. Environmental Complexity: The SWS project is actively working on quantifying and refining our environmental models, including advanced trajectory generation (Corbetta, Banerjee, Okolo, Gorospe, & Luchinsky, 2019) and battery power estimation.
6. Trust: The PaaS architecture is not yet fully mature. Initial tests in the case studies above show promise, but the architecture is not yet ready for deployment in safety-critical applications.
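The JSON overhead noted under Communications can be made concrete by encoding one telemetry sample both ways. The field names and the fixed binary layout below (a float64 timestamp followed by three float32 readings) are hypothetical and chosen only to illustrate the size difference; they are not a proposed PaaS format.

```python
import json
import struct

# One hypothetical battery sample.
sample = {"t": 1546300800.0, "voltage": 12.1, "current": 3.5, "temperature": 25.0}

# Compact JSON encoding (no spaces after separators).
as_json = json.dumps(sample, separators=(",", ":")).encode("utf-8")

# Little-endian, unpadded: float64 timestamp + three float32 readings.
RECORD = struct.Struct("<dfff")
as_binary = RECORD.pack(sample["t"], sample["voltage"],
                        sample["current"], sample["temperature"])

decoded = RECORD.unpack(as_binary)
print(len(as_json), "bytes as JSON;", len(as_binary), "bytes as binary")
```

The binary record is 20 bytes, several times smaller than the JSON text, at the cost of a rigid schema and float32 rounding on values such as 12.1. HTTP headers and TLS framing add per-request overhead in either case, so payload encoding is only part of the throughput picture.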
Feedback from the CASAS and SWS projects shows a potential for improvement in usability when addressing domains outside of component health management. Using more generalized system models has the advantage of being applicable to many domains, but it also increases the burden of translating inputs and outputs into something useful. Allowing an end user to map a domain-specific ontology onto the generic PaaS interface might simplify integration and promote further use, possibly in novel ways.
Further feedback from the CASAS project suggests that connecting vehicles directly to internet services is likely to receive pushback from those concerned with aircraft safety. Further investigation into how to demonstrate safe operation of a highly connected UAS is necessary for PaaS to fully succeed.
The SWS project will continue to mature its safety metrics and safety assessment algorithms in the PaaS platform. The PaaS architecture will be matured to better support these advancements. Additionally, the SWS project will test the PaaS architecture with its algorithms in flight tests, providing an additional opportunity to reveal and fix weaknesses in the PaaS architecture. The authors are also planning a study to better understand the stakeholder QoS requirements and the performance of the prototype PaaS system. The results will be used to identify performance gaps, that is, areas where performance does not meet requirements, for future work.

Figure 2. A simplified representation of PaaS entities. The relationships between a user and a component or platform, and between a platform and a system, represent ownership (e.g., a user owns a platform). The relationship between systems and components represents assignment (a component is assigned to a system). The relationship between data points or events and systems or components represents an association (e.g., a data point is associated with a system). These entities represent the user's conceptual view of the service. They map closely to the database entities used to store data used by the service. The Platform entity maps to a GSAP process, a System/Component pair maps to a prognoser, and Data Points and Events map to individual messages passed within the service.

Figure 4. A high-level overview of the sequence of requests used to set up and perform prognostics.

Figure 5. The six key challenges addressed by PaaS.