Efficient Sensor Placement for Health Management of a Rocket Engine Using Monte Carlo Simulation

Efficient sensor selection for health monitoring of complex systems is studied for leakage detection and diagnosis in a reusable liquid rocket engine developed in Japan. The contribution of this study is bridging data-driven and model-based approach for anomaly detection and sensor placement with Monte Carlo simulation. Training data that includes variations of engine operating conditions were obtained from simulations with reduced order models. Although leakage could not be detected by conventional univariate red-line judgement, a multivariate data-driven analysis detected the simulated leaks. In the analysis, multiple sensor measurements are linearly projected onto a vector that characterizes the distribution of normal and anomalous data. The number of sensors is optimized based on how well leakage was detected. It was found that only 13 out of 31 sensors were sufficient to maintain the detection performance. The leakage was located by evaluating the difference in outputs of selected sensors between normal cases and those with leakage.


INTRODUCTION
Health management based on sensor measurements is indispensable for safe operation of complex industrial systems (Manohar, Brunton, Kutz, & Brunton, 2018). The number of sensors that can be installed in such systems is usually limited, especially in aerospace systems where sensor number and placement is severely restricted due to cost and because of inadequate downlink capacity to ground systems. In apply-Noriyasu Omata et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
ing machine learning to system health management, a smaller number of sensors is also preferable because of the curse of dimensionality. Therefore, optimization of sensor selection is an important part of the design process required for health management of aerospace systems.
According to Maul, Kopasakis, Santi, Sowers, and Chicatelli (2008), there are generally four performance requirements in sensor selection: observability, reliability, fault detectability, and cost. They stated that research into sensor optimization rests on expressing each requirement as an algorithm for evaluation. This paper focuses on the third requirement, fault detectability, to optimize sensor placement.
Anomaly detection can be roughly classified into data-driven and model-based approaches. Data-driven approaches include a supervised approach that refers to abnormal data and an unsupervised approach that does not. One of the biggest problems in the supervised approach is that failures rarely occur in industrial systems and it is generally difficult to obtain enough abnormal data for training. The unsupervised approach detects anomalies by measuring deviations from normal data. This has the advantage that it is easy to obtain required data and that it can cope with any anomaly because the given data does not limit the type of anomaly cases. However, the disadvantage is that it is unclear which part of the data is characteristic of the anomaly because the anomaly is unknown to the method.
The model-based approach has recently become a promising candidate for health management because numerical simulations can model both normal and abnormal conditions (as shown in Kawatsu, Tsutsumi, Hirabayashi, & Sato, 2020 characterize normal and abnormal conditions, then the abnormality is sought by comparing the similarity of measured data to the reference data. The advantage of this approach lies in the conformity of the model to physical reality, and its disadvantage is that it cannot be applied without manual modeling. In addition, simulation based on the first principle is a deterministic method and cannot account for variations of operating states or noise. The health management method developed in this study employs both the model-based and data-driven approaches and performs data-driven fault detection on the data obtained from model-based simulations. This approach has not been often considered because model-based simulations could not generate enough data for data-driven methods due to their high computational costs. This study solves the problem using Monte Carlo simulations with reduced order modeling (called system level simulations, or SLS).
A reusable liquid-propellant rocket engine, the RSR engine, was developed by the Japan Aerospace Exploration Agency (JAXA) and is used as an example of a complex system. Leakage of cryogenic propellants from pipe connections is one of the major failure modes which lead to critical accidents, so a high cost is paid for inspection and maintenance between flights and static-firing tests. However, there is little difference between leaked sensor values and normal sensor values, and detection of fuel leakage is hardly possible with conventional red-line judgement giving a threshold value. This study focuses on fuel leakage in the RSR engine, and a methodology for selecting sensors crucial for the leakage detection is presented.

THE RSR ENGINE
The RSR engine employed in this study was developed in the Reusable Sounding Rocket (RSR) program (Sato et al., 2014; Kimura et al., 2016). The aim of this program is to realize a reusable rocket that can launch a payload to an altitude of 100 km, return to the launch site, then be ready to fly again within 24 hours. To meet these requirements, the RSR engine is equipped with the following features: 1. capability of wide-range throttling; 2. accurate control of the operational sequence; 3. long-lasting durability; and 4. fast, advanced health monitoring.
A schematic diagram of the RSR engine is shown in Figure 1, and its specifications are shown in Table 1. Propellants for the RSR engine are liquid hydrogen (LH2) and liquid oxygen (LOX). The yellow line in Figure 1 represents the cryogenic liquid hydrogen line, and the red line is the line for hydrogen heated by the combustor in regenerative cooling. The blue line is the oxygen line. This engine employs the expander bleed cycle, and a fuel turbopump (FTP) and oxidizer turbopump (OTP) are driven by the heated hydrogen after regenerative cooling. The rocket engine is a complex system composed of many components and many valves. As many as 60 sensors are installed to monitor the engine condition. The sampling rate of these sensors is 100 Hz.
The RSR engine has been subjected to various operational se-  quences in static firing tests to verify its performance (Kimura et al., 2016). Only the steady-state combustion sequence is considered in this study. In the engine sequence, 40 % thrust was maintained for 8 seconds after ignition and then 100 % thrust for 120 seconds. The measurements taken for this sequence are shown by dotted black lines in Figure 2 for three typical sensors (PC, TT1F, and QCO1). It is found that a steady combustion state is reached about 20 seconds after ignition according to all other sensors.

SYSTEM-LEVEL SIMULATION OF THE RSR ENGINE
As mentioned, the RSR engine needs to be re-prepared within a short time. However, manual health management of an engine between flights is time-consuming, so fast and accurate failure prediction and diagnosis has been developed for the RSR engine utilizing machine learning techniques (Tsutsumi et al., 2019;Sato et al., 2019). These methods utilize unsupervised learning for data-driven anomaly detection.
A large amount of data is required for supervised data-driven failure detection. Obtaining sufficient data from experiments is not realistic because of the time and costs involved. Highfidelity simulation, such as computational fluid dynamics (CFD), is another way to generate training data but the physical models required for simulating a liquid rocket engine include two-phase flow, chemical reactions, heat transfer, etc. Using high-fidelity simulation to generate training data in this way is also unrealistic.
One solution that has been developed is the SLS model (Satoh, Tsutsumi, Hirabayashi, Kawatsu, & Kimura, 2020). The SLS model for an RSR engine calculates the pertinent behavior of the system and returns the time-series values of n = 31 sensors placed in various parts of the engine, such as FTP, OTP, combustor, and regenerative cooling. Among these 31 sensors, 15 are pressure sensors, 11 are temperature sensors, and the remaining sensors measure flow rate, rotation speed, etc. The SLS decomposes the entire engine into multiple components including pipes, orifices, pumps, and combustors, and reduced order modeling is performed to each component.
One example of such model is the calculation of the mass flow rate through an orifice: whereṁ is the mass flow rate, R the resistance coefficient, ρ the density, A the cross-sectional area, ∆P the pressure difference, and k dp the correction coefficients (i.e., a parameter in the model). As for the orifice models, each component have correction coefficients for adjusting their resistance and performance. Let θ s represent the set of all correction coefficients.
Pump performance is determined from the state calculated from the rotation speed, the flow rate, and the Suter curve (Suter, 1966) that is obtained in advance to represent its performance. The combustion of fuel and oxidizer in the combustor is calculated by a chemical equilibrium program (McBride, Zehe, & Gordon, 2002). The amount of heat exchanged between the fuel and the regenerative cooling channel can be obtained from the heat transfer coefficient of the combustion gas calculated using the Bartz equation (Bartz, 1968). Details of the SLS model and the parameter θ s have been summarized in the literature (Satoh et al., 2020).
In this study, leakage of the fuel and oxidizer is included in the model. Leakage is assumed to occur from one of the four pipes shown in Figure 1: the FTP turbine inlet (Leak 1), regenerative cooling outlet (Leak 2), fuel-side igniter inlet (Leak 3), and OTP pump outlet (Leak 4). Leaking of fuel and oxidizer is simulated by the orifice model shown as Eq. 1. The parameters for the leakage model are determined as follows.
Assuming leakage into the atmosphere, the pressure difference from atmospheric pressure is ∆P . The values of k dp and R in this model are determined according to Kobayashi, Naruo, Maru, Takesaki, and Miyanabe (2018), which examined in detail the outflow coefficient of high-pressure hydrogen leaked into the atmosphere. The magnitude of the leakage is adjusted by the area of hole A, and one of four patterns from 1π to 4π mm 2 is used. The leakage is assumed to occur before engine ignition. Sudden leaks that may occur during engine operation are not considered here because they can be detected by a sudden change in a sensor value.
Based on the static-firing test result for the operational sequence of stationary combustion shown in Figure 2, the best model parameterθ s in the SLS was obtained using an ensemble Kalman filter (Satoh et al., 2020). The simulation results for leakage of A = 4π mm 2 at four locations usingθ s are also shown in Figure 2 as blue, yellow, green and red lines, respectively. There is little difference to be seen in behavior shown as solid lines for the normal and leakage cases. It is evident that detecting leaks using these sensor values is almost impossible.
The detailed behavior when the sensor values have settled are shown in the enlarged view, the right side of Figure 2. The experimental values have fluctuations due to system and sensor noise not modeled in the simulation. The mean of the experimental values and that of the simulation are in subtle disagreement, which reflects the shortcomings of reduced order SLS modeling. Even worse, these effects are larger than the small changes due to the leaks, and it is impossible to detect the leaks by comparison with the SLS results of normal and leakage cases without further measures.

MONTE CARLO SIMULATION UNDER NORMAL AND ABNORMAL CONDITIONS
As shown in Figure 2, measurements show fluctuations due to sensor noise and system noise, and they vary for such reasons as the test environment and the state of the system, even if the same tests are performed normally. Simulation results under identical conditions are uniquely determined but, because the SLS model is based on a reduced order model, the result of an SLS simulation does not always agree perfectly with actual measurements. The variations and deviations in model-based anomaly detection have not been explicitly considered as yet.
In order to carry out model-based anomaly detection considering the above variations and to make data-driven methods applicable, Monte Carlo simulation is performed to reproduce the realistic variations. In the Monte Carlo simulation, variations are introduced to the SLS result using the model parameter θ s . Random variations δθ s are added to the determined model parameterθ s as and varied simulation results within the range of normal operation are generated. Each element of δθ s is chosen independently from a uniform distribution to give a ±10 % variance to the model parameters. In addition, data outside the normal design range for the RSR engine are excluded. Monte Carlo simulation was run for 10,000 cases of which N = 8, 800 were normal cases.
Monte Carlo simulation was also carried out on models with leakage to generate a variety of abnormal data. There were 210 θ s parameters, randomly selected from those for normal 8,800 cases. For these selected parameters, four leak locations and sizes (described in the previous section) were considered for a total of M = 210 × 4 × 4 = 3, 360 abnormal cases to be calculated.
The results of the Monte Carlo simulation in normal cases are shown in Figure 2 as gray lines. The sensor value varies within the normal design range, and the difference for the experimental conditions is reproduced. The range of variation in this Monte Carlo simulation is sufficiently larger than the range of fluctuation in the experimental values and the difference between the experimental and simulation values. It is shown that the Monte Carlo simulation sufficiently absorbs the difference between the experimental and the simulation values observed in Figure 2.
The results of the simulations with leakage are within the range of the Monte Carlo simulation results without leakage, as shown in Figure 2. This indicates that the normal variation larger than the effect of the leaks is sufficiently modeled. Figure 3 shows the histograms of the three sensor values at the end of the simulated sequence for both normal and leakage cases. The variation of the sensor value is almost the same, and the leaking of fuel and oxidizer cannot be detected by conventional red-line judgment.   Figure 2 shows there is no large fluctuation in the sensor values after the simulation reaches steady state with 100 % thrust. This paper focuses on leak detection from sensor values in the steady state as an initial stage of development, and simulated sensor values at t = 128 s:

DEMONSTRATION OF LEAKAGE DETECTION
are used in the following discussion. Results without leakage are represented as x N (θ s ). When there is a leak, there are two additional parameters: the magnitude A and the location l of the leak. The result can be written x A (θ s , A, l). The task of anomaly detection in this study is to determine whether the sensor value during steady combustion x originates from x N (θ s ) or x A (θ s , A, l).
An example of leakage detection using multiple sensor values is shown below for two sensors (n = 2). Figure 4 shows the multidimensional space of stationary sensor values where the normal cases are in the normal space S N and the anomaly cases are in the anomaly space S A . Focusing on a sensor, the ranges where the sensor value normally exists is indicated by the dashed black line, and the range for anomalous cases is indicated by the dashed red line. Since the dashed red line overlaps the dashed black line for both sensors, univariate anomaly detection such as red-line judgement is impossible.
If we can obtain the axis shown by the blue line, all anomalies can be detected by using both of these sensors. This discrimination can be achieved by projective transformation of the multiple sensor values onto the projection vector p, which characterizes the distribution of the normal and anomalous data. The projection result can be expressed as and indicates how anomalous the sensor values are and is called the anomaly score.
The problem is how to determine the projection vector p and calculate the anomaly score. Assuming that the anomaly space S A is a space shifted in the direction of anomaly vector a from the normal space S N , one of the criteria to optimize p is considered as follows. First, the projection with the anomaly vector p · a is maximized to increase the effect of anomalous conditions on the anomaly score. The anomaly vector a is obtained as the mean of the difference between normal and anomaly sensor values: On the other hand, the projection of the normal space on the vector p·S N is minimized so that the effect of variation under normal conditions is small. Here, the projection p · S N is defined by the standard deviation of the normal sensor values projected onto p: The projection vector p is obtained to minimize the ratio of the two projective results: The result of this optimization can be written analytically as where Σ N is the covariance matrix of the normal sensor values: This formulation is almost equivalent to Fisher's linear discriminant analysis (LDA) (Fisher, 1936) and is also applicable when there are more than two sensors.
One may consider using anomaly vector a as projection vector p. However, as shown in Figure 4(b), the leakage and the normal cases overlap, and there are indeterminate cases. This is because vector a contains not only the direction of  As shown in Figure 4(b), however, a similar overlap exists, and leakage again cannot be detected. Since PCA searches for the direction of maximum variance in normal (and abnormal) cases, the component of normal variation is included in the first principal component.
According to Eqs. 4-9, the projection vector p was obtained from the result of the Monte Carlo simulations, and the anomaly score s of each simulated case was obtained by projecting the sensor data x. Note that the sensor data here is normalized by subtracting the averaged value for normal casesx N . The histograms of the anomaly score for the normal and abnormal cases are shown in Figure 5. The average value of the anomaly score in the normal case is zero due to normalization. Leaks from the LH2 line, Leak 1, 2, and 3, show a similar tendency, a distribution different from that of the normal case, although there is a small overlap with the normal case. The projection results for Leak 4, leakage from the LOX line, are all out of the distribution for the normal case. The four peaks in this distribution correspond to the size of the leak hole A. The results obtained here indicate that leakage can be detected by using a linear transformation in the multidimensional space of sensor values.

SENSOR SELECTION BASED ON FAULT DETECTION PERFORMANCE
In this section, reduction of the number of the sensor is investigated by optimizing the combination of sensors so that the performance of leakage detection is maximized. There are 2 n available combinations of sensors, and it is time-consuming to evaluate the performance for all combinations by brute force. Therefore, a greedy approach is employed, and sensors are removed one by one.
The procedure for the greedy method for sensor optimization is as follows. First, one sensor is removed from the classifier that uses all n sensors. There are n candidate sensors to be removed, and, n classifiers using n − 1 sensors can be obtained by trying all of them. The performance score of detection is calculated for each classifier, and the classifier with the highest score is adopted as the classifier using n − 1 sensors. Next, another sensor is removed from the obtained n − 1 sensor classifier. All of the n − 1 candidates are tested, and the one with the highest score is adopted as the classifier using n − 2 sensors. By repeating the same procedure, the number of sensors can be reduced one by one.
Instead of the linear projective transformation, a linear support vector machine (SVM) is employed as the classifier. Linear SVM is a general supervised machine learning method for discrimination that can derive a data-adapted transformation. For convenience, let y = 0 indicate the sensor value x for the normal case, and y = 1 indicate the result for the anomaly case. The linear SVM uses (a part of) the sensor value x as input and y as output, and discriminates positive and negative examples by the following linear transformation: where p and b are parameters in SVM. Note that other methods such as decision tree, neural network, discriminant analysis in the previous section are also available for discrimination.
In the implementation of an SVM, 800 normal cases out of 8,800 and 480 abnormal cases out of 3,360 were used as test data, all the rest as training data. The area under the curve (AUC) score of the receiver operating characteristic (ROC) curve (ROC-AUC score) was used as the detection performance score. Instead of directly using the sensor value x, the input data to the SVM was standardized so that the mean and standard deviation of each sensor value were 0 and 1, respectively. Hyperparameters of the SVM were determined by cross-validation using the ROC-AUC score. The SVM and cross-validation programs are implemented by scikit-learn, a widely used open-source Python library for machine learning.
The ROC curves for leakage detection using the SVM are shown in Figure 6. The ROC curve for more than 10 sensors show a similar performance, and roughly 80 % of the faults can be detected from the sensor data with a false alarm rate of about 10 %. However, the ROC curve for 9 sensors is generally lower than the other curves, and its performance is obviously poor. Figure 7 summarizes the relationship between the number of sensors and the ROC-AUC score. It is indicated that the performance is almost the same for using all n = 31 sensors and using only 13 sensors. For models with the number of sensors between 13 and 31, the performance is slightly better than that using all sensors, even though the number of sensors is reduced. This reflects the resolution of the curse of dimensionality. Further reduction in the number of sensors from 11 results in severe degradation of performance. Although the interpretation of the combination of the selected 13 sensors are not yet known, the advantage of the efficient sensor selection based on the data-driven method is evident.

FAULT DIAGNOSIS USING SELECTED SENSORS
In a complex engine system such as shown in Figure 1, it was a significant advance to realize effective leakage detection. To reduce maintenance and cost, identifying the location of a leak is an important part of fault diagnosis.
The diagnosis of leakage is tried using the selected sensors. Two sensors, TT1F and QCO1, are selected, and the differences of the stationary value between normal and abnormal cases are evaluated, and then scattered as in Figure 8. According to the location of the leak, ∆x is classified into four groups. Therefore, by constructing a multi-class classifier using ∆x, finding the location of a leak is possible.
When this approach is applied to flight data or static firing test results, normal data corresponding to the engine condition when the anomaly occurs is necessary. However, as shown in Figure 2, it is currently difficult to distinguish the engine condition and obtain normal data. Further study is required for better diagnosis of leakage, and demonstration of this method also is a future issue.

CONCLUSION
Leaks of cryogenic fuel and oxidizer from pipe connections is one of the major failure modes of a reusable liquid rocket engine. To reduce maintenance and cost, efficient sensor placement for leakage detection and diagnosis was studied. To avoid expensive experiments on complex systems, the training data (including variations in engine operation) was obtained by Monte Carlo simulations using reduced order modeling.
It was demonstrated that leaks could be detected by performing the appropriate linear projective transformation for the stationary values of multiple sensors, while conventional redline judgement could not. Based on the performance of leakage detection using a linear SVM, the sensor placement was optimized by the greedy approach. It was shown that the ROC-AUC score could be maintained using only 13 of the 31 sensors. Using only some of the selected sensors, the location of the leak could be identified from the difference between normal and abnormal cases. It is required to validate of the proposed methods in the future, and there is a plan to conduct experimental hydrogen leakage tests to validate the SLS modeling and the methods.

ACKNOWLEDGMENT
This work was supported by JSPS KAKENHI grant number JP20K14954.

NOMENCLATURE
A area of the leakage hole b bias parameter in SVM k dp correction coefficient in the orifice model l indicator for location of leakagė m mass flow rate M the number of anomaly cases n the number of sensors N the number of normal cases p, p i projection vector and its component P pressure R resistance coefficient in the orifice model s anomaly score S A , S N space of sensor values for anomaly and normal cases x, x i vector of sensor values and its component x A , x N sensor values for anomaly and normal casē x A ,x N averaged sensor values for anomaly and normal cases θ s set of all parameters for SLŜ θ s the most likely parameters for SLS ρ density Σ N covariance matrix of normal sensor values