Underlying Probability Measure Approximated by Monte Carlo Simulations in Event Prognostics

The prognostic of events, and particularly of failures, is a key step towards enabling preventive decision-making, as in the case of predictive maintenance in Industry 4.0, for example. However, the occurrence time of a future event is subject to uncertainty, so it is natural to think of it as a random variable. In this regard, the default (benchmark) procedure to compute its probability distribution is empirical, through Monte Carlo simulations. Nonetheless, the analytic expression for the probability distribution of the occurrence time of any future event was presented and demonstrated in a recent publication. In this article, a direct relationship between these empirical and analytical procedures is established. It is shown that Monte Carlo simulations numerically approximate the analytically known probability measure when the future event is triggered by the crossing of a threshold.


NOMENCLATURE
δ_x(·)   Dirac delta distribution located at x.
1_A(·)   Indicator function of an arbitrary set A.

INTRODUCTION
Event prognostics is a cross-cutting problem in science and engineering, where the notion of an "event" depends on the specific application (Redner, 2001). However, the general framework consists of having a variable of interest whose dynamics may depend on various factors and sources of uncertainty, and where the occurrence of the event is declared once this variable crosses a threshold for the first time (Siegert, 1951). Thus, given an initial condition for this variable of interest, the prognostic problem is about determining at what time in the future the corresponding event would be triggered. Naturally, if the dynamics of this variable is subject to sources of uncertainty, the time of occurrence of the future event is a random variable, and its characterization necessarily requires calculating its probability distribution.
The most widely used method to prognosticate an event is the method of Monte Carlo simulations (Metropolis & Ulam, 1949). For applications where computational time is not an issue, this method is appropriate, since it guarantees stochastic convergence when the number of simulations tends to infinity or is "large enough" (a notion that depends on the particular application), although it remains computationally expensive. For that same reason, it would be advisable to use an alternative method in applications where computing time is a limited resource. However, even in those cases, Monte Carlo simulations are very relevant, since they establish the benchmark or "ground truth" against which the performance of other methods can be measured (Tamssaouet, Nguyen, Medjaher, & Orchard, 2021; Wei et al., 2021; Zhang, Xiong, He, & Pecht, 2019; Sreenuch, Alghassi, Perinpanayagam, & Xie, 2014; Le Son, Fouladirad, Barros, Levrat, & Iung, 2013; Zio & Peloni, 2011).
Monte Carlo simulations allow the approximation of expectations with arbitrary precision (which depends on the number of simulations: the more simulations, the higher the precision). Consequently, it is natural to ask what the actual analytical expression is for the expected value that is approximated when employing these Monte Carlo simulations in event prognostics. This analytical expression was recently reported in the literature (Acuña-Ureta, Orchard, & Wheeler, 2021), where the event prognostic problem is posed within a general framework, establishing the analytical form of the probability distribution of the occurrence time of a future event together with its corresponding mathematical demonstration.
Even the criterion for declaring events is generalized, allowing uncertainty to be incorporated into it. In other words, the crossing of a threshold does not necessarily declare an event; instead, the declaration may be described by an uncertain event likelihood function, attributing a probability that an event will occur or not given the condition of a system. Nonetheless, the relationship between the empirical approach based on Monte Carlo simulations, which is very commonly used, and the recently mentioned analytical expression for the probability distribution is not evident, especially considering the mathematical rigor with which the probability distribution was originally presented.
Understanding the relationship between the already standardized Monte Carlo simulations used to perform prognostics (or to validate prognostic algorithms) and the Theory of Uncertain Event Prognosis (Acuña-Ureta et al., 2021), which is the contribution of this article, is crucial for the advancement of research related to prognostics. Among the most essential reasons for acknowledging this relationship are the following:
1. A formal framework for event prognostics based on mathematics is recognized. This framework gives a theoretical foundation to the notion of uncertain hazard zones (Orchard & Vachtsevanos, 2009), which has been widely known for years but never formalized.
2. Notions of convergence in prognostic algorithms arise. This applies to model-based and data-driven approaches, since the framework is agnostic about how predictions are generated.
3. The formality of the Theory of Uncertain Event Prognosis enables the generation of objective standards for performance metrics of prognostic algorithms.
4. Enormous practical advantages have already been shown, such as dramatically speeding up stochastic convergence when computing the occurrence time of a future event and reducing the computational cost, as evidenced in (Acuña-Ureta & Orchard, 2022b), where computation time is reduced from the scale of hours, using standard Monte Carlo simulations, to the scale of milliseconds, by leveraging transformations from the Theory of Uncertain Event Prognosis.
The results in this article are meant to express the direct relationship between the Theory of Uncertain Event Prognosis and the conventional way in which prognostic results have been validated using Monte Carlo simulations; thus, no case study is provided here. To check the validity of these results with a simple and illustrative case study, please refer to (Acuña-Ureta & Orchard, 2022a), where it is shown that the Theory of Uncertain Event Prognosis leads to the same probability distribution for the first occurrence time of an event that can be obtained by performing Monte Carlo simulations.
Alternatively, for a more sophisticated case with a real application, readers are referred to (Acuña-Ureta & Orchard, 2022b).
This article is structured as follows. In Section 2, a brief review of the Theory of Uncertain Event Prognosis is made, where the analytical expression for the probability distribution of the occurrence time of a future event is shown. Section 3 presents the main contribution of this article, which consists in establishing a clear relationship between this analytical expression and the traditional empirical way in which the method of Monte Carlo simulations is used in event prognostics. Finally, conclusions are presented in Section 4.

THEORY OF UNCERTAIN EVENT PROGNOSIS
Before presenting the probability measure of the occurrence time of a future event (Acuña-Ureta et al., 2021), it is necessary to make some definitions:
1. A stochastic process {X_k}_{k∈N} depicting the random trajectory of a variable of interest (subject to uncertainty sources).
3. A threshold x, whose crossing by the variable of interest triggers the event occurrence.
With all these definitions, and denoting k_p as the present time, the first time at which the event E occurs can be defined as (Daigle & Goebel, 2013)

τ_E(k_p) = min{k ∈ N : k > k_p, X_k ≥ x}.    (1)

Since the threshold crossing (event occurrence) depends on {X_k}_{k∈N}, which is a stochastic process, τ_E is a random variable. How, then, do we calculate P(τ_E = ·)? The first step is to give meaning to the particular event to be predicted. For example, if it were a failure prognostic problem, we could do it as follows:

E = "System failure".    (2)

At each time k, E might either occur or not, with some probability. We can define a binary stochastic process {E_k}_{k∈N} such that, for each k ∈ N, E_k = e_k ∈ {E, E^c} and, therefore,

P(E_k = E | x_k) = 1 − P(E_k = E^c | x_k),    (3)

where E^c is the complement of E, and is thus associated with the non-occurrence of the event "System failure" in this case.
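The hard-threshold definition in Eq. (1) can be sketched in code. The following is a minimal illustration, where `first_crossing_time` is a hypothetical helper (not from the cited works) that scans a discrete trajectory for its first crossing of the threshold after the present time k_p:

```python
import numpy as np

def first_crossing_time(trajectory, threshold, k_p=0):
    """Return the first time k > k_p at which the trajectory reaches
    the threshold, or None if it never crosses within the horizon.
    Hypothetical helper implementing Eq. (1) for a sampled trajectory."""
    for k in range(k_p + 1, len(trajectory)):
        if trajectory[k] >= threshold:
            return k
    return None

# A deterministic toy trajectory: it first reaches 3.0 at index 4.
x = np.array([0.0, 1.0, 2.0, 2.5, 3.5, 4.0])
print(first_crossing_time(x, threshold=3.0))  # -> 4
```

Note that the minimum in Eq. (1) is taken over a realization of the stochastic process; applying this helper to many independent realizations yields samples of the random variable τ_E.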
We want to determine the first occurrence time of E, denoted as τ_E = τ_E(k_p), which can now be formally defined as

τ_E(k_p) = min{k ∈ N : k > k_p, E_k = E}.    (4)

Although this definition is similar to others in the literature (Daigle & Goebel, 2013), it does not make explicit the underlying probability distribution of τ_E. According to (Acuña-Ureta et al., 2021), this probability distribution is given by

P(τ_E = k) = ∫ P(E_k = E | x_k) ∏_{j=k_p+1}^{k−1} [1 − P(E_j = E | x_j)] p(x_{k_p+1:k}) dx_{k_p+1:k}.    (5)

In simple words, this expression states that the probability of the event occurring for the first time at a future time instant k, k > k_p, can be computed by averaging over all possible future trajectories x_{k_p+1:k}, evaluating on the one hand how likely it is that there is a system failure at time k, expressed through the term P(E_k = E | x_k), and on the other hand that it has not occurred before, expressed through the terms 1 − P(E_j = E | x_j).
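A numerical sketch of this averaging may help fix ideas. The model below is entirely hypothetical (a random walk with drift as the degradation process, and a sigmoid around the threshold as the uncertain event likelihood P(E_k = E | x_k)); it only illustrates the structure of Eq. (5), with the outer integral over p(x_{k_p+1:k}) replaced by an average over sampled trajectories:

```python
import numpy as np

rng = np.random.default_rng(0)

def event_likelihood(x, threshold=5.0, scale=0.5):
    """Hypothetical uncertain event likelihood P(E_k = E | x_k):
    a sigmoid around the threshold instead of a hard crossing."""
    return 1.0 / (1.0 + np.exp(-(x - threshold) / scale))

def prob_tau(trajectories, k):
    """Structure of Eq. (5), averaged over sampled trajectories:
    mean over i of  P(E_k = E | x_k(i)) * prod_{j<k} [1 - P(E_j = E | x_j(i))]."""
    probs = event_likelihood(trajectories)         # shape (N, horizon)
    survive = np.prod(1.0 - probs[:, :k], axis=1)  # event has not occurred before k
    return np.mean(probs[:, k] * survive)

# N i.i.d. realizations of a toy random walk with positive drift.
N, horizon = 5000, 30
steps = rng.normal(loc=0.3, scale=0.5, size=(N, horizon))
trajectories = np.cumsum(steps, axis=1)

pmf = np.array([prob_tau(trajectories, k) for k in range(horizon)])
print(pmf.sum())  # close to 1 when the horizon captures most of the mass
```

The probabilities sum to (at most) one over the prediction horizon; mass beyond the horizon corresponds to realizations for which the event has not yet been declared.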

PROBABILITY MEASURE APPROXIMATED BY MONTE CARLO SIMULATIONS
Given that Monte Carlo simulations are widely accepted as a standard method to compute the probability distribution of the first occurrence time of a future event, the following pedagogically illustrates how these simulations approximate the probability measure of the Theory of Uncertain Event Prognosis presented in Section 2, particularly in Eq. (5).
Starting from an initial health condition x_{k_p}, where k_p is the present time, we can simulate N ∈ N independent and identically distributed (i.i.d.) realizations of the stochastic process {X_k}_{k>k_p}. That is, each realization corresponds to a randomly generated sequence of values for the variable of interest as a function of time, as illustrated in Fig. 1. With these simulations we can adopt a frequentist approach and approximate P(τ_E = k) as the frequency with which these realizations hit the threshold x for the first time at time k. That is,

P(τ_E = k) ≈ (# of realizations that hit the threshold for the first time at time k) / (total amount of realizations simulated).    (6)

By the Law of Large Numbers, the approximation in Eq. (6) becomes an equality when N → +∞.
This approach is popularly known as the application of the method of Monte Carlo simulations to the event prognostic problem. It follows from the descriptive definition of τ_E; but what is the underlying analytic expression for the expected value that is being approximated by the method? The method was originally developed to approximate expectations, so we can immediately infer that it is approximating an integral.
The answer corresponds to a particular scenario of the probability distribution reported within the Theory of Uncertain Event Prognosis, shown in Eq. (5), as detailed next.
To approximate P(τ_E = k), it is required to simulate N i.i.d. realizations of {X_k}_{k>k_p}, which consists of drawing samples from the joint probability distribution p(x_{k_p+1:k}) (see Fig. 1 for an illustrative example of a single realization). Each realization corresponds to a trajectory followed by x_k over time. In other words, it is a sequence of possible future values of the variable of interest, expressing its evolution over time.
The i-th simulated trajectory can be denoted as

x_{k_p+1:k}(i) = {x_{k_p+1}(i), x_{k_p+2}(i), ..., x_k(i)},    (7)

where x_{k_p+1:k}(i) ∼ p(x_{k_p+1:k}), and the superscript i ∈ {1, 2, ..., N} denotes the specific realization of the stochastic process. Therefore, p(x_{k_p+1:k}) can be weakly approximated in mathematical terms as

p(x_{k_p+1:k}) ≈ (1/N) Σ_{i=1}^{N} δ_{x_{k_p+1:k}(i)}(x_{k_p+1:k}),    (8)

where δ_{x_{k_p+1:k}(i)}(x_{k_p+1:k}) is the Dirac delta located at x_{k_p+1:k}(i). By definition, this Dirac delta has two properties:
• δ_{x_{k_p+1:k}(i)}(x_{k_p+1:k}) = +∞ if x_{k_p+1:k} = x_{k_p+1:k}(i), and 0 otherwise;
• ∫ δ_{x_{k_p+1:k}(i)}(x_{k_p+1:k}) dx_{k_p+1:k} = 1.
This is why, due to the linearity property of integrals, we have

∫ f(x_{k_p+1:k}) p(x_{k_p+1:k}) dx_{k_p+1:k} ≈ (1/N) Σ_{i=1}^{N} f(x_{k_p+1:k}(i)),    (9)

for any suitable function f. It is important to note that Eq. (8) denotes a weak approximation; that is, the approximation is valid for the calculation of expected values.
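The practical meaning of this weak approximation can be verified numerically: the empirical measure does not approximate the density pointwise, but sample averages of any function f do approximate the corresponding integral. A minimal sketch, using a standard normal distribution and f(x) = x², whose exact expectation is 1:

```python
import numpy as np

rng = np.random.default_rng(2)

# Weak approximation: the empirical measure (1/N) sum_i delta_{x(i)}
# matches p(x) only through expectations,
#   E[f(X)] = int f(x) p(x) dx  ≈  (1/N) sum_i f(x(i)).
N = 200_000
samples = rng.standard_normal(N)   # x(i) ~ p(x) = N(0, 1)

f = lambda x: x**2                 # test function with known expectation E[X^2] = 1
estimate = np.mean(f(samples))     # Monte Carlo estimate per Eq. (9)
print(estimate)                    # close to the exact value 1.0
```

The standard error of this estimate shrinks as 1/sqrt(N), which is the usual Monte Carlo convergence rate.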
To count how many of those trajectories hit the threshold at time k, we can use the indicator function, which is defined as follows. Let A be an arbitrary set. The indicator function of A is

1_A(x) = 1 if x ∈ A, and 0 otherwise.

With it, the numerator of Eq. (6), i.e., the number of realizations that hit the threshold for the first time at time k, can be written as a sum of indicator evaluations over the N simulated realizations.
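In other words, the frequency in Eq. (6) is exactly a sample average of indicator functions, which is what connects the counting recipe to the expectation form of Eq. (9). A toy illustration with hypothetical first-passage times:

```python
import numpy as np

# Counting trajectories and averaging indicators are the same computation:
#   (# realizations with tau_E = k) / N  ==  (1/N) sum_i 1_{tau_E(i) = k}.
taus = np.array([3, 5, 3, 7, 3, 5])   # toy first-passage times (hypothetical)
k = 3

count_estimate = np.sum(taus == k) / len(taus)           # count, then divide by N
indicator_estimate = np.mean((taus == k).astype(float))  # sample mean of indicators
print(count_estimate, indicator_estimate)                # both equal 0.5
```

This identity is what allows Eq. (6) to be read as a Monte Carlo approximation of an expectation, and hence of the integral in Eq. (5).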