Assessment of Overhaul Effectiveness and Usage-based Inference using Bayesian Networks

An assessment of the effectiveness of existing overhaul practices determined that the historical usage of assets provides valuable contextual information. Usage data is typically highly reliable, but not in legacy fleets, which feature older vehicles with missing, incomplete, inconsistent, and contradictory data. This paper describes two methods for usage estimation from noisy data by exploiting two data sources: 1) unreliable, manually-entered usage data and 2) part replacements. The first method employs a probabilistic model to reconcile missing and inconsistent data entries; the second is based on the replacement of consumable components. The probabilistic model, fully and uniquely specified by the probabilistic variables (with their distributions) and deterministic variables, is validated using synthetic datasets because ground truth for the field data does not exist. The disproportionate impact of an incorrect initial data point is mitigated by training the model in both forward and reverse directions. The motivating hypothesis for usage estimation from part replacements is the plausible assumption that specific consumables, e.g. brake pads, have reasonably repeatable replacement patterns which can be related to usage. For many vehicles, the mean time between failures of a component was longer than the average data collection time span. However, for assets with sufficiently long data records, the cumulative replacements of components are well correlated with the probabilistic usage estimates, providing additional reinforcement for the inference.

tional failure of assets, i.e. RCM identifies the maintenance needs for CBM (Nowlan & Heap, 1978; Moubray, 1997). CBM is the application and integration of selected processes, technologies, and knowledge-based capabilities to improve the reliability and maintenance effectiveness of systems and components (US-AMRDEC, 2016). Being based upon objective evidence of equipment degradation or impending failure, CBM has significant economic and safety benefits; it reduces the incidence of unscheduled failures and downtime, and the occurrence of unnecessary or early scheduled maintenance.
Health or condition monitoring is the process of collecting asset data and extracting the information for CBM. Affordable sensors, data storage, and networking enable comprehensive monitoring of all types of assets. To make this data actionable for CBM, specific models are necessary to identify and characterize anomalies and to relate the anomalous patterns to forward-looking failure risk for decision-making purposes (Engel, Gilmartin, Bongort, & Hess, 2000; Goebel et al., 2017). The models are typically classified as expert-system, physics-based, data-driven, and hybrid (Vachtsevanos, Lewis, Roemer, Hess, & Wu, 2006).
Health monitoring is often an incremental process, as data is typically not available to develop comprehensive diagnostic and prognostic algorithms from the outset. Instead, the levels of Prognostic Health Management (PHM) capability grow over time (Sikorska, Hodkiewicz, & Ma, 2011; Bussey, Nenadic, Ardis, & Thurston, 2014). Consolidation of vehicle fleet data in a data warehouse provides an opportunity to develop PHM knowledge and algorithms incrementally. The current study stems from an initiative to reevaluate the effectiveness of some of the existing maintenance practices. A specific objective of high importance was to assess the effectiveness of periodic overhauls based on existing data. The data came from a family of military tracked vehicles. The approach was pragmatic, based upon the available data and available domain knowledge.
Knowledge of assets' usage has the potential to improve the assessment. However, because the usage data was unreliable (it contained inconsistent and often missing values), an inference model was needed. A probabilistic network model was selected because networks of Bayesian probability calculations are an effective way to impute missing data and improve statistical models of time-series data. Multiple imputation is an effective way to fill in missing data while maintaining a degree of uncertainty (Honaker & King, 2010). Parameter estimation in statistical models can be used to build a belief network for correcting imperfect observations in data mining problems (Denoeux, 2013). Probability was first interpreted as a natural extension of formal logic in the 1940s (Cox, 1946, 1961). This approach was later extended and popularized by Jaynes (2003) and others (Sivia, 1996; Gregory, 2005).
Historically, the main obstacle to probabilistic models was their computational requirements. The Markov chain Monte Carlo (MCMC) method, which also originated in the 1940s (Metropolis & Ulam, 1949), has become a standard tool for this computation. Probabilistic (graphical) models had notable successes in the sixties and seventies, but then fell out of favor relative to other AI approaches, e.g. fuzzy logic (Zadeh, 1965, 1983) and Dempster-Shafer evidence theory (Shafer et al., 1976). The renaissance of probabilistic modelling started in the late 1980s with theoretical development (Pearl, 1988), but computational power did not allow the explosion of solutions that we have witnessed in the very recent past. The graphical approaches allow domain experts to take advantage of the framework. Many new applications have emerged, including medical diagnostics, analysis of genetic and genomic data, speech recognition, natural language processing, analysis of market data, and fault diagnosis, which can be extended to reliability. The potential of probability networks was recognized in the 1980s (Pearl, 1987; Geman & Geman, 1984). More recently, excellent tutorials (e.g. Andrieu, De Freitas, Doucet, & Jordan, 2003; Diaconis, 2009) and books (e.g. Jordan, 1998; Bishop, 2006; Barber, 2012; Koller & Friedman, 2009; Gelman, Carlin, Stern, & Rubin, 2014; Theodoridis, 2015) have become available.

Approach based on the available data
For the fleet of interest, the maintenance history data was available, but the operational history was almost entirely lacking. While knowledge of the operational history had the potential to greatly improve the decisions, the decisions had to be made even in the absence of important data.
A way to demonstrate the effectiveness of previous vehicle overhauls using maintenance data alone was to visualize the cost of maintenance over time. Figure 1 shows a hypothesized maintenance cost $C_m$ as a function of time $t$, where the cost of maintenance was computed as the sum of the cost of parts $C_{parts}$ and the cost of labor $C_{labor}$, $C_m = C_{parts} + C_{labor}$, and the cost of labor is the product of the time associated with active repair $t_{labor}$, expressed in hours (labor hours), and the hourly rate $r_{labor}$, $C_{labor} = t_{labor} \, r_{labor}$. The tacit assumption in this figure was that the usage and the general operating and environmental conditions did not change appreciably over the time associated with an overhaul cycle. In practice, this assumption had to be verified, as shown in Section 3. Within the existing database, parts cost was generally available. However, the number of parts replaced was sometimes misleading and the recorded labor hours were often missing, requiring hand cleaning of the data for analysis.
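As a minimal illustration of this cost definition, the sketch below computes the maintenance cost of a single order from its parts cost, labor hours, and hourly rate. The function name and the example figures are hypothetical and are not taken from the fleet database.

```python
def maintenance_cost(parts_cost, labor_hours, hourly_rate):
    """Maintenance cost = cost of parts + (labor hours * hourly rate)."""
    labor_cost = labor_hours * hourly_rate
    return parts_cost + labor_cost


# Hypothetical order: 1200.00 in parts, 6.5 labor hours at 40.00 per hour.
example_cost = maintenance_cost(1200.0, 6.5, 40.0)
```

Orders with missing labor hours would need to be cleaned or excluded before such a function is applied.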

Maintenance Application Overhaul
The motivation for this study was to reduce the cost of overhauls by choosing more appropriate, condition-based maintenance times across the fleet. Figure 2 shows maintenance and overhaul activity on a fleet of more than four hundred vehicles. The solid (dark blue) line shows overhaul dates; the faded orange background signifies the time interval with recorded maintenance history for the overhauled vehicle and is populated with dark-orange dots that denote maintenance events; and the dashed grey line marks six months after the overhauls. Note that there are generally few maintenance events in the six-month interval after overhaul.
To assess the effectiveness of current overhauls, data was arranged so that all vehicles' overhaul dates were aligned and cost records before and after could be compared. The purpose of this visualization is to compare cost, availability, and number of repairs relative to overhaul, across all vehicles. Cost and maintenance histories were analyzed by sub-system (i.e. a collection of components categorized by their collective purpose). Components that were hypothesized to have replacement rates most influenced by usage, such as road wheels, were analyzed individually (Section 3.2).
A cost analysis was done for the 425 vehicles that have been overhauled since 2007, and data was grouped by 90-day intervals, as shown in Figure 3. From top to bottom, the figure shows a) total cost, b) available number of vehicles in the sample population, c) cost per available vehicle, and d) number of maintenance orders with non-zero cost. Grey and orange shading were used to emphasize the comparison of activity before and after the overhaul dateline.
To avoid the effects of price volatility over time, all costs shown in this plot were based on the 2014 unit prices, using information on current parts. All vehicles in the sample contained the overhaul event. However, fewer vehicles had observation time intervals extending long before and long after the overhaul event. To account for the number of vehicles available at a given time away from the overhaul event, the cost per vehicle is shown in Figure 3c.
Figure 3b shows the availability of vehicles. The available number of vehicles was computed by taking the earliest and latest maintenance date for each vehicle and counting the number of vehicles available in the days relative to overhauls. The earliest maintenance dates were used to count the number of available vehicles before overhaul and are represented by the grey shade. The latest maintenance dates were used to count available vehicles after overhaul; these were represented initially by the blue line but later revised to the orange shade for the following reason. A total of 132 vehicles did not have maintenance records after their overhaul, hence the gap in Figure 3b between the end of the grey shade before overhaul and the start of the blue line after overhaul. The lack of vehicle data after overhaul was also apparent in the top right region of Figure 2, where orange data points were sparse. For Figure 3b, the count was revised from the difference between overhauls and the latest maintenance events to the difference between overhauls and the time of analysis, since some vehicles had no maintenance events after their overhaul date and before this study.
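The alignment and normalization described above can be sketched as follows. The sketch assumes each maintenance event has already been expressed in days relative to its vehicle's overhaul date, and that each vehicle has an observation window (first and last observed day, also relative to overhaul). The data layout, names, and numbers are illustrative assumptions and do not reflect the actual database schema.

```python
from collections import defaultdict

BIN_DAYS = 90  # interval width used for grouping


def cost_per_available_vehicle(events, windows, bin_days=BIN_DAYS):
    """events: iterable of (vehicle_id, days_from_overhaul, cost).
    windows: dict vehicle_id -> (first_day, last_day) relative to overhaul.
    Returns dict bin_index -> (total_cost, available_vehicles, cost_per_vehicle).
    """
    totals = defaultdict(float)
    for _, day, cost in events:
        # Floor division keeps pre-overhaul (negative) days in their own bins.
        totals[day // bin_days] += cost

    result = {}
    for b, total in sorted(totals.items()):
        start, end = b * bin_days, (b + 1) * bin_days - 1
        # A vehicle is available in a bin if its window overlaps the bin.
        available = sum(
            1 for first, last in windows.values() if first <= end and last >= start
        )
        per_vehicle = total / available if available else float("nan")
        result[b] = (total, available, per_vehicle)
    return result


events = [("A", -100, 500.0), ("A", 30, 100.0), ("B", 45, 300.0), ("B", 200, 50.0)]
windows = {"A": (-120, 60), "B": (-10, 400)}
binned = cost_per_available_vehicle(events, windows)
```

In this toy example the first post-overhaul bin (index 0) contains both vehicles, so its total cost is divided by two. Using the time of analysis as the end of each window corresponds to the revised availability count.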
Figure 3a and Figure 3c show reduced maintenance cost for a period of time after overhaul. However, as maintenance moves away from time-based scheduling, it needs to consider asset usage. The remainder of the document is concerned with usage estimation.

Best case scenario
The objective of this study was to obtain inferential estimates of usage based on noisy raw data. This study supports a broader goal of improving the overhaul interval by utilizing condition and usage information, which requires accurate usage estimates. Clearly, maintenance needs depend on the context, defined by the operating and environmental conditions a vehicle is subjected to. Thus, a careful analysis of overhaul effectiveness for a given vehicle must consider the context of use; however, this type of data was not recorded. Instead, the consideration of vehicles collectively has the potential to reveal the impact of overhaul on continuous vehicle maintenance.
Figure 4 shows factors that contribute to the state of health of a vehicle. Two main classes of influence are identified as the operational history and the maintenance history.
Operational history is determined by usage, operating conditions, and environmental conditions. The usage is typically measured in miles, engine hours, or both. Operating conditions include in-use/storage patterns, braking, sudden accelerations, engine and vehicle speed distributions, and other similar parameters. Environmental conditions include terrain, humidity, exposure to salt water, dust, etc. The dominant approach for capturing usage data is the inclusion of a health and usage monitoring system (HUMS), defined as a system of sensors, processes, and algorithms for prognostics, on the vehicle platform (Heine & Barker, 2007). HUMS applications have been deployed only on the most expensive vehicles, such as fixed-wing aircraft (Trammel, Vossler, & Feldmann, 1997) and rotorcraft (Gordon, 1991; Ellerbrock, Shanthakumaran, & Halmos, 1999). Ground vehicles are generally not equipped with HUMS because HUMS adds cost and complexity (with a potential decrease in reliability), but it has been explored in research and development, e.g. (Heine & Barker, 2007; Rabeno & Bounds, 2009; Das, Hall, Patel, McNamara, & Todd, 2012). The older tracked vehicles considered in this study were not equipped with HUMS. Maintenance history includes current issues, past repairs, and cost of maintenance over time.

USAGE ESTIMATION
Of the operational history data described in Figure 4, only usage data existed in the available data set. The maintenance history database contained a field called usage, and for the fleet studied this value is supposed to represent cumulative engine hours (as read from the engine hour meter) or total miles. The units of recorded usage were unclear at first. Because of the noisy nature of the data, the values could have represented either hours driven or miles traveled. Either unit should have non-decreasing values over time, but the volatility of the observations made it difficult to determine which was more plausible. Furthermore, observations were far enough apart in time that logical reasoning was not enough to distinguish the units (e.g. usage changing by five in one hour would imply miles, not hours). Further investigation of ordnance vehicle monthly and daily log-books strongly suggested that the units of usage were hours. Of the 32 vehicle log-books provided to us, several held records that precisely matched those in the database for the vehicle, under units of hours. However, usage data had many missing, inconsistent, and contradictory values. An illustration of this situation is shown in Figure 5: the usage $u_i$ should be monotonically non-decreasing, except for the rare event of a reset (e.g. after an engine replacement). However, the recorded usage data was non-monotonic, with only the trend appearing to adhere to the expected behavior.
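The non-decreasing expectation gives a simple consistency check on a recorded usage series. The sketch below flags readings that are lower than the previous recorded reading, with missing entries represented as `None`. The function name and the sample readings are hypothetical.

```python
def monotonicity_violations(readings):
    """Return indices whose reading is lower than the previous recorded one.

    Missing readings are given as None and are skipped.
    """
    violations = []
    previous = None
    for index, value in enumerate(readings):
        if value is None:
            continue
        if previous is not None and value < previous:
            violations.append(index)
        previous = value
    return violations


# Hypothetical usage record (hours) with a gap, a dip, and an implausible spike.
readings = [100, 150, None, 140, 300, 12345, 320]
flagged = monotonicity_violations(readings)
```

A flag only marks a contradiction between two neighboring readings. It does not say which of the two is wrong: in the sample above, index 6 is flagged although the spike at index 5 is the more likely error, which is why a probabilistic reconciliation is needed rather than a simple filter.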
Two methods for estimating actual usage were considered: 1) development of a probabilistic model to reconcile the missing and inconsistent values, and 2) estimation of usage based on replacement of consumable parts. The two approaches are described in turn.

Probabilistic Model for Usage
MCMC is a sampling method that draws correlated random samples from the probability distribution of the unknown parameters, conditioned on the measured data. It explores a high-dimensional parameter space in which certain positions (each representing a set of values for the variables being estimated) are more probable than others. To search this space, MCMC generates each new position from the current one and tends to move toward, and remain in, regions of higher probability density; some samplers use the gradient of the log-density to guide these steps. The chain requires an initial guess to start, and this guess is part of the conditioning.
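The sampling idea can be illustrated with a random-walk Metropolis sampler, one of the simplest MCMC variants. This is a textbook algorithm swapped in for brevity; it is not the sampler used for the results in this paper, and the target density, step size, and measurement value below are illustrative assumptions.

```python
import math
import random


def metropolis(log_density, start, step, n_draws, rng):
    """Random-walk Metropolis: returns a list of correlated draws."""
    current = start
    current_lp = log_density(current)
    draws = []
    for _ in range(n_draws):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws


def log_posterior(u, measured=100.0, sigma=5.0):
    """Assumed toy target: one noisy reading of a non-negative usage value."""
    if u < 0:
        return float("-inf")
    return -0.5 * ((measured - u) / sigma) ** 2


rng = random.Random(0)
draws = metropolis(log_posterior, start=80.0, step=5.0, n_draws=5000, rng=rng)
kept = draws[1000:]  # discard the early part of the chain
posterior_mean = sum(kept) / len(kept)
```

The chain starts from a deliberately poor initial guess (80) and drifts toward the region of high probability near the measurement, which is why the early draws are discarded before averaging.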

First-Order Implementation
Figure 6 shows the simple probabilistic model used here: $u_i$ and $u_{i+1}$ are successive actual usages that are unknown, while $M_i$ and $M_{i+1}$ are the corresponding recorded measurements. The observations are shaded, following the convention of Bishop (2006). The model is constructed as in Eq. (3), so that the actual usage $u_i$, distorted by noise $n_i$, produces the measurement $M_i$. The noise, measured in hours, was modeled as uniform between 0 and $t_{max}$, i.e. $n_i \sim \mathcal{U}(0, t_{max})$, with $t_{max}$ set to 50 to accommodate the high errors encountered in the data. The only assumption used to model the unknown amount of usage between two measurements was that usage is monotonically non-decreasing. The usages were modeled as deterministic variables, $u_{i+1} = u_i + b_i \Delta t_i$ (Eq. 5), where $\Delta t_i$ was the known time interval between two samples and $b_i$ was a positive usage factor. It was important to keep these two factors separate rather than lumping them into a single parameter. As two parameters, they related better to the basic domain knowledge, allowing time to be expressed in days, which improved the model's overall interpretability. In practice the usage was not recorded daily, and the usage factor $b_i$, being a small fraction of a day, was modeled with the Beta distribution, $p(b_i \mid \alpha, \beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} b_i^{\alpha-1} (1-b_i)^{\beta-1}$, where $\Gamma(\cdot)$ is the gamma function. The parameters $\alpha$ and $\beta$ were empirically set to 1 and 4, respectively, to assign greater likelihood to values closer to 0, as shown in Figure 7.
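To make the role of the usage factor concrete, the sketch below draws one usage trajectory from the prior alone, with no measurements involved. It assumes that usage advances by the usage factor multiplied by the elapsed time, that elapsed time is given in days, and that a factor of 24 converts the fraction of a day into hours. The conversion factor, the function name, and the sample intervals are assumptions made for illustration and are not taken from the Stan listing.

```python
import random

ALPHA, BETA = 1.0, 4.0  # Beta parameters quoted in the text


def sample_usage_path(u_start, dt_days, rng, alpha=ALPHA, beta=BETA):
    """Draw one non-decreasing usage path (in hours) from the prior."""
    path = [u_start]
    for dt in dt_days:
        b = rng.betavariate(alpha, beta)  # usage factor in [0, 1]
        # Assumed unit conversion: fraction of a day * 24 hours * elapsed days.
        path.append(path[-1] + b * 24.0 * dt)
    return path


rng = random.Random(1)
intervals = [30, 45, 10, 90]  # hypothetical days between maintenance records
path = sample_usage_path(500.0, intervals, rng)
prior_mean_factor = ALPHA / (ALPHA + BETA)  # mean of Beta(1, 4) is 0.2
```

Because the usage factor is confined to the unit interval, every sampled path is non-decreasing and can never advance by more than 24 hours per elapsed day, which is the physical constraint the Beta prior is meant to encode.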
Because of the recursive nature of Eq. 5, the first usage data point had to be modeled separately. The separate treatment of the first usage point gave rise to undesirable behavior in cases where the first point contained an unusually large error. The approach to fully address this problem is further explained in Section 3.1.2. However, because the probabilistic model aimed to maximize the likelihood of the entire fit, it became less dependent on the accuracy of the initial estimate as the number of samples increased. In addition, in cases where physically impossible outliers were encountered (i.e. the recorded usage implied that the vehicle was used more than 24 hours per day), the noise was allowed to sample beyond the (0, 50) limit (refer to lines 30-32 in the listing provided in the Appendix).
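The physical-impossibility test can be written as a one-line check. The sketch below assumes usage is recorded in hours and the interval between records is known in days; the function name and example values are hypothetical, and the exact form of the condition in the Appendix listing may differ.

```python
HOURS_PER_DAY = 24.0


def exceeds_physical_limit(m_prev, m_next, dt_days):
    """True if the recorded increase implies more than 24 hours of use per day."""
    return (m_next - m_prev) > HOURS_PER_DAY * dt_days


# Hypothetical readings taken 30 days apart.
plausible = exceeds_physical_limit(100.0, 160.0, 30)    # 60 h in 30 days
impossible = exceeds_physical_limit(100.0, 99999.0, 30)  # far beyond 720 h
```

Readings that trip this check are the ones for which a wider noise range is appropriate, so that a single bad entry does not drag the whole fit.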
The results of an estimation using the simple model implemented in Stan are shown in Figure 8. The orange line traces the means of the estimated distributions of usage $\hat{u}_i$. After running the program, each measured data point had a probability density function (PDF) assigned to it, built by the model. Every value in the PDF represented one of the accepted samples in the Markov chain. The mean of these PDFs was chosen to represent the estimate because it denotes the average value of all the samples. The mode was considered but not used, because the value that the stepping method hit most often may not have accurately represented the best estimate at that point. This was one example of how the thinning and burn-in sampling parameters were useful. Thinning reduced samples where the stepping method got stuck, and burn-in removed early samples that may not have contributed to the final convergence.
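Burn-in and thinning amount to simple slicing of the stored chain before the mean is taken. The helper below is a minimal sketch with a hypothetical name and a toy chain; it is not part of the Stan workflow, where these settings are passed to the sampler itself.

```python
def posterior_mean(draws, burn_in=0, thin=1):
    """Mean of the chain after dropping burn-in draws and keeping every thin-th draw."""
    kept = draws[burn_in::thin]
    if not kept:
        raise ValueError("no draws left after burn-in and thinning")
    return sum(kept) / len(kept)


# Toy chain: the first two draws are still stuck at the starting value.
chain = [0.0, 0.0, 10.0, 12.0, 14.0, 16.0]
estimate = posterior_mean(chain, burn_in=2, thin=2)  # keeps 10.0 and 14.0
```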
While this simple form of a probabilistic network seems particularly well suited to the problem at hand (featuring relatively few data points with potentially large errors), based on our literature search we believe that it has not been used for usage estimation before. In addition, the assignment of the probability distributions, which is critical to model performance, was described in detail. Like other similar approaches (Bishop (2006) shows the connections between Kalman filters, HMMs, and other probabilistic networks), it returns a full probability distribution for the data, the posterior. This allows for a much fuller understanding of the uncertainty of the results. A script for the Stan implementation is provided in the Appendix.

Initial Value Dependence and Looping
The choice of a Beta distribution for the slope of the model described in the previous section imposed limits on the cumulative usage, making it non-decreasing by design. One limitation of this approach was that the starting point had a stronger impact than the subsequent measurements. It had no preceding data values to step up from, so it had to be estimated some other way: it was estimated using a normal distribution with a user-supplied standard deviation. This distribution could pick a value close to the data point or allow for variation when a bad initial value was recorded; however, the final fit was always biased by this initial estimate. When the first measured data point was greater than the successive points, the overall fit tended to sit atop the values near that data point. A good starting estimate led to a good overall fit. The quality of fit was judged against what was known. The only knowledge of this data was that it had to be monotonically non-decreasing to make physical sense and that not all of the data was bad. If the estimates followed the monotonic rule and lay close to the plausible measured data, the fit was deemed valid. A reasonable data point was one that fell in line with the surrounding data in an increasing fashion. When the first data point was reasonable, the fit was reasonable. When the first data point was unreasonable, the fit was considerably less reasonable near that data point.
Figure 9 shows two fits to real usage data taken from the vehicle database. The forward fit (orange line) is the average Bayesian fit evaluated from the first data point to the last, and the reverse fit (red dashed line) is the average Bayesian fit evaluated from the last data point to the first. Because the first data point was noisy but the last was not, the reverse fit started from a reasonable value and was the more plausible estimate overall.
Figure 9. Forward and reversed model operated on the same data.
It was hypothesized that the bias of the initial point would affect the final fit less if the model was run many times forward and in reverse through the data and the results were averaged. The process was as follows. The first run estimated the first point with a normal approximation and fit the rest of the data. Then, taking the average value of the last data point's estimate, it started the process over again in reverse, this time starting with the endpoint. Once the reverse-direction fit was computed, the program averaged the last point estimated (which was the starting point of the first run) and continued again in the forward direction, this time with a different starting point than in the first run. Only two parts of the code needed to change to accommodate this: the deterministic expression shown in Eq. (5), which in the reverse direction was changed to $u_i = u_{i+1} - b_i \Delta t_i$, and a feature to read the previous run's last estimate to start the next run. A new piece of data (a boolean) that declared the direction of fit was added as well.
The fit behaved better where surrounding data existed than at the initial value, and if the data were evaluated in the reverse direction (end to beginning), the estimate of the first value was more reasonable than the normal approximation described above. This was because the reverse direction would end on that first value, and if it were an unreasonable value the preceding points would guide the fit better than if it started at that value. The idea was that this pattern would continue as more loops were made forward and backward. The looping method did not show any significant decrease in bias; however, averaging the estimates of a single forward pass and a single (independent) reverse pass did decrease the bias caused by the first point. It is important to note that, for this final method, the reverse fit was evaluated independently of the forward fit, unlike in the looping procedure.
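The final averaging step can be sketched as below, assuming each pass has already produced one posterior-mean estimate per data point and that the reverse pass reports its estimates in the order it visited them (last data point first). The function name and the numbers are hypothetical.

```python
def average_passes(forward, reverse_pass):
    """Average independent forward and reverse estimates point by point.

    forward: estimates in chronological order.
    reverse_pass: estimates in the order the reverse pass produced them
    (last data point first), so they are flipped before averaging.
    """
    if len(forward) != len(reverse_pass):
        raise ValueError("both passes must cover the same data points")
    reverse_in_time_order = list(reversed(reverse_pass))
    return [(f + r) / 2.0 for f, r in zip(forward, reverse_in_time_order)]


forward_means = [10.0, 20.0, 30.0]
reverse_means = [34.0, 22.0, 6.0]  # last point first
combined = average_passes(forward_means, reverse_means)
```

Each pass is most biased near its own starting point, so the average tempers the forward pass at the beginning of the record and the reverse pass at the end.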

Testing with Synthetic Data
To further test the effectiveness of the overall fit, synthetic data was created so that true values would be known for a visual comparison. The objective of this estimator was ultimately to show the most plausible interpretation of noisy measurements given the prior knowledge of the data that we had. Four cases were chosen to represent typical scenarios in the data and are shown in Figure 10. The first test case, Figure 10a, was intended as a control to see how the fit handled perfectly reasonable data. The crosses mark the true (synthetic) data. The dots mark the measured (observed) data. In this case the true data matched the measured data, and so the final estimates showed much sharper peaks in their distributions. This verified that the model performed accurately in the most ideal case of measurements.
The second case, Figure 10b, included some noise added to the measurements. Notice here that some of the dots are not aligned with the crosses. The orange line, representing the mean trace of the posterior distributions, was almost identical to the first case. This showed that small instances of noise could be handled as if there were no noise.
The next case, Figure 10c, introduced larger noise to the true usage. Some measured points were above the true values and some were below, just as in the second case. In this case the mean trace was still nearly identical to the first case; however, the uncertainty in the individual distributions widened for the noisier observations. The fourth and final case (Figure 10d), which tested the conditional freeing of the noise variable when obvious outliers were detected, shows how the estimator handled two extreme outliers. It was not uncommon to see values like 12,345 and 99,999 in the data. Notice that in this final and extreme case the mean trace still did not move appreciably from the first case. The extreme outliers were essentially ignored, since such a large noise value was assigned to those estimates, and the rest of the data fit just as in the first case.
In practice, all of these types of scenarios were prevalent in some combination in the measured data. Focusing on the mean trace across all these test cases shows the consistency of the model over different occurrences of noise.
When analyzing these cases, this was the most important factor. The consistency of the fit among different noise cases indicated the consistency of the model. The model consisted of basic prior domain knowledge of how the data should behave, and an understanding of when values are too unrealistic to be modeled like the rest.
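A set of test cases of this kind can be generated as sketched below. The true usage series, the noise magnitudes, the symmetric uniform noise, and the positions of the outliers are all assumptions chosen for illustration; only the outlier values 12,345 and 99,999 are taken from the text.

```python
import random


def make_cases(true_usage, rng, small=5.0, large=40.0,
               outliers=((2, 12345.0), (5, 99999.0))):
    """Build four synthetic measurement series from one true usage series."""
    case1 = list(true_usage)  # measured equals true
    case2 = [u + rng.uniform(-small, small) for u in true_usage]  # small noise
    case3 = [u + rng.uniform(-large, large) for u in true_usage]  # large noise
    case4 = list(true_usage)  # extreme outliers at assumed positions
    for index, value in outliers:
        case4[index] = value
    return {"case1": case1, "case2": case2, "case3": case3, "case4": case4}


# Hypothetical true cumulative usage in hours.
true_usage = [0.0, 40.0, 90.0, 150.0, 200.0, 260.0, 300.0, 370.0]
cases = make_cases(true_usage, random.Random(42))
```

Because the true series is known, each fitted mean trace can be compared directly against it, which is not possible with the field data.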

Replacements of Consumables
To supplement usage estimation, replacements of the consumable parts were analyzed. Any ground vehicle has a set of parts that degrade as a function of usage: tires, brakes, etc. In military tracked vehicles, some examples of consumable parts include sprockets, road wheels, shock absorbers, track adjusters, support rollers, and track idler assemblies. A vehicle was expected to have replacement rates proportional to usage rates. Figure 11 shows a hypothesized relationship between the number of replaced parts and the usage increments associated with maintenance intervals.
Vehicle cost and maintenance histories were analyzed by sub-system (i.e. a collection of components categorized by their collective purpose). Components with known use-based degradation, such as road wheels, were analyzed individually. Differences in replacement frequency and failure rate among vehicles were used to test the correlation between a consumable component's maintenance and the vehicle's usage.
Maintenance orders record the part replaced as well as the number of parts replaced. Figure 12 shows the cumulative replacement of road wheels over time and the estimated cumulative usage, using the model described in Section 3. The inset plot on the top axis in Figure 12 shows the change in usage between maintenance events against the change in number of replacements.
The data for the number of replaced parts was inconsistent over the fleet. The majority of data for these vehicles contained few maintenance records. The average usage between maintenance for the road wheel component example in Figure 12 was approximately 230 hours. The average usage range for the part of the fleet that was analyzed (1,247 vehicles) was approximately 223 hours.
Since the fleet average was less than the road wheel average, it was expected that the vehicles in this analysis contained data with no maintenance events during the usage period. This is illustrated in Figure 13. The histogram shows the number of road wheel replacements across the fleet: 85% of vehicles did not have any road wheel replacement records in the maintenance data, ∼9% of the vehicles had 1-2 wheel replacements, etc. The maintenance history of most vehicles was insufficiently long to capture the usage patterns.
Of the several consumable components, road wheels had the most replacement data. It was expected, however, that there would be few replacements over the usage history of the vehicle. The jagged edges of the usage estimation line in Figure 12 indicate where usage records existed, because usage was estimated with a probabilistic model that followed a piece-wise linear approximation between all usage events. If there were replacements at each of those times, the vehicle would have been under constant repair, indicating a larger underlying problem with the vehicle. This is why the change in usage did not correlate well with the change in number of replacements.
The inset plot on the bottom axis of Figure 12 shows cumulative usage against cumulative replacements. Here, the correlation coefficient was 0.95, indicating that these values were highly correlated. Overall, usage was indicated by the number of replacements when long-term accumulation of data was observed.
Figure 14 shows a count of the computed Pearson correlation coefficients for all vehicles that had usage records and replacements of road wheels. The average of the correlation between change in usage and change in replacements, $\rho_\Delta$, was −0.07. The average of the correlation between cumulative usage and cumulative replacements, $\rho_\Sigma$, was 0.87.
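The contrast between the two correlation measures can be reproduced on a small made-up series. The sketch below computes the Pearson coefficient for the per-interval changes and for the running totals; the numbers are invented for illustration and are not fleet data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def cumulative(values):
    """Running total of a sequence."""
    total, out = 0, []
    for v in values:
        total += v
        out.append(total)
    return out


# Hypothetical per-interval usage (hours) and road wheel replacements.
usage_increments = [200, 250, 220, 240, 210, 260]
replacements = [2, 0, 3, 1, 0, 2]

rho_delta = pearson(usage_increments, replacements)
rho_sigma = pearson(cumulative(usage_increments), cumulative(replacements))
```

Both running totals can only grow, so they correlate strongly even when the interval-to-interval changes do not. A high cumulative coefficient is therefore weaker evidence of a usage relationship than the same value would be for the changes.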
Replacement data are more important to the maintainer than usage, and so usage values were more likely to be entered incorrectly. Replacements occurred during maintenance events, but usage readings were based on logging from the vehicle operators, and so replacements were expected to be entered with greater care.
The vehicle shown in Figure 12 used the mean usage estimation from our probabilistic model in comparison to the long-term accumulation of replacements of the road wheel component. This was done for other vehicles as well. In Figure 12, matching usage and number of replacements in time is shown by dashed lines connecting the values between the axes. These pairs were how the inset plots were generated in that figure.
Figure 14. Pearson correlation coefficients for all vehicles that had usage records and replacements of road wheels.
Figure 15 shows cumulative (Σ) usage versus cumulative number of replacements for fourteen vehicles. The slopes of these lines did not agree very well. In principle, one could use replacements of consumables to support usage estimation. There are several possible reasons for such variation in slope in this case. The usage estimation may not have been accurate for some vehicles because of bad data from the start. The replacement data may have been incorrect as well (this seems likely for the bottom-right line, which reaches 80 replacements after around 100 hours of use). The operating conditions of different vehicles may also vary greatly depending on their location. Referring back to Figure 4, unless the circumstances of usage are known, it is hard to relate usage to replacements. Suppose these vehicles operated under different location-based conditions that were known to the user. The differences in slopes could then be attributed to the different conditions. If a set of vehicles were pre-classified by the relation between usage and replacements, then replacement data for new vehicles known to be part of a specific group could aid in usage estimation. Using the slope for that class of vehicle would support likelihood estimates in the form of prior knowledge in the Bayesian usage estimator.
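One hedged way to turn a class of vehicles into such prior knowledge is to fit a single slope (hours of usage per replacement) through the origin and use it as a first guess of usage for a new vehicle in that class. This is only a sketch of the idea suggested above; it was not carried out in this study, and the class data and names are hypothetical.

```python
def slope_through_origin(replacements, usage_hours):
    """Least-squares slope of usage versus replacements for a line through the origin."""
    numerator = sum(r * u for r, u in zip(replacements, usage_hours))
    denominator = sum(r * r for r in replacements)
    return numerator / denominator


# Hypothetical cumulative records pooled from one pre-classified group.
class_replacements = [2, 5, 6, 8]
class_usage_hours = [450.0, 1100.0, 1350.0, 1800.0]

hours_per_replacement = slope_through_origin(class_replacements, class_usage_hours)
# Rough prior guess of usage for a new vehicle in the class with 4 replacements.
prior_usage_guess = hours_per_replacement * 4
```

Such a guess would only be as good as the classification behind it, since the slopes in Figure 15 differ widely between vehicles.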

CONCLUSION
The focus of this study was the use of improved methods for estimating the long-term accumulation of data that may be inconsistent or missing. Usage records from military tracked vehicles were analyzed, and the approach to estimating usage from noisy data was split into two parts: 1) development of a probabilistic model, and 2) estimation based on the replacement of related consumable parts. The first method was a knowledge-driven probabilistic model that estimated reasonable values for monotonically non-decreasing measurements that were inconsistent, incomplete, and sometimes missing in the data. The results were consistent for several cases of noise on the same data. We believe that this modeling approach, which requires minimal prior knowledge, has not been used for usage estimation. Because the ground truth was not available, we proposed and executed methods for evaluating the performance of the model based on synthetic data in the absence of targets, including bi-directional passing through the data. The model was designed to estimate successive measurements, and when it was applied to real usage data it became apparent that noisy initial values biased the results. This was reconciled using an averaging method in which the same data set was sampled in a forward and a reverse direction and the results were averaged. This proved to significantly decrease the bias from a bad initial data point.
The second approach was also intended to estimate usage but was found to be supplementary at best. For the road wheel component, the correlation between usage and replacements of consumables was very close to 0 from the maintenance-to-maintenance perspective; however, long-term accumulation of data showed high correlation. This suggests that cumulative records of replacements of consumables related to usage can be used to aid in the estimation of usage itself.

ACKNOWLEDGMENTS
This work was made possible by the Department of the Navy, Office of Naval Research under Grant No. N00014-14-1-0789. The authors gratefully acknowledge the help of our colleagues and students from Rochester Institute of Technology: Timothy Murtaugh organized Excel files into a Microsoft SQL database, Scott Nichols helped with formulating more intricate SQL queries, and Richard Price performed preliminary manual analysis of repairs. Their contributions are greatly appreciated.

DISCLAIMER
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Office of Naval Research.

Figure 2. Overhaul (dark line) and maintenance events over time for a fleet of vehicles.

Figure 3. Effect of overhaul in time measured in days: a) total cost, b) available vehicles, c) normalized cost (per vehicle), and d) number of repairs.

Figure 4. Data for making decisions on vehicle overhaul.

Figure 5. Usage data example: the values are not monotonically non-decreasing as expected.

Figure 6. A simple probabilistic model.

Figure 8. Usage estimation example.

Figure 10. a) Case 1: Measured data equals true data. b) Case 2: Some noise added to measured data. c) Case 3: Large noise added to measured data. d) Case 4: Extreme outliers.

Figure 12. The two inset plots show the correlation between change in usage and change in replacements (top) and between cumulative usage and cumulative replacements (bottom).

Figure 13. Only 17% of records for road wheels have one or more replacements associated with them.

Figure 15. A comparison between hours of usage and number of road wheel replacements for fourteen vehicles.