Unsupervised Health Indicator Construction via Deep Reinforcement Learning with Terminal-Dominant Reward

Published Jan 13, 2026
Zeqi Wei, Zhibin Zhao, Ruqiang Yan

Abstract

In intelligent industrial maintenance, constructing a reliable health indicator (HI) is crucial for accurate degradation assessment and fault prediction. However, existing methods face two major limitations: fusion-based approaches often suffer from low-quality or irrelevant features that degrade the discriminative capability of the HI, while reconstruction-based approaches rely heavily on high-quality healthy data, which is difficult to obtain in real-world scenarios. To overcome these challenges, this paper proposes an Unsupervised Terminal-Dominant framework for HI construction (UTD-HI) that relies on neither remaining useful life (RUL) labels nor pre-defined thresholds. Within a deep reinforcement learning (DRL) paradigm, UTD-HI learns an adaptive feature-weighting policy that suppresses irrelevant features and enhances informative ones. A reward mechanism integrating monotonicity, smoothness, and a sparse terminal constraint is designed, and hindsight experience replay (HER) is introduced to mitigate reward sparsity. Furthermore, by applying different reward strategies in the normal and abnormal stages, the framework automatically and accurately distinguishes healthy from degraded operating conditions. Experimental results on the XJTU-SY bearing dataset demonstrate that the proposed method constructs HIs with superior trendability, monotonicity, and robustness across different operating conditions, offering a practical solution for HI construction in real-world environments.
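As a minimal illustration of the terminal-dominant reward described above, the following Python sketch scores a candidate HI trajectory by combining a monotonicity term, a smoothness term, and a sparse terminal bonus. It assumes the HI is normalized to [0, 1] with failure at 1; the function `composite_reward`, its weights, and the tolerance `tol` are hypothetical illustrations, not the paper's actual formulation.

```python
import numpy as np

def composite_reward(hi, terminal_value=1.0, tol=0.05,
                     w_mono=1.0, w_smooth=0.5, w_term=10.0):
    """Score a candidate health-indicator (HI) trajectory.

    Combines a monotonicity term, a smoothness term, and a sparse
    terminal bonus granted only when the final HI value reaches the
    failure level. All weights and the tolerance are illustrative.
    """
    hi = np.asarray(hi, dtype=float)
    diffs = np.diff(hi)

    # Monotonicity: fraction of increasing steps, mapped to [-1, 1].
    mono = 2.0 * np.mean(diffs > 0) - 1.0 if diffs.size else 0.0

    # Smoothness: penalize large step-to-step changes.
    smooth = -np.mean(np.abs(diffs)) if diffs.size else 0.0

    # Sparse terminal constraint: bonus only if the trajectory
    # ends near the prescribed failure value.
    terminal = w_term if abs(hi[-1] - terminal_value) <= tol else 0.0

    return w_mono * mono + w_smooth * smooth + terminal

# Example: a noisy but broadly increasing HI that ends at failure.
hi = np.clip(np.linspace(0.0, 1.0, 100) + 0.01 * np.random.randn(100), 0.0, 1.0)
hi[-1] = 1.0
print(composite_reward(hi))
```

Because the terminal bonus is only paid when a trajectory actually reaches the failure level, the reward is sparse; in the paper's framework, HER is introduced precisely to address this sparsity, while this sketch shows only the reward shape itself.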

Keywords

Deep reinforcement learning; health indicator

Section
Regular Session Papers