Learning an Optimal Operational Strategy for Service Life Extension of Gear Wheels with Double Deep Q Networks



Published Nov 24, 2021
Mark Henss, Yvonne Gretzinger, Tamer Tevetoglu, Maximilian Posner, Bernd Bertsche


One failure mechanism of gear wheels is pitting. In case-hardened gear wheels, pitting degradation normally dominates at only one tooth. All other teeth are still intact when the standardized end-of-life criterion of 4 % pitting area, relative to the total tooth area, is reached.

Using an operational strategy developed at the Institute of Machine Components, the service life of gear wheels can be extended by locally reducing the stress at the weakest tooth. This is accomplished by applying an adapted torque at the transmission input that places the torque minimum in the area of the pre-damaged, and thus weakest, tooth. Consequently, all remaining teeth, which have a higher load-bearing capacity, are subjected to a higher torque. A prerequisite for this theoretical operational strategy is knowledge of the pitting size and position. Detecting these properties during operation is not yet state of the art.
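The idea of an adapted torque can be illustrated with a minimal sketch. The cosine-shaped profile, the tooth count, the torque values, and the function name `adapted_torque` are all assumptions chosen for illustration; the abstract does not specify the actual torque profile. The sketch only shows the two properties named above: the torque minimum sits at the weakest tooth, and the mean torque over one revolution is unchanged.

```python
import numpy as np

def adapted_torque(angles, nominal_torque, weakest_angle, amplitude):
    """Hypothetical torque profile over one gear revolution.

    The cosine shape is an assumption: it has its minimum at
    `weakest_angle` and averages to `nominal_torque` over a full turn.
    """
    return nominal_torque * (1.0 - amplitude * np.cos(angles - weakest_angle))

# Assumed example: 20 teeth, tooth 5 is the pre-damaged (weakest) tooth.
n_teeth = 20
tooth_angles = 2.0 * np.pi * np.arange(n_teeth) / n_teeth
torque = adapted_torque(
    tooth_angles,
    nominal_torque=100.0,            # assumed nominal torque in Nm
    weakest_angle=tooth_angles[5],
    amplitude=0.1,                   # assumed 10 % torque variation
)

print("tooth with minimum torque:", int(np.argmin(torque)))
print("mean torque over one revolution:", torque.mean())
```

At constant speed, an unchanged mean torque corresponds to an unchanged mean power, while the weakest tooth sees a lower load than all others.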

In this work, only the gearbox vibration signal is known; there is no explicit knowledge of the pitting inside the gearbox. The challenge is therefore to determine the health of each individual tooth and to choose an optimal adapted torque on that basis. This is especially difficult because pittings on a single gear wheel grow at different rates. Hence, different pittings dominate over the service life, which requires continuous optimization of the torque control.

Reinforcement Learning (RL) algorithms are particularly suitable for this challenge. In this branch of Machine Learning (ML), an agent interacts with an environment and learns by receiving rewards for taking actions in given states. In this study, the environment is a gearbox simulation model, the state is the current vibration signal, and the action is the chosen adapted torque. Thus, the algorithm can learn the whole operational strategy, from online failure detection to the adapted torque at the transmission input.
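The agent-environment interaction described above can be sketched with a toy environment. Everything in this sketch is an assumption made for illustration: the class `ToyGearboxEnv`, the linear growth model, the growth rates, the load relief of 50 %, the noisy observation standing in for a vibration signal, and the reward of 1 per step survived. It is not the gearbox simulation model of the study, and two hand-written policies stand in for a learned agent.

```python
import numpy as np

END_OF_LIFE = 4.0  # percent pitting area, the end-of-life criterion named in the abstract

class ToyGearboxEnv:
    """Toy stand-in for a gearbox simulation; all dynamics are assumptions."""

    def __init__(self, growth=(0.10, 0.20, 0.05), seed=0):
        # Assumed per-tooth pitting growth in percent per step at nominal load.
        self.growth = np.asarray(growth, dtype=float)
        self.n_teeth = len(self.growth)
        self.rng = np.random.default_rng(seed)
        self.pitting = np.zeros(self.n_teeth)

    def reset(self):
        self.pitting = np.zeros(self.n_teeth)
        return self._observe()

    def _observe(self):
        # Stand-in for the vibration signal: a noisy view of the hidden damage.
        return self.pitting + self.rng.normal(0.0, 0.01, self.n_teeth)

    def step(self, action):
        # action = index of the tooth that receives the torque minimum.
        relief = 0.5
        load = np.full(self.n_teeth, 1.0 + relief / (self.n_teeth - 1))
        load[action] = 1.0 - relief  # mean load over all teeth stays 1.0
        self.pitting = self.pitting + self.growth * load
        done = bool(self.pitting.max() >= END_OF_LIFE)
        reward = 1.0  # assumed reward: one unit per step survived
        return self._observe(), reward, done

def run_episode(env, policy, max_steps=10_000):
    """Run one episode and return the number of steps until end of life."""
    obs = env.reset()
    for step in range(1, max_steps + 1):
        obs, reward, done = env.step(policy(obs))
        if done:
            return step
    return max_steps

# Policy 1: always relieve tooth 0. Policy 2: relieve the tooth that looks worst.
life_fixed = run_episode(ToyGearboxEnv(), lambda obs: 0)
life_adaptive = run_episode(ToyGearboxEnv(), lambda obs: int(np.argmax(obs)))
print("steps to end of life (fixed):", life_fixed)
print("steps to end of life (adaptive):", life_adaptive)
```

In this toy setting, the policy that keeps re-targeting the currently worst tooth survives longer than the fixed one, which mirrors why the torque control has to be optimized continuously as different pittings dominate.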

The results of this study show the theoretical feasibility of the operational strategy using Double Deep Q Networks as the RL algorithm. The algorithm is able to learn a suitable reaction to pittings that increase linearly or progressively at an early stage, and therefore delays their growth within the defined limits. Thus, the lifetime of the gearbox is extended while the total power of the gearbox remains the same. As an outlook, a further study will examine the sensitivity of the results to several influencing factors. The longer-term aim is to apply this simulation on a test rig and validate the results.
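For readers unfamiliar with Double Deep Q Networks, the following sketch shows the standard Double DQN target computation: the online network selects the next action, and the target network evaluates it. The function name `double_dqn_targets`, the discount factor, and the example numbers are assumptions for illustration; the abstract does not state the network architecture or hyperparameters used in the study.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute Double DQN regression targets for a batch of transitions.

    q_online_next, q_target_next: arrays of shape (batch, n_actions) holding
    the Q-values of the next states from the online and the target network.
    """
    # Action selection uses the online network ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... action evaluation uses the target network.
    next_values = q_target_next[np.arange(len(rewards)), best_actions]
    # Terminal transitions (done = 1) do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * next_values

# Assumed example batch with two transitions and two possible actions.
rewards = np.array([1.0, 1.0])
dones = np.array([0.0, 1.0])
q_online_next = np.array([[0.2, 0.9], [0.5, 0.1]])
q_target_next = np.array([[0.7, 0.3], [0.4, 0.6]])

targets = double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.9)
print(targets)
```

Separating selection from evaluation is what distinguishes Double DQN from plain DQN, where the target network would do both and tend to overestimate action values.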

How to Cite

Henss, M., Gretzinger, Y., Tevetoglu, T., Posner, M., & Bertsche, B. (2021). Learning an Optimal Operational Strategy for Service Life Extension of Gear Wheels with Double Deep Q Networks. Annual Conference of the PHM Society, 13(1). https://doi.org/10.36001/phmconf.2021.v13i1.2978



Keywords: gearbox, pitting, operational strategy, reinforcement learning, deep q networks, service life

Technical Papers