An Intelligent Fleet Condition-Based Maintenance Decision Making Method Based on Multi-Agent

To meet the demand for online condition-based maintenance decision making within a mission-oriented fleet, an intelligent maintenance decision making method based on multi-agent modeling and heuristic rules is proposed. The process of condition-based maintenance within an aircraft fleet (each aircraft containing one or more Line Replaceable Modules) based on multiple maintenance thresholds is analyzed. The process is then abstracted into a multi-agent model, a 2-layer model structure containing host negotiation and independent negotiation is established, and the heuristic rules applied to global and local maintenance decision making are proposed. Based on the Contract Net Protocol and the heuristic rules, the maintenance decision making algorithm is put forward. Finally, a fleet of 10 aircraft on a 3-wave continuous mission is used to verify the method. Simulation results indicate that the method can improve the availability of the fleet, meet mission demands, rationalize the utilization of support resources and support online maintenance decision making within a mission-oriented fleet.


INTRODUCTION
When conducting a mission, an aircraft fleet consumes massive resources, especially maintenance manpower and resources. In practice, maintenance strategies usually combine "fail and fix" maintenance with fixed preventive maintenance. The "fail and fix" strategy cannot prevent fatal accidents, which may endanger pilots' lives and reduce mission availability. The fixed preventive maintenance strategy usually schedules excess maintenance actions to ensure availability while ignoring the asynchronism of failures within a fleet and the shareability of maintenance resources; hence it cannot fully exploit the overall efficiency of maintenance resources, causing considerable waste without completely preventing failures (Jiang & Murthy, 2008). Besides, to ensure safety, a specific maintenance job is done at a specific site, which may lead to a lack of coordination between operational requirements and maintenance actions. In general, the traditional "fail and fix" and fixed preventive maintenance practices are not completely suitable.
To tackle this problem, Condition-Based Maintenance (CBM), which is based on the actual condition and development tendency of assets, has been put forward (Bengtsson, 2004). The rapid development of the Prognostics and Health Management (PHM) approach (Sun, Zeng, Kang & Pecht, 2012) and its application to batteries (Goebel, Saha, Saxena, Celaya & Christophersen, 2008) and aero engines (Wen & Liu, 2011) make CBM possible. In practice, an aircraft contains one or more Line Replaceable Modules (LRMs) whose health condition follows a deterioration process (Barata, Guedes, Marseguerra & Zio, 2002). PHM can help predict the Residual Useful Life (RUL) of deteriorating LRMs through condition monitoring, and help staff make maintenance decisions. Through the application of PHM, maintenance measures are provided in time, and the ideal "need and fix" CBM is achieved (Jardine, Lin & Banjevic, 2006). Moreover, since the RUL can be estimated, maintenance actions can be performed dynamically according to operational requirements rather than at a fixed site. In a fleet, where maintenance tasks are heavy and resources are limited, the application of CBM can notably increase operational availability, reduce lifecycle costs and improve safety.
Traditional CBM is about safely extending maintenance intervals using PHM information, and is often applied to a single aircraft. Fleet-oriented CBM, on the other hand, must consider factors beyond single-aircraft CBM, such as mission requirements and maintenance teams, to balance the whole fleet. The ideal process of fleet CBM is as follows: 1) Aircraft obtain their PHM data. 2) The PHM data is transferred to the maintenance center. 3) The maintenance center makes maintenance decisions. 4) The maintenance decisions are transferred to aircraft and maintenance teams. 5) Maintenance actions are performed. The fleet CBM problem is therefore an "online" decision making problem. Besides, the fleet maintenance strategy is the combination of the maintenance strategies for every single aircraft. For each single aircraft, the problem is to find the most suitable time and team while balancing the whole fleet, which is essentially a routing problem. The routing problem has been proved to be NP-hard (Garey & Johnson, 1979), so an optimal or satisfactory solution becomes difficult to obtain as the problem scale increases.
At present, the main solutions to the fleet CBM problem include 1. Mathematical programming: Doganay and Bohlin (2010)  But there are still gaps between those methods and dynamic environments where online maintenance decision making and scheduling are required when an aircraft fleet executes combat tasks, especially in two respects: 1. Those methods lack consideration of the relationship between the health condition of the entire fleet and that of a single aircraft and ignore the potential shortage of maintenance resources, so the maintenance scheduling strategy is usually not optimal. 2. Due to the uncertainty of tasks and the variety of aircraft health conditions, the maintenance strategy needs to be generated according to mission demands, aircraft health conditions and resource limits, yet those methods lack consideration of online decision making.
The fleet maintenance problem involves a lot of communication among aircraft and maintenance teams. The Multi-Agent Modeling technique can imitate the communication and cooperation among agents to model complex systems (Budenske, Newhouse, Bonney & Wu, 2001), and has been successfully applied in many fields of manufacturing, especially dynamic and distributed scheduling problems. Through communication and cooperation, aircraft and maintenance teams can acquire the health condition of the whole fleet and the working condition of the maintenance teams. Meanwhile, the fleet maintenance problem is NP-hard, and a common approach to NP-hard problems is heuristic search. Heuristic rules can be integrated into agents to help cope with the NP-hardness and to guide the intelligent allocation of maintenance tasks (Yang & Hu, 2007). In short, Multi-Agent Modeling is suitable for solving the aircraft fleet maintenance problem.
This paper applies a Multi-Agent System (MAS) to aircraft fleet maintenance scheduling. The idea of MAS and heuristic rules is adopted, and dynamic intelligent maintenance decision making within an aircraft fleet with multiple maintenance teams is achieved to provide technical support for online maintenance decision making. The purpose of this paper is to propose a multi-agent model that can both react to dynamic events and generate schedules for maintenance jobs, to help design a fleet maintenance Decision Support System (DSS).
The remainder of this paper is organized as follows. Section 2 presents the description of the fleet maintenance problem. In Section 3, the MAS model for fleet maintenance scheduling is described and the heuristic rules are put forward. The algorithm by which the dynamic problem is solved and schedules are generated is discussed in Section 4. Section 5 provides a case study of a mission-oriented aircraft fleet to demonstrate the proposed method. Finally, concluding remarks and directions for further study are provided in Section 6.

FLEET CBM PROBLEM DESCRIPTION
Consider an aircraft fleet of $m$ aircraft and $n$ maintenance teams ($n < m$) facing continuous combat missions, in which a single mission requires $l$ aircraft ($l$ is dynamic and $l \le m$). Each aircraft contains $p$ LRMs whose RUL can be estimated. All maintenance teams have the same ability, namely the same LRM requires the same Mean Maintenance Time (MMT) whichever team repairs it, while different LRMs require different MMTs. The basic assumptions of the problem are listed below.
1. The current mission is known, namely the upcoming mission and the mission interval duration are known, while future missions are unknown.
2. Only on-site maintenance is considered, so the maintenance method is "replace and repair", and parts are repaired as good as new, namely the RUL of a replaced LRM is restored to its maximum.
3. The RUL of each LRM in each aircraft decreases with mission time and does not decrease when no mission is flown. Moreover, due to differences in historical missions, the initial RULs of different LRMs in different aircraft are different.
4. Spare parts in each team are sufficient, namely spare parts are always available whenever a maintenance task is required.
5. The estimation of RUL is accurate, so wrong strategies caused by wrong estimations do not occur.
6. Each team can work on only one aircraft at a time, and each aircraft can be repaired by only one team at a time.
After the whole fleet returns from the previous mission, each aircraft checks its own health condition, estimating its RULs and comparing them with the maintenance thresholds to decide whether maintenance may be needed. There can be one or more thresholds (Camci, Valentine & Navarra, 2007), and in this article two thresholds are used, namely the Required Maintenance Threshold $\tau$ and the Opportunistic Maintenance Threshold $T$.
Those two thresholds divide aircraft health into three states. When $RUL \le \tau$, the state is identified as the required maintenance state $S_3$ and maintenance is required immediately. When $RUL > T$, the state is identified as the no maintenance state $S_1$ and no maintenance is scheduled. When the RUL lies between the two thresholds, $\tau < RUL \le T$, the state is identified as the opportunistic maintenance state $S_2$, and whether maintenance is performed depends on the states of the other aircraft and the occupation of the maintenance teams. $T$ and $\tau$ can be set according to the mission or by experience. For instance, $\tau$ must exceed the time before the aircraft returns from the next mission. The objective of this problem is to maximize the availability of the fleet while keeping the number of maintenance actions satisfactory, and the basic constraints of the problem are:
1. The number $r$ of available aircraft heading for the upcoming mission must satisfy $r \ge l$.
2. The number $s$ of currently available teams must satisfy $s \le n$.
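The two-threshold state definition above can be illustrated with a minimal Python sketch. The names `HealthState`, `classify_rul` and `classify_aircraft` are hypothetical and do not come from the paper. The rule that an aircraft with several LRMs takes the state of its LRM with the shortest RUL is an assumption made here for illustration only.

```python
from enum import Enum


class HealthState(Enum):
    NO_MAINTENANCE = "S1"  # RUL > T
    OPPORTUNISTIC = "S2"   # tau < RUL <= T
    REQUIRED = "S3"        # RUL <= tau


def classify_rul(rul, tau, big_t):
    """Map one RUL value to a health state using the two thresholds."""
    if tau > big_t:
        raise ValueError("tau must not exceed T")
    if rul <= tau:
        return HealthState.REQUIRED
    if rul > big_t:
        return HealthState.NO_MAINTENANCE
    return HealthState.OPPORTUNISTIC


def classify_aircraft(lrm_ruls, tau, big_t):
    """Assumption (not stated in the paper): the aircraft takes the
    state of its LRM with the shortest RUL."""
    return classify_rul(min(lrm_ruls), tau, big_t)
```

For example, with $\tau = 2$ and $T = 5$ (arbitrary illustrative values), an LRM with an RUL of exactly 5 falls in $S_2$, because the upper bound of the opportunistic band is inclusive.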
According to the problem description above, when the fleet returns from the previous mission, each aircraft checks its own health state $S_t$ at the current time $t_0$ and reports to the maintenance center. The maintenance center verifies all the reports, organizes and coordinates maintenance tasks guided by a set of heuristic rules, and allocates maintenance tasks to suitable maintenance teams. The maintenance teams then execute the maintenance tasks as directed by the maintenance center. When the maintenance tasks finish, the fleet waits to execute the upcoming mission.
Each aircraft in the fleet is repaired according to its condition. The combination of all maintenance decisions within the whole fleet forms a group of fleet CBM strategies aimed at utilizing the RULs of all aircraft and the idle time of the maintenance teams, in order to rationalize the use of maintenance resources within the whole fleet.

THE FLEET CBM MODEL BASED ON MULTI-AGENT
The fleet CBM process involves a large amount of communication among aircraft, maintenance teams and the maintenance center. Moreover, maintenance teams and the maintenance center need to react to dynamic situations to make maintenance decisions, so the process can be regarded as a complex system (Zhang & Li, 2010), and one promising approach to complex systems is MAS. In MAS, an agent can be regarded as a self-directed software object with its own value system and a means to communicate with other agents (Baker, 1998), while the whole MAS can be regarded as "a loosely coupled network of problem solvers that work together to solve problems that are beyond the individual capabilities or knowledge of each problem solver" (Durfee, 1988). The fleet CBM process can be mapped into such a MAS, where CBM strategies are obtained by the agents themselves and through the communication between agents.

Model Framework
Through the analysis of the fleet CBM process, the physical entities can be abstracted into two types of agents, namely the Aircraft Agent (AA) and the Maintenance Agent (MA), and the dynamic process of management and coordination is abstracted into the Management and Coordination Agent (MCA).
The AA is the abstraction of an aircraft. It describes the aircraft's inherent characteristics and reliability characteristics, and is responsible for generating maintenance requirements. The MA is the abstraction of a maintenance team, and is responsible for the specific maintenance process.
The MCA is the abstraction of the whole process of scheduling and intelligent allocation of maintenance tasks. It is event-driven, and is responsible for adjusting the whole maintenance process and obtaining the fleet maintenance strategy.
A 2-layer MAS structure (Feng, Zeng & Kang, 2010) is applied to model the problem, with the two layers corresponding to global scheduling and local scheduling, as shown in Figure 2. Local Scheduling is conducted between AAs and MAs, and is aimed at the negotiation of specific maintenance tasks.

Heuristic Rule-based Agent Negotiating Mechanism
The Contract Net Protocol (CNP) (Smith, 1980) is one of the most widely used agent negotiating mechanisms. By imitating the "Calling-Bidding-Winning-Signing" process in economic behavior, CNP realizes the allocation, dynamic adjustment and transfer of tasks among agents (Tang, Zhu, Li & Lei, 2010). Based on the CNP, the rationalization of the fleet CBM strategies is achieved.
In this article, all agents are assumed to be rational and friendly. Their communication manifests both cooperation and conflict, which means that an agent is willing to cooperate with other agents while maximizing its own profit where possible. That assumption reflects practical situations. For instance, each aircraft wishes to be repaired as early as possible. A maintenance team needs cooperation to repair all aircraft, but wishes to repair as many aircraft as possible.
Since the MAS model applies the 2-layer structure, the negotiation between agents is also divided into two layers, namely Host Negotiation and Independent Negotiation. As noted above, the problem of fleet maintenance with multiple maintenance teams is NP-hard, so a satisfactory solution is difficult to obtain directly. In each layer, negotiation must therefore follow its corresponding heuristic rules, as described below.

Heuristic Rules in Independent Negotiation
In Independent Negotiation, idle MAs communicate with AAs to obtain local maintenance strategies. The alternative maintenance decision making heuristic rules are listed below.
1. Aircraft in the required maintenance state $S_3$
• The shortest total waiting time principle: all aircraft in the required maintenance state $S_3$ are scheduled so as to shorten the average waiting time, or to even out the working time of all maintenance teams. This rule is marked "Rule 11a".
• The most repairs within limited interval principle: once a maintenance team is idle, a maintenance task is performed on the aircraft with the shortest MMT. This rule is marked "Rule 11b".
• Single team with widest repair time margin principle: as many aircraft as possible are repaired by as few maintenance teams as possible, so as to leave the most teams idle in case unexpected failures occur. This rule is marked "Rule 11c".
2. Aircraft in the opportunistic maintenance state $S_2$
• The most repairs within limited interval principle: once a maintenance team is idle, a maintenance task is performed on the aircraft with the shortest MMT. This rule is marked "Rule 12a".

Heuristic Rules in Host Negotiation
In Host Negotiation, the MCA communicates with AAs to obtain global maintenance strategies, generates a group of local maintenance tasks and dispatches the tasks to the corresponding MAs. The whole process is listed below. Assume that the number of aircraft needed for the upcoming mission is $l_n$.

Agent Behavior in fleet CBM
Based on the analysis of the fleet CBM process, the MAS model framework and the heuristic rules for solving maintenance strategies, the Agent Ability Chart (Feng, 2009) for fleet CBM is established. It defines the agents' attributes and their function and fault behaviors, laying the foundation for solving the maintenance strategies. Since the CBM model involves communication between and within layers, the problem is relatively complex. As space is limited, the three most typical maintenance schemes are illustrated. The three corresponding algorithms are listed below.

The Shortest Total Maintenance Waiting Time Maintenance Scheme Negotiating Algorithm
This scheme is relatively comprehensive, involving both cooperative and competitive negotiation. The algorithm is listed below.

Cooperative Negotiation
Cooperative negotiation is required before a maintenance task starts. It is aimed at calculating the total maintenance time needed and allocating to each MA its corresponding maintenance time.

Figure 5. The cooperative negotiation mechanism
Step 1: The negotiation initiator calling for bids.
The first idle MA $i$ (a random MA if there is more than one) calls other MAs and all AAs for bids $EB_i(t_i, ta_i, tb_i)$, where $t_i$ represents the latest bid time allowed, $ta_i$ represents the earliest idle time of other MAs (the time to finish the current task), and $tb_i$ represents the maintenance duration needed.
Step 2: MAs and AAs assess their own status and counter-bid before $t_i$. The counter-bids from MAs are represented as $EB_j(t_j, ta_j)$, where $t_j$ represents the waiting time and $ta_j$ represents the earliest idle time. The counter-bids from AAs are represented as $EB_k(t_k, tb_k)$, where $t_k$ represents the waiting time and $tb_k$ represents the maintenance time needed.
Step 3: The negotiation initiator responding to all counter-bids.
The negotiation initiating MA counts all counter-bids. Assume that $m$ is the number of counter-bids from MAs and $n$ is the number of counter-bids from AAs. Then, based on the Shortest Total Maintenance Waiting Time Principle, the negotiation initiating MA calculates the Allocated Maintenance Time (AMT) for each MA $j$ through the function Evaluate_EB(), and responds to each MA with its AMT.
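The paper does not spell out how Evaluate_EB() turns the counter-bids into AMTs. The Python sketch below shows one possible reading, assuming the goal stated in Rule 11a of evening out the working time of all teams. Each MA $j$ becomes idle at $ta_j$, and the total requested maintenance time $\sum_k tb_k$ is split so that all participating teams finish at the same time. The function name `allocate_maintenance_time` and this equal-finish-time rule are assumptions, not the authors' implementation.

```python
def allocate_maintenance_time(idle_times, durations):
    """Split the total requested repair time among teams.

    idle_times: dict team id -> earliest idle time ta_j
    durations:  list of maintenance times tb_k requested by aircraft
    Returns a dict team id -> allocated maintenance time (AMT).

    Assumed rule: every team that takes part finishes at the same
    time; a team that only becomes idle after that common finish
    time receives an AMT of zero.
    """
    if not idle_times:
        raise ValueError("at least one maintenance team is required")
    total = sum(durations)
    # Teams ordered by idle time; the latest team is dropped first.
    active = sorted(idle_times, key=lambda k: (idle_times[k], k))
    while True:
        finish = (total + sum(idle_times[k] for k in active)) / len(active)
        if idle_times[active[-1]] <= finish:
            break
        active.pop()
    return {
        k: (finish - idle_times[k] if k in active else 0.0)
        for k in idle_times
    }
```

With two teams idle at times 0 and 1 and requested repairs of 1, 2 and 2 hours, both teams would finish at time 3, giving AMTs of 3 and 2 hours. The allocated times always sum to the total requested maintenance time.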

Competitive Negotiation
Competitive negotiation is required during the process of specific maintenance tasks. It is aimed at confirming the maintenance strategy and realizing the maintenance tasks.
Step 1: MA calling for bids.
The first idle MA $i$ (a random MA if there is more than one) calls all AAs for bids $PR_i(T_i, AMT_i)$, where $T_i$ represents the latest bid time allowed and $AMT_i$ represents the allocated maintenance time.
Step 2: AAs assess their own status through the function Process_info(). If an AA is within the candidate queue, it counter-bids before $T_i$. The counter-bids from AAs are represented as $PR_j(T_j, MMT_j)$, where $T_j$ represents the waiting time and $MMT_j$ represents the maintenance time needed.
Step 3: MA assessing all counter-bids.
The MA counts all counter-bids and assesses them through the function Evaluate_EB(), ranking all counter-bidding AAs according to the length of their MMT and selecting the candidate $a$ whose MMT is closest to the AMT.
Step 4: MA judging whether to stop bidding.
The MA provisionally updates its AMT: $AMT_{temp} = AMT - MMT_a$.
If $|AMT_{temp}| < |AMT|$, the MA sets $AMT = AMT_{temp}$ and responds to the selected AA, the selected AA dequeues, and Step 1 to Step 3 are repeated. Otherwise, the MA stops the current bidding process and starts repairing all selected AAs.
Step 5: Other MAs start bidding in order of idle time (a random MA if there is more than one), repeating Step 1 to Step 4.
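The selection loop of Steps 3 and 4 for a single MA can be sketched in Python as follows. The function name `select_for_allocated_time` and the alphabetical tie-breaking are assumptions. The closest-MMT choice and the stopping test $|AMT_{temp}| < |AMT|$ follow the steps described above.

```python
def select_for_allocated_time(amt, mmts):
    """Pick aircraft for one team until its AMT is used up.

    amt:  allocated maintenance time of the team
    mmts: dict aircraft id -> maintenance time needed (MMT)
    Returns (selected aircraft ids in order, remaining AMT).
    """
    queue = dict(mmts)  # candidate queue; copied so the input is untouched
    selected = []
    while queue:
        # Step 3: candidate whose MMT is closest to the remaining AMT
        # (ties broken by id, an assumption made for determinism).
        a = min(queue, key=lambda k: (abs(queue[k] - amt), k))
        # Step 4: provisional update, accepted only if it moves AMT toward 0.
        amt_temp = amt - queue[a]
        if abs(amt_temp) < abs(amt):
            amt = amt_temp
            selected.append(a)
            del queue[a]  # the selected AA dequeues
        else:
            break  # stop bidding and start repairing the selected AAs
    return selected, amt
```

With an AMT of 3 hours and candidates needing 1, 2 and 1.5 hours, the sketch selects the 2-hour and then the 1-hour repair and stops with no allocated time left, since adding the 1.5-hour repair would move the remaining AMT away from zero.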

The Most Repairs Within the Limited Interval Maintenance Scheme Negotiating Algorithm
Step 1: MA calling for bids.
The first idle MA $i$ (a random MA if there is more than one) calls all AAs for bids $PR_i(T_i, LMT_i)$, where $T_i$ represents the latest bid time allowed and $LMT_i$ represents the longest maintenance time.
Step 2: AAs assess their own status through the function Process_info(). If an AA is within the candidate queue, it counter-bids before $T_i$. The counter-bids from AAs are represented as $PR_j(T_j, MMT_j)$, where $T_j$ represents the waiting time and $MMT_j$ represents the maintenance time needed.
Step 3: MA assessing all counter-bids.
The MA counts all counter-bids and assesses them through the function Evaluate_EB(), ranking all counter-bidding AAs according to the length of their MMT and selecting the candidate $a$ with the shortest MMT. The repair task then starts, and when it finishes the MA updates $LMT = LMT - MMT_a$.
Step 4: The repaired AA then dequeues. Other MAs start bidding in order of idle time (a random MA if there is more than one), repeating Step 1 to Step 3.
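As an illustration, the scheme above behaves like a shortest-MMT-first dispatch among several teams. The Python sketch below simulates it with a priority queue of team idle times. It assumes that the LMT of a team equals the time left in the interval when the team becomes idle, and that a repair only starts if it fits in that time. Neither assumption is stated explicitly in the paper, and `most_repairs_schedule` is a hypothetical name.

```python
import heapq


def most_repairs_schedule(mmts, n_teams, interval):
    """Shortest-MMT-first dispatch of repairs to the earliest idle team.

    mmts:     dict aircraft id -> maintenance time needed (MMT)
    n_teams:  number of maintenance teams, all idle at time 0
    interval: length of the maintenance interval
    Returns a list of (aircraft, team, start, end) tuples.
    """
    # Candidate queue ranked by MMT, shortest first.
    queue = sorted(mmts.items(), key=lambda kv: (kv[1], kv[0]))
    teams = [(0.0, team) for team in range(n_teams)]  # (idle time, id)
    heapq.heapify(teams)
    schedule = []
    for aircraft, mmt in queue:
        idle_at, team = heapq.heappop(teams)  # first idle team bids
        if idle_at + mmt > interval:
            # The shortest remaining repair does not fit the team with
            # the largest LMT, so no further repair can fit either.
            heapq.heappush(teams, (idle_at, team))
            break
        schedule.append((aircraft, team, idle_at, idle_at + mmt))
        heapq.heappush(teams, (idle_at + mmt, team))  # LMT shrinks by MMT
    return schedule
```

With repairs of 1, 0.5, 2 and 1.5 hours, two teams and a 2-hour interval, the sketch completes three repairs (0.5, 1 and 1.5 hours) and leaves the 2-hour repair unscheduled.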

Single Team with Widest Repair Time Margin Maintenance Scheme Negotiating Algorithm
Step 1: MA calling for bids.
The first idle MA $i$ (a random MA if there is more than one) calls all AAs for bids $PR_i(T_i, LMT_i)$, where $T_i$ represents the latest bid time allowed and $LMT_i$ represents the longest maintenance time.
Step 2: AAs assess their own status through the function Process_info(). If an AA is within the candidate queue, it counter-bids before $T_i$. The counter-bids from AAs are represented as $PR_j(T_j, MMT_j)$, where $T_j$ represents the waiting time and $MMT_j$ represents the maintenance time needed.
Step 3: MA assessing all counter-bids.
The MA counts all counter-bids and assesses them through the function Evaluate_EB(), ranking all counter-bidding AAs according to the length of their MMT and selecting the candidate $a$ whose MMT is closest to the LMT. The selected AA then dequeues.
Step 4: The MA updates $LMT = LMT - MMT_a$ and repeats Step 1 to Step 3 until there is no suitable candidate. It then stops bidding and starts repairing all selected AAs.
Step 5: Other MAs start bidding in order of idle time (a random MA if there is more than one), repeating Step 1 to Step 4.
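A minimal Python sketch of this scheme is given below. It assumes that a "suitable candidate" is an aircraft whose MMT does not exceed the team's remaining LMT, and that every team starts with the whole interval as its LMT. Both are assumptions, as are the names `pack_single_team` and `widest_margin_schedule`.

```python
def pack_single_team(lmt, queue):
    """Fill one team: repeatedly take the fitting aircraft whose MMT
    is closest to the remaining LMT. Mutates `queue` (id -> MMT)."""
    selected = []
    while True:
        fitting = [a for a in queue if queue[a] <= lmt]
        if not fitting:
            break  # no suitable candidate: stop bidding
        a = min(fitting, key=lambda k: (lmt - queue[k], k))
        selected.append(a)
        lmt -= queue.pop(a)  # LMT = LMT - MMT_a; the AA dequeues
    return selected, lmt


def widest_margin_schedule(mmts, n_teams, interval):
    """Load teams one after another so that as few teams as possible
    are used; teams absent from the result stay idle."""
    queue = dict(mmts)
    plan = {}
    for team in range(n_teams):
        if not queue:
            break
        plan[team], _ = pack_single_team(interval, queue)
    return plan
```

With repairs of 1, 0.5, 2 and 1.5 hours, a 3-hour interval and three teams, the first team takes the 2-hour and 1-hour repairs, the second takes the 1.5-hour and 0.5-hour repairs, and the third team remains idle as a reserve for unexpected failures.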

CASE STUDY
A typical continuous mission of a fleet is presented to verify the proposed fleet CBM decision making strategy. Assume a fleet of 10 aircraft carrying out a 3-wave mission, each aircraft monitoring the condition of two LRMs and predicting their corresponding RULs. The time property of the mission is listed in Table 1.
Traditional CBM methods, which concentrate on "timely" rather than "online" maintenance decision making, can hardly make maintenance decisions online, so they are not directly comparable with the MAS method. To make the comparison possible, MAS is applied to model the traditional CBM policy, which assumes that an aircraft is repaired only when it reaches the required maintenance state $S_3$, relies on a single threshold, and ignores the states of the whole fleet and the limited capacity of the teams. Assume that the initial state, the mission time property and the maintainability of the teams are the same; the resulting fleet maintenance strategies are listed in the single-threshold strategy table. The table shows that before the 3rd wave, aircraft 3, 4, 6 and 8 all need repairing, and the total time required is 3 h, which exceeds the maximum time the teams can offer, so the mission fails.
The case above shows that the two-threshold CBM policy is superior to the traditional single-threshold CBM policy in both flexibility and results.

CONCLUSION
In this paper, a fleet CBM intelligent decision making method based on MAS and heuristic rules is proposed. It provides technical support for fleet online maintenance decision making and can help design a fleet maintenance Decision Support System (DSS). A fleet of 10 aircraft and 2 teams is used to verify the correctness and feasibility of the method.
To avoid locally optimal solutions, host negotiation is proposed to coordinate the global maintenance strategies, which not only guarantees the correctness and feasibility of the solution but also optimizes the global maintenance strategy.
A two-threshold CBM policy is proposed, and results show that it is superior to the traditional single-threshold CBM policy in both flexibility and results, although setting the maintenance thresholds is much more demanding.
This method mainly concentrates on the strategy itself. With suitable improvement, it can be modified to optimize maintenance resources.
This method assumes that the RUL estimation is accurate, and the maintenance strategies are based on accurate RULs. Considering the limitations of failure prognostics technology, further study needs to examine the relationship between the accuracy of the RUL estimates and the availability of the fleet, where PHM uncertainty management will be considered.

Figure 1. Maintenance thresholds and aircraft states
Figure 2. Fleet CBM MAS model framework
Global Scheduling is conducted by the MCA. When the MCA receives the reports from AAs, it coordinates and controls the whole process and generates the overall maintenance strategy, to globally rationalize the use of maintenance resources.

Figure 6. The competitive negotiation mechanism

Figure 7. The most repairs within the limited interval maintenance scheme negotiation mechanism
Single team with widest repair time margin maintenance scheme negotiation mechanism
1. AAs first report their health states $S_t$ to the MCA. The MCA analyses all reported data and confirms the number $m_3$ of AAs in the required maintenance state $S_3$, the number $m_2$ of AAs in the opportunistic maintenance state $S_2$, and the number $m_1$ of AAs in the no maintenance state $S_1$. The MCA then calculates the number of aircraft repairable within the interval according to Rule 11a, Rule 11b and Rule 11c respectively, takes the maximum as $m_4$, and expresses the corresponding optimal rule as Pro(Rul$_i$). The number of combat-ready AAs is $m_a = m_1 + m_2 + m_4$. The alternative maintenance decision making heuristic rules are shown in Figure 3.
2. If the number of AAs in the no maintenance state $S_1$ satisfies $m_1 \ge l_n$, then AAs in the no maintenance state $S_1$ are put on mission first, and AAs in the required maintenance state $S_3$ are repaired according to Rule 11a. When the current task finishes, AAs in the opportunistic maintenance state $S_2$ are repaired according to Rule 12a, where AAs with the shortest MMT are repaired with high priority. This rule is marked "Rule 22".
3. If $m_1 < l_n \le m_1 + m_2$, then AAs in the required maintenance state $S_3$ are repaired according to Rule 11b. This rule is marked "Rule 23".
4. If $m_1 + m_2 < l_n \le m_1 + m_2 + m_4$, then AAs in the no maintenance state $S_1$ and the opportunistic maintenance state $S_2$ are put on mission first, and AAs in the required maintenance state $S_3$ are repaired according to Pro(Rul$_i$). This rule is marked "Rule 24".
5. When the interval ends, each aircraft checks its health state again and reports to the MCA. The MCA then analyses the reported data and selects the $l_n$ AAs with the shortest RUL out of all combat-ready AAs (AAs in the opportunistic maintenance state $S_2$, AAs in the no maintenance state $S_1$ and repaired AAs) to execute the mission. This rule is marked "Rule 25".
6. When the mission starts, if there are still AAs in the required maintenance state $S_3$ or the opportunistic maintenance state $S_2$ among the left-over AAs, those AAs are repaired according to Rule 11a and Rule 12a respectively. This rule is marked "Rule 26".
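The case distinction of the host negotiation rules can be sketched in Python as follows. The function names are hypothetical. Returning `None` when $l_n$ exceeds $m_1 + m_2 + m_4$ is an assumption, since the rules above do not cover that case. The tie-breaking between rules that yield the same number of repairs is also an assumption.

```python
def best_repair_rule(repairable_by_rule):
    """Pro(Rul_i): the local rule that repairs the most aircraft in
    the interval, together with that maximum number m4.
    Ties go to the alphabetically first rule (an assumption)."""
    rule = max(sorted(repairable_by_rule), key=repairable_by_rule.get)
    return rule, repairable_by_rule[rule]


def select_host_rule(m1, m2, m4, l_n):
    """Choose the host negotiation rule from the state counts.

    m1, m2: numbers of AAs in states S1 and S2
    m4:     maximum number of S3 aircraft repairable in the interval
    l_n:    number of aircraft needed for the upcoming mission
    """
    if l_n <= m1:
        return "Rule 22"
    if l_n <= m1 + m2:
        return "Rule 23"
    if l_n <= m1 + m2 + m4:
        return "Rule 24"
    return None  # demand exceeds m_a; not covered by the rules above


def select_for_mission(combat_ready_ruls, l_n):
    """Rule 25: pick the l_n combat-ready AAs with the shortest RUL."""
    ranked = sorted(combat_ready_ruls,
                    key=lambda a: (combat_ready_ruls[a], a))
    return ranked[:l_n]
```

For example, with $m_1 = 4$, $m_2 = 3$ and $m_4 = 3$ (illustrative values only), a mission needing 6 aircraft falls under Rule 23, while one needing 9 falls under Rule 24.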

Table 1. Time property of the mission
During the mission, two maintenance teams support the whole fleet. The maintainability (MMT) of the two LRMs is listed in Table 2.

Table 2. Reliability and maintainability data
To effectively verify this method, assume that some aircraft are in the no maintenance state $S_1$ while others are in the opportunistic maintenance state $S_2$, hence all aircraft can take part in the first mission, and no maintenance is considered before the first wave. The initial RULs of all aircraft in the fleet are listed in Table 3.

Table 3. The initial RUL of the fleet
Since the future missions are unknown, the maintenance thresholds can be decided as: $\tau$ is the time before the aircraft return from the next mission, and $T$ is

Table 6. Single threshold fleet maintenance strategies
This work is partially supported by the Fundamental Research Funds for the Central Universities of China (No. YWF-12-LSJC-001).
the Beijing Institute of Mechanical Industry. His current research interests include prognostics and health management, physics of failure, reliability of electronics, reliability engineering, and integrated design of product reliability and performance. He has won a 1st and a 3rd prize for National Defense Science and Technology Progress Award. He has published over 40 papers and 2 book chapters (Reliability Design and Analysis, and Diagnostics, Prognostics, and System's Health Management). He is now a member of the Editorial Board