The Study of Trends in AI Applications for Vehicle Maintenance Through Keyword Co-occurrence Network Analysis

The increasing complexity of a vehicle's digital architecture has created new opportunities to revolutionize the maintenance paradigm. The Artificial Intelligence (AI) assisted maintenance system is a promising solution to enhance efficiency and reduce costs. This review paper studies the research trends in AI-assisted vehicle maintenance via keyword co-occurrence network (KCN) analysis. The KCN methodology is applied to systematically analyze the keywords extracted from 3153 peer-reviewed papers published between 2011 and 2022. The network metrics and trend analysis uncovered important knowledge components and structure of the research field covering AI applications for vehicle maintenance. The emerging and declining research trends in AI models and vehicle maintenance application scenarios were identified through trend visualizations. In summary, this review paper provides a comprehensive high-level overview of AI-assisted vehicle maintenance. It serves as a valuable resource for researchers and practitioners in the automotive industry. This paper also highlights potential research opportunities, limitations, and challenges related to AI-assisted vehicle maintenance.


INTRODUCTION
The COVID-19 pandemic had a profound impact on the global automotive industry, notably causing disruptions in the supply chain and chip shortages that significantly slowed automobile production and narrowed profit margins (Coffin et al., 2022). Amid such global disruptions, the Industrial IoT (IIoT) plays a crucial role in sustaining automotive production (Agrawal et al., 2020). Building on this technology, many within the sector prioritized remote monitoring and health forecasting as key strategies to navigate the challenges (Umair et al., 2021). Furthermore, while the trend of generating revenues throughout a vehicle's lifecycle started before the pandemic, it has been accentuated during the COVID disruption (Singh, 2020). This heightened attention to the vehicle lifecycle, coupled with advancements in technology, necessitates a reevaluation of traditional vehicle maintenance paradigms.
Vehicle maintenance is an integral part of the vehicle lifecycle. As the digital architecture of a vehicle grows sophisticated and intelligent, new opportunities have emerged, shifting the maintenance paradigm. A general maintenance strategy usually takes one of four forms (Coleman et al., 2017). The simplest form of vehicle care is reactive maintenance, in which a vehicle is fixed only when it fails. Unpredictable failure usually causes variation in vehicle downtime. Currently, the common form of vehicle care is scheduled maintenance, in which maintenance activities are performed at pre-determined intervals, regardless of the condition of the vehicle or equipment. Scheduled maintenance brings additional inspection costs but reduces unexpected failures during operation.
Figure 1. Attributes of four types of maintenance strategies
Advanced vehicles are equipped with features that enable predictive maintenance, which allows maintenance activities to be performed as needed based on the current condition of the vehicle or equipment. This form of maintenance typically relies on sensing technologies and failure modeling. While it requires a higher sensing cost, it can reduce unexpected vehicle breakdowns and avoid the over- or under-maintenance that can occur with scheduled maintenance. On the other hand, proactive maintenance not only anticipates but actively identifies potential issues, addressing their root causes rather than just symptoms for more thorough prevention. Proactive maintenance typically requires more advanced hardware and software and can be more complex than other maintenance forms. However, its intelligence and self-correcting capability can greatly increase vehicle reliability, leading to improved performance and decreased downtime and maintenance costs. Figure 1 identifies the attributes of the four maintenance forms. There is a trade-off between the maintenance system complexity and the system downtime.
The integrated vehicle health management (IVHM) system is a maintenance architecture for ground vehicles, aircraft, and railways. Originally developed by NASA for aerospace applications, the IVHM system has been adapted for use in ground vehicles and marine transportation. A typical IVHM system uses sensor inputs to evaluate and forecast the health status of the vehicle (Esperon-Miguez et al., 2013). In 2018, the Society of Automotive Engineers (SAE) published the JA6268 standard for "Design & Run-Time Information Exchange for Health-Ready Components" to promote industry collaboration and encourage the adoption of upgraded maintenance strategies in the automotive sector (Felke et al., 2017). This standard defined six IVHM system capability levels from "no intelligence" to "self-adaptive" capabilities (presented in Figure 2) (SAE JA6268, 2023).

Figure 2. SAE-defined IVHM capability levels for automotive applications
The IVHM capability levels can be mapped to the maintenance forms described earlier. The attributes of maintenance categories also apply to the IVHM capability levels. Vehicles manufactured since the early 1980s are equipped with dashboard indicators, which constitute a level 0 health management system. These indicators alert the operator when reactive maintenance is required. The introduction of microprocessor-based controls and on-board diagnostic (OBD) systems adopted between 1980 and 1995 allowed vehicles to support level 1 diagnostic scanning tools through electric ports. The use of scan tools improved the efficiency of scheduled inspection and maintenance. The deployment of the GM OnStar telematic system enabled level 2 real-time vehicle data transmission (Yilu Zhang et al., 2009). However, by design, a level 2 system does not involve modeling, so it can only provide remote support center advice during reactive or scheduled maintenance. A level 3 system is characterized by its ability to perform diagnostic and prognostic modeling at the component level. Adding vehicle-level diagnostic and prognostic modeling to a level 3 system upgrades it to a level 4 system. These modeling techniques transform data from various sensors into information and knowledge, enabling predictive maintenance. Finally, a level 5 system achieves proactive maintenance by leveraging vehicle control feedback from the diagnostic and prognostic mechanisms. A level 5 system extends vehicle operational availability and enhances vehicle safety (SAE JA6268, 2023). While the design and configuration of diagnostic and prognostic models may differ across various IVHM capability levels, they all rely on data obtained from vehicle sensors, physical measurements, and system logs to operate effectively. Figure 3 shows some key sensors in a typical commercial vehicle. These sensors generate signals in a time series format. The sensor data is accessible through network interfaces such as the Controller Area Network (CAN) bus (Avatefipour & Malik, 2018). With increasing IVHM capability level, the data dimension grows, and the complex interdependencies between signals become challenging for diagnostic and prognostic modeling.
Artificial intelligence (AI) techniques have demonstrated superior performance in high-dimensional signal processing and sensor fusion. In recent years, many studies have explored the use of AI-assisted systems to diagnose faults in real time and predict maintenance needs across various industries (Carvalho et al., 2019; Lo et al., 2019). This review will primarily focus on data-driven approaches, which rely on machine learning algorithms as the key enablers.
Typically, a machine-learning-based predictive maintenance method takes one of two workflows. The first workflow option involves a separate feature engineering step to format the data according to the input requirements of machine learning algorithms. This step includes data preprocessing, signal transformation, feature selection, and dimension reduction, all of which contribute to the extraction of valuable information from raw signals or images (Y. Hu et al., 2022). However, vehicle sensor data, such as waveform signals and images, is usually high-dimensional, heterogeneous, and multimodal. These characteristics usually lead to highly complex and nonlinear relationships among variables associated with vehicle failures. The second workflow option is using end-to-end deep learning models that have embedded sensor fusion and feature extraction capacities (LeCun et al., 2015). These models take raw sensor data as input, obviating the need for manual feature engineering. While they have advantages over the first option, the deep learning models also have drawbacks in areas such as computational complexity, robustness to changing conditions, interpretability, and reliability in real-life applications.
Existing review papers that are relevant to this work mostly focus on a subtopic of machine learning and prognostics and health management (PHM) methodologies. For example, [...] The motivation of this keyword co-occurrence-based review paper is to reveal the trends in AI applications for vehicle maintenance and to understand the purpose and implementation environments of key AI applications. The research trends will inform promising directions for future research in this area. The remainder of this paper is structured as follows: Section 2 describes the methodology used to construct the keyword co-occurrence networks and analyze the data. Section 3 presents the results of our analysis, including the most frequently occurring keywords and their relationships, the evolution of the research field, key research topics and trends, and key applications. Section 4 provides a discussion of the real-world deployment scenarios of the reviewed applications and potential future research directions. Section 5 concludes the paper with a summary of the main findings and their implications for the field of AI applications for vehicle maintenance.

METHODOLOGY
This work aims to explore the trends of AI systems for vehicle maintenance. A conventional literature review usually systematically evaluates existing publications on a specific topic. However, identifying the most relevant studies and summarizing the research trends becomes cumbersome when the subject is broad and interdisciplinary. In this work, we adopt the keyword co-occurrence network (KCN) analysis as an effective way to quantify the research trends and identify the most relevant and up-to-date publications on AI systems for vehicle maintenance. This section explains the article collection process, keyword co-occurrence network construction, and network analysis methods.

Article Collection and Network Construction
We queried the Engineering Village and the IEEE database for relevant articles. Figure 4 presents the keyword criteria for the article collection. To be included in the review, an article's metadata, including title, keywords, and abstract, must contain at least one keyword from each of the first three boxes from the left and exclude any keyword from the fourth box. We also limited the search to peer-reviewed journal articles and conference proceedings published in English from 2011 to 2022. After removing duplicates and articles with no author-defined keywords, we identified a final set of 3153 articles. The next step is constructing a database of author-specified keywords from the selected articles. To minimize the impact of authors' language habits on the analysis, we reconciled the keywords with pipelines constructed with natural language processing toolkits in Python (Ozek et al., 2022). The preprocessing pipeline performs the following operations: (1) tokenization: breaking the strings of keywords into individual phrases, (2) inconsequential-word removal: removing commonly used words such as "the" and "a" that do not add significant meaning to the keywords, (3) stemming and lemmatization: reducing words to their base form, such as converting "clustering" and "clusters" to "cluster," (4) synonym and acronym merger: using domain knowledge to merge words with identical meanings, such as merging "CNN" and "convolution neural network." Each keyword is also categorized as "application" or "model" related for future analysis. Using these values, we built the KCN in which nodes stand for keywords, edges represent co-occurrences, and the edge weights indicate co-occurrence frequencies. Figure 5 shows an example KCN built with keywords from two articles. As new keywords from Article 2 were introduced, the KCN expanded with new nodes and edges. The weight of an edge indicates the co-occurrence count. For instance, the link between "machine learning" and "deep learning" became stronger because these keywords co-occurred in Articles 1 and 2. The keyword nodes are color-coded based on their label, making it easier to visually distinguish which machine learning model is used in which application.
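The preprocessing and network construction steps can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the lookup tables `STOPWORDS`, `LEMMAS`, and `SYNONYMS`, the function names, and the semicolon-separated keyword strings are all assumptions made for illustration, and a small dictionary stands in for the stemming and lemmatization that an NLP toolkit would provide.

```python
from collections import Counter
from itertools import combinations

# Hypothetical lookup tables; a real pipeline would rely on an NLP toolkit
# and on domain knowledge rather than these tiny dictionaries.
STOPWORDS = {"the", "a", "an", "of", "for", "and"}
LEMMAS = {"clustering": "cluster", "clusters": "cluster"}
SYNONYMS = {"cnn": "convolution neural network"}


def normalize(keyword):
    """Lowercase, drop inconsequential words, reduce to base forms, merge synonyms."""
    tokens = [t for t in keyword.lower().split() if t not in STOPWORDS]
    phrase = " ".join(LEMMAS.get(t, t) for t in tokens)
    return SYNONYMS.get(phrase, phrase)


def build_kcn(articles):
    """Return {(keyword_a, keyword_b): weight}.

    The weight is the number of articles in which both keywords appear.
    Each pair is stored once, with its two keywords in alphabetical order.
    """
    edges = Counter()
    for raw in articles:
        # Tokenization: split the keyword string into individual phrases.
        keywords = sorted({normalize(k) for k in raw.split(";") if k.strip()})
        edges.update(combinations(keywords, 2))
    return edges


# Two hypothetical articles, echoing the two-article example of Figure 5.
articles = [
    "machine learning; deep learning; CNN",
    "Machine Learning; Deep Learning; the fault diagnosis",
]
kcn = build_kcn(articles)
```

In this toy example the edge between "machine learning" and "deep learning" receives a weight of 2 because the pair appears in both articles, while every other edge keeps a weight of 1.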

Network Analysis Parameters
Network parameters provide quantitative insights into the structure and relationships of the keywords in the KCN. This section outlines the parameters used to analyze the KCN and interprets each parameter in the context of literature review.
Degree refers to the total number of links connecting a single node (node $i$) to other nodes in the network. It measures the relative importance of a node compared to other nodes in this network. The degree of a node $i$ is defined as follows:
$$k_i = \sum_{j \in V_i} a_{ij}$$
where $V_i$ denotes the set of nodes connected to node $i$ in this network. The indicator variable $a_{ij}$ takes the value of 1 when nodes $i$ and $j$ are connected and 0 when there is no connection between them. In the context of our KCN, the degree of a keyword is the total number of unique keywords with which it co-occurs. The more often scholars include a keyword in their studies, the more connections this keyword will potentially have to other keywords. Therefore, the degree partially reflects a keyword's popularity within the research field.
Strength is the sum of the weights of all links connected to node $i$. It is a measure of a node's importance in a weighted network. The strength of a node $i$ is defined as follows:
$$s_i = \sum_{j \in V_i} w_{ij}$$
where $V_i$ is the set of nodes connected to node $i$ and $w_{ij}$ is the weight of the edge $e_{ij}$. The edge weight in the KCN represents the number of times a pair of keywords (represented by nodes $i$ and $j$) co-occurs. The degree of a keyword measures the number of unique co-occurrences it has, while the strength of a keyword measures the total number of co-occurrences, not necessarily unique, it has. In the context of a KCN, both the degree and the strength of a keyword (node) reflect its centrality and influence within the research field. Keywords with higher degree are more likely to be associated with multiple research topics, while keywords with higher strength are likely to be more popular, indicating that they are referenced frequently.
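As a minimal sketch of the two measures just described, degree and strength can be computed directly from an edge-weight table. The toy keywords and co-occurrence counts below are hypothetical and only serve to make the example concrete.

```python
from collections import defaultdict

# Toy weighted KCN (hypothetical keywords and counts):
# {(keyword_a, keyword_b): co-occurrence count}
edges = {
    ("deep learning", "machine learning"): 2,
    ("deep learning", "fault diagnosis"): 1,
    ("fault diagnosis", "machine learning"): 3,
    ("battery", "machine learning"): 1,
}


def degree_and_strength(edges):
    """Degree counts distinct neighbors; strength sums the incident edge weights."""
    degree = defaultdict(int)
    strength = defaultdict(int)
    for (a, b), weight in edges.items():
        for node in (a, b):
            degree[node] += 1
            strength[node] += weight
    return dict(degree), dict(strength)


degree, strength = degree_and_strength(edges)
```

Here "machine learning" has three distinct neighbors (degree 3) but six co-occurrences in total (strength 6), which illustrates the difference between unique and total co-occurrences.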
Average weight as a function of endpoint degree examines the relationship between the degree of a keyword and the average weight of its connections. Given nodes $i$ and $j$, we visualize the relationship by plotting the weight $w_{ij}$ of the edge between them against the product of their degrees $k_i$ and $k_j$. However, different combinations of degrees can lead to the same product of degrees, for example, 2 × 50 = 10 × 10 = 100. To capture the general trend, we take the average weights of all edges with the same endpoint degree (product of end node degrees). The relationship is therefore expressed as $\langle w_{ij} \rangle$ versus $k_i k_j$, where $\langle \cdot \rangle$ denotes the average operation. In the KCN context, if there is a positive correlation between average weight and endpoint degree, we conclude that popular keywords tend to have heavier-weighted connections with other keywords. On the contrary, a negative correlation indicates that less popular keywords tend to have heavier connections. This relationship provides insights into the structure and dynamics of the research field.
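The averaging step can be sketched as follows: edges are grouped by the product of their endpoint degrees, and the weights in each group are averaged. The toy network and the function name are assumptions for illustration.

```python
from collections import defaultdict

# Toy weighted KCN (hypothetical keywords and counts).
edges = {
    ("deep learning", "machine learning"): 2,
    ("deep learning", "fault diagnosis"): 1,
    ("fault diagnosis", "machine learning"): 3,
    ("battery", "machine learning"): 1,
}


def average_weight_by_endpoint_degree(edges):
    """Map each endpoint-degree product to the mean weight of the edges sharing it."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    buckets = defaultdict(list)
    for (a, b), weight in edges.items():
        buckets[degree[a] * degree[b]].append(weight)
    return {product: sum(ws) / len(ws) for product, ws in sorted(buckets.items())}


avg_weight = average_weight_by_endpoint_degree(edges)
```

Plotting the keys of `avg_weight` against its values would give the kind of trend curve described above; in the toy network the two edges with degree product 6 are averaged into a single point.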
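Average weighted nearest neighbor's degree measures the average number of links that a node's closest neighbors have while considering the strength of those links. The average weighted nearest neighbor's degree of a node $i$ is defined as follows:
$$k_{nn,i}^{w} = \frac{1}{s_i} \sum_{j \in V_i} w_{ij} k_j$$
where $V_i$ is the set of nodes connected to node $i$, $s_i$ is the strength of node $i$, $w_{ij}$ is the weight of the edge between nodes $i$ and $j$, and $k_j$ is the degree of node $j$. To reveal the general characteristics of the neighbor's degree, we plot $k_{nn,i}^{w}$ against the keyword degree $k_i$. In the KCN context, if $k_{nn,i}^{w}$ increases with increasing keyword degree $k_i$, we conclude that high-degree keywords tend to connect with other high-degree keywords.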
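A minimal sketch of this measure is given below, assuming the common formulation in which each neighbor's degree is weighted by the connecting edge weight and the sum is divided by the node's strength. The toy network and function name are hypothetical.

```python
from collections import defaultdict

# Toy weighted KCN (hypothetical keywords and counts).
edges = {
    ("deep learning", "machine learning"): 2,
    ("deep learning", "fault diagnosis"): 1,
    ("fault diagnosis", "machine learning"): 3,
    ("battery", "machine learning"): 1,
}


def weighted_neighbor_degree(edges):
    """For each node, average the neighbors' degrees weighted by edge weight."""
    adjacency = defaultdict(dict)
    for (a, b), weight in edges.items():
        adjacency[a][b] = weight
        adjacency[b][a] = weight
    result = {}
    for node, neighbors in adjacency.items():
        strength = sum(neighbors.values())
        # len(adjacency[j]) is the degree of neighbor j.
        weighted_sum = sum(w * len(adjacency[j]) for j, w in neighbors.items())
        result[node] = weighted_sum / strength
    return result


knn = weighted_neighbor_degree(edges)
```

In the toy network, "battery" has a single neighbor of degree 3, so its value is 3.0, whereas the hub "machine learning" is connected mostly to lower-degree keywords and scores below 2.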
Weighted clustering coefficient measures the extent to which the neighbors of a node are interconnected. It reflects the local cohesiveness of a node among its neighbors. In this work, the weighted clustering coefficient of a node $i$ is defined as follows (Onnela et al., 2005):
$$\tilde{C}_i = \frac{2}{k_i (k_i - 1)} \sum_{j, h \in V_i,\; j < h} \left( \tilde{w}_{ij} \, \tilde{w}_{jh} \, \tilde{w}_{hi} \right)^{1/3}$$
where $k_i$ is the degree of node $i$, $V_i$ is the set of nodes connected to node $i$, and $\tilde{w}_{ij}$, $\tilde{w}_{jh}$, and $\tilde{w}_{hi}$ are the weights $w_{ij}$, $w_{jh}$, and $w_{hi}$ normalized by dividing them by the maximum weight in the network. The term $\left( \tilde{w}_{ij} \, \tilde{w}_{jh} \, \tilde{w}_{hi} \right)^{1/3}$ is the intensity of the triangle subgraph connecting nodes $i$, $j$, and $h$. The interconnectivity of node $i$'s neighborhood is calculated by summing up the intensity of all triangle subgraphs that involve node $i$. Finally, the summation of subgraph intensity is normalized between 0 and 1. A value of 1 indicates that all the neighbors of node $i$ are fully connected with the maximum network weight, while a value of 0 indicates that none of the neighbors of node $i$ are connected. In the KCN context, a keyword with a high weighted clustering coefficient indicates a strong local connectivity and cohesiveness.
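The triangle-intensity computation can be sketched in a few lines of Python. This is an illustrative implementation under the description above (weights normalized by the network maximum, geometric mean of the three edge weights per triangle, normalization by the number of possible neighbor pairs); the toy network and function name are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Toy weighted KCN (hypothetical keywords and counts).
edges = {
    ("deep learning", "machine learning"): 2,
    ("deep learning", "fault diagnosis"): 1,
    ("fault diagnosis", "machine learning"): 3,
    ("battery", "machine learning"): 1,
}


def weighted_clustering(edges):
    """Sum triangle intensities around each node and normalize to [0, 1]."""
    max_weight = max(edges.values())
    adjacency = defaultdict(dict)
    for (a, b), weight in edges.items():
        adjacency[a][b] = adjacency[b][a] = weight / max_weight
    coefficient = {}
    for node, neighbors in adjacency.items():
        k = len(neighbors)
        if k < 2:
            coefficient[node] = 0.0  # no neighbor pair, hence no triangle
            continue
        total = 0.0
        for j, h in combinations(neighbors, 2):
            if h in adjacency[j]:  # j and h are linked, so a triangle exists
                total += (neighbors[j] * neighbors[h] * adjacency[j][h]) ** (1 / 3)
        coefficient[node] = 2 * total / (k * (k - 1))
    return coefficient


clustering = weighted_clustering(edges)
```

A triangle whose three edges all carry the maximum weight yields a coefficient of 1 for each of its nodes, while a keyword with a single neighbor, such as "battery" in the toy network, yields 0.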

RESULTS
Our search criteria yielded 3153 papers published between 2011 and 2022. To facilitate trend analysis, we divided the 12-year span into four 3-year windows and built a KCN for each time window. In this section, we present the network metrics from each time window and comment on the evolution of the research field. We then visualize the research trends in applications and models over the years. Finally, we present the co-occurrence matrix of the top 10 applications and models from each time window. These findings provide valuable insights into the trends of AI models for vehicle maintenance and identify the key applications.
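The windowing step can be sketched as follows. The record format (a publication year paired with a keyword string), the function names, and the sample records are assumptions for illustration; only the 2011 to 2022 span and the 3-year window width come from the text.

```python
from collections import defaultdict


def window_label(year, start=2011, end=2022, width=3):
    """Map a publication year to its window label, e.g. 2015 -> '2014-2016'."""
    if not start <= year <= end:
        raise ValueError(f"year {year} is outside {start}-{end}")
    first = start + ((year - start) // width) * width
    return f"{first}-{first + width - 1}"


def group_by_window(records):
    """Group (year, keyword_string) records by time window."""
    windows = defaultdict(list)
    for year, keywords in records:
        windows[window_label(year)].append(keywords)
    return dict(windows)


# Hypothetical records for illustration only.
records = [
    (2011, "fault diagnosis; svm"),
    (2015, "fault diagnosis; neural network"),
    (2021, "remaining useful life; deep learning"),
    (2022, "battery; deep learning"),
]
windows = group_by_window(records)
```

Each list in `windows` could then be passed to a network construction routine to obtain one KCN per time window.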

Network Metrics
Table 1 and Figure 6 show that the number of articles, keywords, and co-occurrences in this research field has increased steadily over the years. Compared to the initial 2011-2013 period, the number of articles, keywords, and co-occurrences in the most recent 2020-2022 period has grown by a factor of 4.71, 3.68, and 3.26, respectively. This overall trend suggests that the research field is expanding as more and more researchers are attracted to the field, exploring new AI models for vehicle maintenance applications. Although the growth rate of keywords and co-occurrences is lower than that of articles, this trend suggests that some research topics are gaining depth over time. In addition to exploring the application of new AI models to new vehicle maintenance tasks, researchers are gaining a more nuanced and multifaceted understanding of specific models and applications.
Table 1 and Figure [...] The steady growth of maximum strength, degree, and weight suggests the emergence of highly connected keywords in the field of AI applications for vehicle maintenance. Figure 7 supports this finding, as the degree, strength, and weight distributions in all time windows are right-skewed with numerous outliers having large values. These network metrics help us identify such highly connected keywords and topics, which we will discuss in the following subsection.

Research Topics Trends
Although network metrics provide valuable insights into the development of the research landscape, it is crucial to examine the trends at a topic level to understand the direction of the research interests. In this section, we will delve into the trends of specific research topics to uncover emerging trends and research hotspots.
• There is an increasing interest in applying AI systems to electric and autonomous vehicles. This trend can be attributed to the fact that modern electric and autonomous vehicles are usually equipped with various sensors and advanced perception systems that rely on AI technologies. In contrast, integrating AI systems into traditional gasoline cars can be challenging, as they may not have the necessary sensors or computational power to support such systems.
• Applying AI systems to vehicle maintenance is well aligned with the Internet of Things (IoT) scope. Various existing frameworks and technologies from other IoT research, such as data management, cloud computing, and edge computing, can be adapted to vehicle maintenance applications. However, there is a gap between experimental success and practical application. It is important to study the impact of AI-based vehicle maintenance systems on factors such as maintenance costs, power demand, and overall system reliability in real-world scenarios.

Frequently Co-occurring Models and Applications
This paper aims to identify popular applications of AI systems for vehicle maintenance and examine their implementation environments. To achieve this objective, we created a keyword co-occurrence matrix (Figure 13) that displays the top 10 most frequently occurring AI models (X-axis) and applications (Y-axis) over the years. The matrix uses a color-coding scheme to represent the co-occurrence frequency of keyword pairs, with darker blue indicating a higher frequency of occurrence.
[...], 2006). Researchers have applied SVM to achieve fault diagnosis on both the component and vehicle levels. The targeted fault events include the shift solenoid and speed sensor fault in an automatic transmission system (Du et al., 2019), the malfunction of the lubrication system in a diesel engine (Y. Wang et al., 2016), the degradation of the torque converter clutch (Jia et al., 2019), loosened connectors from light assemblies (W. Hu et al., 2013), the DC serial arc fault in an EV power system (Xia et al., 2019), the single cylinder misfire fault (Xu et al., 2018), the lithium-ion battery aging (Y. Li et al., 2022), the abnormal increase in friction and malfunction in the electric power steering system (Ghimire et al., 2018), and the open switch fault in an EV inverter (Mwangi et al., 2022).
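The model-by-application co-occurrence matrix described at the start of this subsection can be sketched as follows. The keyword labels, sample articles, and function name are hypothetical; the sketch only assumes that each keyword carries a "model" or "application" label, as described in the methodology, and that ties in frequency are broken alphabetically.

```python
from collections import Counter

# Hypothetical keyword labels for illustration.
LABELS = {
    "svm": "model",
    "deep learning": "model",
    "fault diagnosis": "application",
    "rul prediction": "application",
}


def cooccurrence_matrix(articles, labels, top_n=10):
    """Count how often each top application co-occurs with each top model."""
    model_freq, app_freq, pair_freq = Counter(), Counter(), Counter()
    for keywords in articles:
        unique = set(keywords)
        models = [k for k in unique if labels.get(k) == "model"]
        apps = [k for k in unique if labels.get(k) == "application"]
        model_freq.update(models)
        app_freq.update(apps)
        pair_freq.update((a, m) for a in apps for m in models)

    def top(freq):
        # Most frequent first; ties broken alphabetically for reproducibility.
        ranked = sorted(freq.items(), key=lambda item: (-item[1], item[0]))
        return [name for name, _ in ranked[:top_n]]

    top_apps, top_models = top(app_freq), top(model_freq)
    matrix = [[pair_freq[(a, m)] for m in top_models] for a in top_apps]
    return top_apps, top_models, matrix


articles = [
    ["svm", "fault diagnosis"],
    ["svm", "fault diagnosis", "deep learning"],
    ["deep learning", "rul prediction"],
]
apps, models, matrix = cooccurrence_matrix(articles, LABELS)
```

Each row of `matrix` corresponds to an application and each column to a model, so the cell values could be mapped to a color scale to produce a heat map similar in spirit to Figure 13.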
Figure 14. In-situ test bench of a vehicle cylinder (Xu et al., 2018)
Training data come from three main sources: computer simulation, lab setups, and real-life vehicle experiments. For simulation-based approaches, researchers can either develop a theoretical model of a vehicle part and simulate its dynamics using tools such as MATLAB Simulink (Ghimire et al., 2018; Mwangi et al., 2022), or use specialized vehicle simulators like VE-DYNA (Nieto González, 2018). In a lab setup, the target vehicle part is placed on a test bench and fitted with additional sensors to capture data during controlled experiments. The sensor data collected often go through signal processing for feature engineering, and the resulting features are then used to train the SVM models.
[...] (Haque et al., 2018), battery thermal runaway faults (D. Li et al., 2022), and lithium-ion battery degradation (Cui et al., 2020; Ke et al., 2021; S. W. Kim et al., 2022). In addition to vehicle part failure diagnosis, deep learning models are also commonly used to diagnose faults on the vehicle and fleet level.
[...] Ribeiro et al., 2016). In Figure 15, the LIME technique is used to explain the output of a deep sequential neural network with 9 layers. The model predicted the instance as error type G2029043, and LIME presented the contribution of key features to the prediction. SHAP (SHapley Additive exPlanations) is another model-agnostic method for interpreting the predictions of machine learning models (Lundberg & Lee, 2017). The SHAP method produces a set of explanations for each prediction, with each explanation representing the contribution of a particular feature to the prediction. The explanations can be used to gain insights into the model's behavior and to identify which features are driving the predictions.

Table 3. Interpretable deep learning methods in the diagnosis domain
Method | Input | Approach | Application
[...] | [...] | Compress the input data to a latent space that serves as a basis to build a self-explanatory map | Aircraft engine remaining useful life estimation
AGCNN (Liu et al., 2021) | Raw signal after sliding window processing | A feature-attention-based bidirectional GRU CNN model | Turbofan engine remaining useful life estimation
mFG-CAM (M. S. Kim et al., 2022) | Frequency-domain raw signal | Frequency-Domain-Based Gradient Class Activation Mapping | Bearing fault diagnosis
Explainable-GWDN (T. Li, Sun, et al., 2022) | Graph data transformed from time-domain signal | The graph wavelet denoising convolution is proposed based on the discrete graph wavelet frame to achieve multiscale feature extraction | Rolling bearing diagnosis
Table 3 summarizes the most recent works related to interpretable deep learning methods in the diagnosis domain. In summary, in the current PHM research domain, the interpretable methods mainly focus on integrating traditional signal processing technology into deep learning models so that users can interpret predictions made from raw sensor signals, for example through feature importance estimation.
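The feature-contribution idea behind SHAP, mentioned earlier in this subsection, can be illustrated with an exact Shapley value computation on a toy model. This is a sketch of the underlying concept only: it enumerates every feature ordering, which is feasible for a handful of features, whereas practical SHAP implementations rely on approximations. The toy `predict` function, the input, and the all-zero baseline are assumptions for illustration.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]  # switch feature i from its baseline to its actual value
            value = predict(current)
            phi[i] += value - previous
            previous = value
    return [total / len(orderings) for total in phi]


def predict(v):
    """Hypothetical model: one additive term plus an interaction between two features."""
    return 2 * v[0] + 3 * v[1] * v[2]


contributions = shapley_values(predict, x=[1, 1, 1], baseline=[0, 0, 0])
```

The additive feature receives its full effect, the two interacting features share their joint effect equally, and the contributions sum to the difference between the prediction for the instance and the prediction for the baseline.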

DISCUSSION
Between 2011 and 2022, the number of publications focused on AI applications for vehicle maintenance experienced a nearly fivefold increase. To provide context for this growth, we also examined the general field of vehicle maintenance research, excluding the constraint of 'machine learning'. Using IEEE Xplore and Engineering Village for this analysis, we found that the number of articles increased approximately threefold from 2011 to 2022. This comparative data further emphasizes the growing interest in incorporating AI into vehicle maintenance. This evolving focus has led to a rapidly expanding and decentralized research network over time. By analyzing the emerging and declining keywords, we found that the problem scope for modeling has shifted from binary fault classification to more challenging tasks such as vehicle component remaining useful life (RUL) prediction and vehicle-system-level maintenance action recommendation. Classic machine learning models have matured, and deep learning and ensemble learning models have come into the spotlight of recent studies. In terms of application trends, there has been a surge in the research on AI systems for electric and autonomous vehicles, as these vehicles often come equipped with modern sensors and computational systems. The development of IoT technologies in other sectors has also inspired researchers in the automotive industry. However, limited efforts have been made to study the real-world deployment of IoT technologies and associated issues, such as cost, reliability, and cybersecurity.
A real-life vehicle maintenance AI system development lifecycle (Iyengar & Portilla, 2022) is presented in Figure 16. During each stage of the lifecycle, a variety of unique challenges arise from the nature of the vehicle system. Yet, researchers and practitioners can leverage knowledge and best practices from other industries, such as aerospace and manufacturing, to effectively address these challenges. To that end, we provide the following prescriptive suggestions and resources, organized by each stage of the AI system development lifecycle.

Design the AI System
The design of an AI system for vehicle maintenance requires careful consideration of the availability and modality of vehicle sensors. However, a trade-off often emerges between a clean-sheet design, where new sensing and communication systems are installed primarily for health management purposes, and a retrofit design, where existing sensing and communication structures are modified to enhance the health management capacity. According to the Aerospace Recommended Practice ARP6407 (SAE ARP6407, 2019), while retrofitting is the prevalent practice for aerospace and ground vehicle applications, it presents challenges such as the significant cost of system recertification, compromised data quality from existing sensors not primarily meant for the intended purpose, and reduced inherent system reliability.
Figure 16. The AI-system development lifecycle with the parties involved and key steps

Train the AI Model
To effectively train an AI model, the evaluation process must accurately reflect its performance in real-world applications. This requires specialized performance evaluation metrics that go beyond general metrics like the failure detection rate. Aslanpour et al. (2020) analyzed various real-world metrics to evaluate the performance of cloud, fog, and edge computing paradigms, which can be adapted to the vehicle maintenance scenario. They noted that metrics like fault detection, cost/profit, resource utilization, delay/latency time, and scalability become more critical as edge models evolve from private to federated. Additionally, system throughput, number of orchestration decisions, and energy consumption are equally important for all models.

Build the AI Service
The deployment of the trained AI service requires an edge computing platform that seamlessly integrates into the vehicle's hardware and software systems. However, the mobility of vehicles presents unique challenges such as network instability or complete disconnection. In addition, the AI system must be designed to operate within the constraints of limited computational resources, intermittent power supply, and harsh environmental conditions. To address these challenges, it is valuable to leverage the insights from other AI systems deployed in vehicular environments. Tong et al. (2019) reviewed the state-of-the-art AI applications in the vehicle-to-everything (V2X) system, which mainly focus on enhancing traffic efficiency, road safety, and energy efficiency. Insights can be gained about communication technologies for vehicles, such as dedicated short-range communication (DSRC) and long-term evolution (LTE) cellular communication, by analyzing these V2X applications.

Publish the AI Service
The AI-assisted vehicle health management system relies on a sensor network. However, once the AI service is published, security becomes a major concern, as the system is vulnerable to potential cyber attacks on individual sensor devices, the edge computing device, and the communication network. It is important to detect when a component has been compromised. While extensive studies have been conducted on cyber attacks in sensor networks, the security frameworks developed for general-purpose computing systems cannot be directly migrated to vehicular AI systems due to different network topologies and communication protocols (Xiao et al., 2019). Therefore, specific security protocols need to be designed and tested for AI-assisted vehicle health management systems.

Deploy the AI Service
Deploying an AI service requires a well-trained workforce that can operate the system and take vehicle maintenance actions accordingly. It is important to consider the unique skills and knowledge required to operate an AI-assisted vehicle health management system. For example, workers may need to be trained on how to interpret and analyze data generated by the system and how to use the system's interface to access information and control its functions. It is worth referring to research investigating workforce skill gaps in the rapidly digitalizing working environment (Bühler et al., 2022; G. Li et al., 2021) and developing training programs and product manuals accordingly.

Update the AI Model
To ensure the continued effectiveness of the AI-assisted vehicle health management system, it is necessary to regularly update and maintain the AI model. This requires monitoring and detection of both sensor and model aging. To ensure seamless vehicle operation, it is essential to integrate the AI-system maintenance into the current scheme of vehicle maintenance.
Industry solutions are widely available at each step of this lifecycle. Commercial edge computing platforms, such as IBM Edge Application Manager (IBM, 2022), NVIDIA EGX platform (NVIDIA, n.d.), and Microsoft Azure IoT Edge (Azure, n.d.), provide high-performance edge computing and network hardware and streamlined data management and modeling software. Various industry collaboration programs, such as the automotive edge computing consortium (AECC) (AECC, n.d.) and health-ready components and systems (HRCS) charter, are also established to collaboratively develop AI solutions for vehicle maintenance.
While industry solutions provide practical tools and frameworks for implementing AI systems in the field, academic researchers are responsible for fundamental research that addresses issues at each step of the lifecycle.This includes analyzing AI-system integration, model deployment and maintenance, and evaluating trade-offs and interactions among stages.This holistic approach ensures the development of reliable, robust, and customized AI systems for vehicle maintenance applications.

CONCLUSION
This paper conducted a keyword co-occurrence analysis of the literature on AI systems for vehicle maintenance applications. We collected keywords from a total of 3153 peer-reviewed articles published between 2011 and 2022. The centrality, affinity, and cohesiveness of the keywords were examined to understand the knowledge structure and growth momentum of this research field.
We explored trends in AI systems for vehicle maintenance from different angles. We categorized keywords into the model and application groups and visualized the frequency of emerging and declining keywords. We also created co-occurrence matrices of the top 10 applications and models from each time window. The results revealed a shift in the problem setting for modeling, from binary fault classification to more challenging tasks such as remaining useful life (RUL) prediction for vehicle components and maintenance action recommendation at the vehicle system level. Classic machine learning models have matured, and deep learning and ensemble learning models have gained prominence in recent research. Moreover, researchers are increasingly focusing on developing AI systems for electric and autonomous vehicles, which come equipped with modern sensors and computational systems. We observe that IoT technologies are gaining attention in the automotive industry, but limited research has been conducted on the deployment of IoT technologies and associated issues, such as cost, reliability, and cybersecurity analysis.

Figure 4. Keyword criteria for querying relevant articles from published literature

Figure 5. The process of building a KCN with keywords from two articles
After preprocessing the keywords to a standard format, we calculated their co-occurrence values, which indicate how often two keywords appear together in an article. Using these values, we built the KCN, in which nodes stand for keywords, edges represent co-occurrences, and edge weights indicate co-occurrence counts. Figure 5 shows an example KCN built with keywords from two articles. As new keywords from Article 2 were introduced, the KCN expanded with new nodes and edges. For instance, the link between "machine learning" and "deep learning" became stronger because these keywords co-occurred in both Article 1 and Article 2. The keyword nodes are color-coded based on their label, making it easier to visually distinguish which machine learning model is used in which application.
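The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the `normalize` step (lowercasing and whitespace collapsing) and the two keyword lists are hypothetical stand-ins for the actual preprocessing and data.

```python
from collections import Counter
from itertools import combinations


def normalize(keyword):
    """Lowercase and collapse whitespace so spelling variants map to one node."""
    return " ".join(keyword.lower().split())


def build_kcn(articles):
    """Return (nodes, edges) for a list of per-article keyword lists.

    edges maps an alphabetically ordered keyword pair to the number of
    articles in which both keywords appear (the edge weight).
    """
    nodes = set()
    edges = Counter()
    for keywords in articles:
        unique = sorted({normalize(k) for k in keywords})
        nodes.update(unique)
        for pair in combinations(unique, 2):
            edges[pair] += 1
    return nodes, edges


# Hypothetical keyword lists for two articles
article_1 = ["Machine Learning", "Deep Learning", "Fault Diagnosis"]
article_2 = ["machine learning", "deep  learning", "Remaining Useful Life"]

nodes, edges = build_kcn([article_1, article_2])
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

In this toy network, the pair ("deep learning", "machine learning") receives weight 2 because it occurs in both articles, while every other edge has weight 1.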
Figure 7 presents the changes in average and maximum network strength, degree, and weight. The average strength and degree leaped from 2011-2013 to 2014-2016, decreased considerably in 2017-2019, and increased slightly in 2020-2022. The trend suggests that certain topics attracted research interest, and that connections between such topics grew stronger from 2011-2013 to 2014-2016. The most rapid increase in keywords then occurred from 2014-2016 to 2017-2019. This expanded the research landscape and produced a less interconnected network, which lowered the average strength and degree. In the most recent time window (2020-2022), the slight increase in average strength and degree indicates that certain topics are gaining more attention and forming stronger connections.
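As an illustration of the metrics discussed here, the sketch below computes degree, strength, and weight summaries for a small weighted network. It assumes the usual definitions (degree is the number of distinct neighbors of a node, strength is the sum of the weights of its incident edges); the edge list is hypothetical and must be non-empty.

```python
from collections import defaultdict


def degree_and_strength(edges):
    """Compute node degree and node strength from a weighted edge dict."""
    degree = defaultdict(int)
    strength = defaultdict(int)
    for (u, v), weight in edges.items():
        for node in (u, v):
            degree[node] += 1          # one more distinct neighbor
            strength[node] += weight   # accumulate incident edge weight
    return dict(degree), dict(strength)


def summarize(edges):
    """Average and maximum degree, strength, and edge weight."""
    degree, strength = degree_and_strength(edges)
    n_nodes = len(degree)
    return {
        "avg_degree": sum(degree.values()) / n_nodes,
        "max_degree": max(degree.values()),
        "avg_strength": sum(strength.values()) / n_nodes,
        "max_strength": max(strength.values()),
        "avg_weight": sum(edges.values()) / len(edges),
        "max_weight": max(edges.values()),
    }


# Hypothetical co-occurrence counts for one time window
toy_edges = {
    ("deep learning", "machine learning"): 2,
    ("deep learning", "fault diagnosis"): 1,
    ("fault diagnosis", "machine learning"): 1,
    ("deep learning", "remaining useful life"): 1,
    ("machine learning", "remaining useful life"): 1,
}

stats = summarize(toy_edges)
print(stats)
```

Running the same summary once per time window would give the kind of per-window comparison described in the text.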

Figure 8. Average weight as a function of endpoint degree

Figure 9. Average weighted nearest neighbor's degree as a function of node degree
Figure 9 presents the relationship between the node degree and its average weighted neighbor's degree. Except for the most recent time window (2020-2022), all time windows show a slightly increasing trend. The most significant increase occurs in the 2014-2016 period. This trend suggests that nodes with high degrees are more likely to connect with other high-degree nodes, forming popular keyword hubs. In the most recent time window (2020-2022), no clear correlation is observed between high-degree nodes and their neighbors, suggesting that these nodes are more likely to connect with a diverse range of nodes rather than forming concentrated keyword hubs.
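A minimal sketch of how such a curve could be computed is given below. It assumes one commonly used definition of the weighted average nearest neighbor's degree, namely the neighbor degrees averaged with edge weights and normalized by node strength, k_nn_w(i) = (1 / s_i) * sum_j w_ij * k_j. Whether the study used exactly this definition is an assumption, and the edge list is hypothetical.

```python
from collections import defaultdict


def weighted_neighbor_degree(edges):
    """Return (degree, knn_w) where knn_w[i] = sum_j(w_ij * k_j) / s_i."""
    adjacency = defaultdict(dict)
    for (u, v), weight in edges.items():
        adjacency[u][v] = weight
        adjacency[v][u] = weight
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    knn_w = {}
    for node, nbrs in adjacency.items():
        strength = sum(nbrs.values())
        knn_w[node] = sum(w * degree[nbr] for nbr, w in nbrs.items()) / strength
    return degree, knn_w


def average_by_degree(degree, values):
    """Average a per-node quantity over all nodes that share the same degree."""
    buckets = defaultdict(list)
    for node, k in degree.items():
        buckets[k].append(values[node])
    return {k: sum(vals) / len(vals) for k, vals in buckets.items()}


# Hypothetical co-occurrence counts
toy_edges = {
    ("deep learning", "machine learning"): 2,
    ("deep learning", "fault diagnosis"): 1,
    ("fault diagnosis", "machine learning"): 1,
    ("deep learning", "remaining useful life"): 1,
    ("machine learning", "remaining useful life"): 1,
}

degree, knn_w = weighted_neighbor_degree(toy_edges)
by_degree = average_by_degree(degree, knn_w)
print(by_degree)
```

An increasing curve of `by_degree` against degree would correspond to the hub-forming behavior described above; a flat or decreasing curve would correspond to high-degree nodes connecting to a more diverse set of neighbors.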

Figure 10. Weighted clustering coefficient as a function of node degree
Figure 10 illustrates the relationship between the node degree and its weighted clustering coefficient for all time windows. The declining trend indicates that keywords with a small number of connections tend to form clusters with other low-degree keywords, while popular keywords connect with both popular and less popular keywords. In recent years, the magnitude of the weighted clustering coefficient has decreased, and the declining trend has become more pronounced with time. It suggests that the research network

Figure 11 and Figure 12 show the changes in the frequency rank of keywords grouped by AI models and vehicle maintenance applications, respectively. Keywords are ranked by frequency of occurrence in descending order, and the rank change is plotted as a slope connecting the rank in the initial time window (2011-2013) to the rank in the most recent time window (2020-2022). It is worth noting that a mild declining trend for a specific keyword does not necessarily indicate a decrease in its research significance. Instead, the keyword may simply be less prevalent than other emerging keywords while remaining crucial and relevant in the domain.
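The ranking step can be illustrated with a short sketch. The keyword counts below are hypothetical, and the alphabetical tie-breaking rule is an assumption made only to keep the example deterministic.

```python
def frequency_ranks(keyword_counts):
    """Rank keywords by descending frequency; rank 1 is the most frequent.

    Ties are broken alphabetically (an assumption for this sketch).
    """
    ordered = sorted(keyword_counts, key=lambda k: (-keyword_counts[k], k))
    return {keyword: position + 1 for position, keyword in enumerate(ordered)}


def rank_changes(first_window, last_window):
    """Rank change for keywords present in both windows.

    A positive value means the keyword moved up the ranking (emerging);
    a negative value means it moved down (declining).
    """
    first = frequency_ranks(first_window)
    last = frequency_ranks(last_window)
    shared = first.keys() & last.keys()
    return {keyword: first[keyword] - last[keyword] for keyword in shared}


# Hypothetical keyword frequencies in the first and last time windows
counts_2011_2013 = {"support vector machine": 30, "neural network": 25, "deep learning": 5}
counts_2020_2022 = {"deep learning": 80, "neural network": 40, "support vector machine": 35}

changes = rank_changes(counts_2011_2013, counts_2020_2022)
for keyword, change in sorted(changes.items(), key=lambda item: -item[1]):
    print(f"{keyword}: {change:+d}")
```

In this toy data, "support vector machine" drops two ranks even though its raw count rises from 30 to 35, which mirrors the caveat above: a falling rank reflects relative prevalence, not necessarily reduced activity.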

Figure 11. Emerging and declining keywords in the model category from 2011-2013 to 2020-2022

Figure 12. Emerging and declining keywords in the application category from 2011-2013 to 2020-2022

Figure 13. Keyword co-occurrence matrix of top 10 most frequent applications vs. models over the years
Figure 14, for instance, shows a lab setup for collecting data for cylinder misfire fault classification. While lab data provides controlled conditions, the closest approximation to real-world scenarios is obtained through experiments on vehicles (Y. Wang et al., 2016).
Ren et al. (2019), Gültekin et al. (2022a), and K. Kim et al. (2020) exploited the data fusion benefit of deep neural networks and developed multisensory fault diagnosis systems for autonomous vehicles. Al-Zeyadi et al. (2020), Gherbi et al. (2022), and Gültekin et al. (2022b) applied the cloud computing structure from the Internet of Things technologies and constructed frameworks for fleet-level model training and deployment.

Figure 15. An example of LIME applied to vehicle diagnostic prediction (Al-Zeyadi et al., 2020)
One of the drawbacks of deep learning models is their "black-box" nature. To address this limitation, researchers have integrated model explainers such as Local Interpretable Model-agnostic Explanations (LIME) into the front-end user interface (Al-Zeyadi et al., 2020). Diagnostic explainers powered by LIME can explain the reasoning behind the diagnosis of a particular fault in a vehicle, providing transparency and improving trust in the model's outputs (Ribeiro et al., 2016). In Figure 15, the LIME technique is used to explain the output of a deep sequential neural network with 9 layers. The model predicted the instance as error type G2029043, and LIME presented the contribution of key features to the prediction. SHAP (SHapley Additive exPlanations) is another model-agnostic method for interpreting the predictions of machine learning models (Lundberg & Lee, 2017). The SHAP method produces a set of explanations for each prediction, with each explanation representing the contribution of a particular feature to the prediction. The explanations can be used to gain insights into the model's behavior and to identify which features are driving the predictions.
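To make the idea of per-feature contributions concrete, the sketch below computes exact Shapley values for a tiny model by enumerating all feature subsets. It is not the SHAP library, which relies on more efficient approximations. The scoring function, the feature values, and the choice to represent an "absent" feature by a baseline value are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values by subset enumeration (feasible only for few features).

    Features outside the current subset are replaced by their baseline value.
    """
    n = len(instance)

    def value(subset):
        x = [instance[i] if i in subset else baseline[i] for i in range(n)]
        return predict(x)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        contributions.append(total)
    return contributions


def fault_score(x):
    """Hypothetical diagnostic score: one additive term and one interaction."""
    return 2.0 * x[0] + x[1] * x[2]


instance = [1.0, 2.0, 3.0]   # hypothetical sensor readings
baseline = [0.0, 0.0, 0.0]   # hypothetical reference readings

phi = shapley_values(fault_score, instance, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that gives SHAP its name. The interaction term is split equally between the two features involved.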

Table 1. Network metrics of a KCN from four time periods

Figure 6. Number of articles, keywords, and links over the four time windows

Figure 7. Node degree, strength, and link weight value distribution

Table 2. Recent work related to SVM-assisted diagnosis
Table 2 summarizes recent SVM-assisted PHM research published since 2020. There are two main subfields in SVM-assisted diagnosis research. First, to handle high-dimensional signals, recent work has applied feature selection algorithms, such as Principal Component Analysis (PCA) and Random Forest, and deep learning methods, such as auto-encoders, to extract features from raw sensor signals. Second, SVM training is formulated as a constrained optimization problem, which is most commonly solved with the Sequential Minimal Optimization (SMO) algorithm.
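For reference, the constrained optimization problem behind SVM training is usually written in its soft-margin dual form, shown below. The notation is an assumption introduced here and does not come from the reviewed papers: \alpha_i are the Lagrange multipliers, y_i \in \{-1, +1\} the class labels, K the kernel function, C the regularization parameter, and n the number of training samples.

```latex
\begin{aligned}
\max_{\alpha}\quad & \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j) \\
\text{s.t.}\quad & 0 \le \alpha_i \le C, \qquad i = 1, \dots, n, \\
& \sum_{i=1}^{n} \alpha_i y_i = 0 .
\end{aligned}
```

SMO solves this problem by repeatedly selecting two multipliers and optimizing them analytically while holding the others fixed. Two is the smallest number that can be changed at once without violating the equality constraint.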

Table 3. Recent work related to interpretable DL-assisted diagnosis