AI+AR based Framework for Guided Visual Diagnosis of Equipment
Abstract
Automated solutions for effective support services, such as failure diagnosis and repair, are crucial to maintaining customer satisfaction and loyalty. However, providing consistent, high-quality, and timely support is a difficult task. In practice, customer support usually requires technicians to perform onsite diagnosis, but service quality is often adversely affected by a shortage of expert technicians, high turnover, and minimal automated tooling. To address these challenges, we present a novel solution framework for aiding technicians in performing visual equipment diagnosis. We envision a workflow in which the technician reports a failure and prompts the system to automatically generate a diagnostic plan that includes parts, areas of interest, and necessary tasks. The plan is used to guide the technician with augmented reality (AR), while a perception module analyzes and tracks the technician’s actions to recommend next steps. Our framework consists of three components: planning, tracking, and guiding. The planning component automates the creation of a diagnostic plan by querying a knowledge graph (KG). We propose leveraging Large Language Models (LLMs) to construct the KG, accelerating the extraction of parts, tasks, and relations from manuals. The tracking component enhances 3D detections by combining perception sensors with a 2D nested object detection model. Finally, the guiding component reduces process complexity for technicians by combining 2D models and AR interactions. To validate the framework, we performed multiple studies to: 1) determine an effective prompting method for the LLM to construct the KG; and 2) demonstrate the benefits of our 2D nested object model combined with an AR model.
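As an illustration of the planning component described above, the following is a minimal sketch of how a diagnostic plan (parts, areas of interest, and tasks) might be assembled by querying a knowledge graph. It is not the authors' implementation: the triples, the relation names (`involves_part`, `located_in`, `requires_task`), the failure label, and both helper functions are hypothetical and chosen only for illustration. The KG is modeled as plain (subject, relation, object) triples of the kind an LLM could extract from a manual.

```python
# Hypothetical (subject, relation, object) triples. In the framework these
# would be extracted from equipment manuals; here they are invented examples.
TRIPLES = [
    ("no_cooling", "involves_part", "compressor"),
    ("no_cooling", "involves_part", "condenser_fan"),
    ("compressor", "located_in", "rear_lower_panel"),
    ("condenser_fan", "located_in", "rear_lower_panel"),
    ("compressor", "requires_task", "check_relay_connection"),
    ("condenser_fan", "requires_task", "inspect_blades_for_obstruction"),
]


def build_graph(triples):
    """Index triples as graph[subject][relation] -> list of objects."""
    graph = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, {}).setdefault(relation, []).append(obj)
    return graph


def diagnostic_plan(graph, failure):
    """Return one plan step per part linked to the reported failure.

    Each step lists the part, the areas of interest where it is located,
    and the tasks to perform on it. Unknown failures yield an empty plan.
    """
    steps = []
    for part in graph.get(failure, {}).get("involves_part", []):
        edges = graph.get(part, {})
        steps.append({
            "part": part,
            "areas": list(edges.get("located_in", [])),
            "tasks": list(edges.get("requires_task", [])),
        })
    return steps


if __name__ == "__main__":
    kg = build_graph(TRIPLES)
    for step in diagnostic_plan(kg, "no_cooling"):
        print(step)
```

In this sketch the plan preserves the order in which parts appear in the graph; a real system would presumably rank steps and feed them to the guiding component, which the abstract does not detail.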
Keywords
Automated diagnostic systems, knowledge graphs, relation extraction, large language models, scene understanding, object detection, augmented reality
This work is licensed under a Creative Commons Attribution 3.0 Unported License.