AI+AR based Framework for Guided Visual Diagnosis of Equipment



Published Nov 5, 2024
Teresa Gonzalez Diaz, Xian Yeow Lee, Huimin Zhuge, Lasitha Vidyaratne, Gregory Sin, Tsubasa Watanabe, Ahmed Farahat, Chetan Gupta

Abstract

Automated solutions for effective support services, such as failure diagnosis and repair, are crucial to maintaining customer satisfaction and loyalty. However, providing consistent, high-quality, and timely support is a difficult task. In practice, customer support usually requires technicians to perform onsite diagnosis, but service quality is often adversely affected by a shortage of expert technicians, high turnover, and minimal automated tools. To address these challenges, we present a novel solution framework for aiding technicians in performing visual equipment diagnosis. We envision a workflow where the technician reports a failure and prompts the system to automatically generate a diagnostic plan that includes parts, areas of interest, and necessary tasks. The plan is used to guide the technician with augmented reality (AR), while a perception module analyzes and tracks the technician’s actions to recommend next steps. Our framework consists of three components: planning, tracking, and guiding. The planning component automates the creation of a diagnostic plan by querying a knowledge graph (KG). We propose to leverage Large Language Models (LLMs) to construct the KG, accelerating the extraction of parts, tasks, and relations from manuals. The tracking component enhances 3D detections by combining perception sensors with a 2D nested object detection model. Finally, the guiding component reduces process complexity for technicians by combining 2D models and AR interactions. To validate the framework, we performed multiple studies to: 1) determine an effective prompting method for the LLM to construct the KG; and 2) demonstrate the benefits of our 2D nested object detection model combined with AR guidance.
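As an illustration of the planning component described in the abstract, the sketch below shows one way an LLM could be prompted to extract (subject, relation, object) triples from maintenance-manual text and load them into a knowledge graph. This is a minimal sketch, not the authors' implementation: the prompt wording, the relation vocabulary, the `call_llm` stub, and the use of `networkx` as the graph store are all assumptions made for the example.

```python
# Minimal sketch: prompting an LLM to extract KG triples from manual text.
# Assumptions: `call_llm` is a placeholder for any chat-completion client;
# the prompt format and relation vocabulary are illustrative, not from the paper.
import json
import networkx as nx

PROMPT = (
    "Extract (subject, relation, object) triples describing equipment parts "
    "and diagnostic tasks from the text below. Allowed relations: has_part, "
    "located_in, diagnosed_by. Return a JSON list of 3-element lists.\n\n"
    "Text:\n{passage}"
)

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call; replace with a real chat-completion client."""
    raise NotImplementedError

def extract_triples(passage: str) -> list[tuple[str, str, str]]:
    """Ask the LLM for triples and parse its JSON reply, skipping malformed rows."""
    reply = call_llm(PROMPT.format(passage=passage))
    triples = []
    for row in json.loads(reply):
        if isinstance(row, list) and len(row) == 3:
            triples.append(tuple(str(x).strip() for x in row))
    return triples

def build_kg(passages: list[str]) -> nx.DiGraph:
    """Accumulate triples from all manual passages into a directed knowledge graph."""
    kg = nx.DiGraph()
    for passage in passages:
        for subj, rel, obj in extract_triples(passage):
            kg.add_edge(subj, obj, relation=rel)
    return kg
```

In a pipeline like the one the abstract describes, the resulting graph would then be queried by part or failure mode to assemble the diagnostic plan that drives the AR guidance.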

How to Cite

Gonzalez Diaz, T., Lee, X. Y., Zhuge, H., Vidyaratne, L., Sin, G., Watanabe, T., Farahat, A., & Gupta, C. (2024). AI+AR based Framework for Guided Visual Diagnosis of Equipment. Annual Conference of the PHM Society, 16(1). https://doi.org/10.36001/phmconf.2024.v16i1.3909


Keywords

Automated diagnostic systems, knowledge graphs, relation extraction, large language models, scene understanding, object detection, augmented reality

Section
Technical Research Papers