A Semantic Similarity Model to Compare Heterogeneous Data Sources to Augment Engineering Data with New Failure modes in Automotive Industry

Dnyanesh Rajpathak; John Cafeo

doi:10.36001/phme.2021.v6i1.2887


Published Jun 29, 2021


Dnyanesh Rajpathak

General Motors

John Cafeo

Abstract

In industrial practice, timely exposure of the symptoms and failure modes observed during fault events gives system engineers the key signals they need to take corrective action and arrest product defects. An ontology-based semantic similarity system is employed to automatically compare engineering data, which comes in the form of design failure mode and effects analysis (DFMEA) data, with the field repair data collected during the product warranty period.

The complexity of engineering data and the overwhelming volume of field repair data (hundreds of millions of data points) make identifying new symptoms and failure modes from first principles impractical. Typically, engineering data is recorded using technical vocabulary, e.g. ‘unstable electric contact’ or ‘Seat Belt comfort per MVSS208’, whereas field repair data is highly unstructured. Consequently, we observe the following types of noise in the field repair data: abbreviated text entries, inconsistent use of vocabulary (‘seat buckle is damaged’ vs ‘buckle unlatching’), and incomplete text entries. More importantly, the limited mental mapping capacity of a human agent restricts the discovery of new symptoms and failure modes from industrial-scale data. Not surprisingly, text mining and semantic similarity are gaining serious attention because of their ability to automatically discover the knowledge assets buried in unstructured text by training machines to compare and link high volumes of data.

In our approach, the key constructs (e.g. symptoms, failure modes) in the data are first annotated using the domain ontology. These constructs are then used to build pairs of terms and pairs of tuples, from which the pair-to-pair and tuple-to-tuple semantic similarity scores are computed, respectively. Finally, the text-to-text semantic similarity is calculated by combining these two scores. It is used to determine whether new symptoms or failure modes from the field repair data can be used to augment the DFMEA data.
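The three-stage flow described above (annotate, score pairs and tuples, combine) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `ONTOLOGY` dictionary, the exact-match `term_similarity` (a stand-in for the ontology-based pair-to-pair measure), the (part, symptom) tuple layout, the best-match averaging, and the `alpha` weight used to combine the two scores.

```python
from itertools import product

# Hypothetical toy ontology: surface phrase -> (class, canonical concept).
ONTOLOGY = {
    "seat buckle": ("part", "seat_belt_buckle"),
    "buckle": ("part", "seat_belt_buckle"),
    "electric contact": ("part", "electrical_contact"),
    "damaged": ("symptom", "damaged"),
    "unlatching": ("symptom", "unlatching"),
    "unstable": ("symptom", "unstable"),
}


def annotate(text):
    """Return the set of (class, concept) tags whose phrase occurs in text."""
    lowered = text.lower()
    return {tag for phrase, tag in ONTOLOGY.items() if phrase in lowered}


def term_similarity(a, b):
    """Stand-in pair-level score: 1.0 for identical tags, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def tuples_of(tags):
    """Build (part, symptom) tuples from the annotated tags (assumed layout)."""
    parts = sorted(t for t in tags if t[0] == "part")
    symptoms = sorted(t for t in tags if t[0] == "symptom")
    return list(product(parts, symptoms))


def tuple_similarity(t, u):
    """Mean of the component-wise term similarities of two tuples."""
    return sum(term_similarity(x, y) for x, y in zip(t, u)) / len(t)


def best_match_mean(items_a, items_b, sim):
    """Average, over items_a, of each item's best score against items_b."""
    if not items_a or not items_b:
        return 0.0
    return sum(max(sim(x, y) for y in items_b) for x in items_a) / len(items_a)


def text_similarity(text_a, text_b, alpha=0.5):
    """Combine term-level and tuple-level scores into one text-level score."""
    tags_a, tags_b = annotate(text_a), annotate(text_b)
    term_score = best_match_mean(tags_a, tags_b, term_similarity)
    tuple_score = best_match_mean(
        tuples_of(tags_a), tuples_of(tags_b), tuple_similarity
    )
    return alpha * term_score + (1 - alpha) * tuple_score


if __name__ == "__main__":
    print(text_similarity("seat buckle is damaged", "buckle unlatching"))
    print(text_similarity("seat buckle is damaged", "unstable electric contact"))
```

In this sketch a field repair entry that scores low against every DFMEA entry would be a candidate new symptom or failure mode, while a high score would suggest a synonym. Any threshold separating the two cases would need to be tuned on labelled data, and the best-match averaging used here is not symmetric in its two arguments.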

The proposed method is implemented as a prototype tool, and its performance is validated using real-life data from the automobile domain. On average, our system achieves F1 scores of 0.75 and 0.78 in discovering new symptoms and identifying synonym symptoms, respectively, and F1 scores of 0.72 and 0.68 in discovering new failure modes and identifying synonym failure modes, respectively. The fault detection rate is improved by 35%, and the fault isolation rate by 40.5%.
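For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The short sketch below computes it from true-positive, false-positive and false-negative counts. The function name `f1_score` and the counts in the usage line are purely illustrative and are not the paper's data.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative counts only: 3 correct discoveries, 1 spurious, 1 missed.
    print(f1_score(tp=3, fp=1, fn=1))
```

With these made-up counts, precision and recall are both 0.75, so the F1 score is also 0.75.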

How to Cite

Rajpathak, D., & Cafeo, J. (2021). A Semantic Similarity Model to Compare Heterogeneous Data Sources to Augment Engineering Data with New Failure modes in Automotive Industry. PHM Society European Conference, 6(1), 10. https://doi.org/10.36001/phme.2021.v6i1.2887


Keywords

Model-based diagnostics, PHM for Automotive, Rail, Marine, Wind and Energy

Issue

Vol. 6 No. 1 (2021): Proceedings of the European Conference of the PHM Society 2021

Section

Technical Papers

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

The Prognostics and Health Management Society advocates open access to scientific data and uses a Creative Commons license for publishing and distributing any papers. A Creative Commons license does not relinquish the author’s copyright; rather, it allows authors to share some of their rights with any member of the public under certain conditions whilst enjoying full legal protection. By submitting an article to the International Conference of the Prognostics and Health Management Society, the authors agree to be bound by the associated terms and conditions, including the following:

As the author, you retain the copyright to your Work. By submitting your Work, you are granting anybody the right to copy, distribute and transmit your Work and to adapt your Work with proper attribution under the terms of the Creative Commons Attribution 3.0 United States license. You assign rights to the Prognostics and Health Management Society to publish and disseminate your Work through electronic and print media if it is accepted for publication. A license note citing the Creative Commons Attribution 3.0 United States License as shown below needs to be placed in the footnote on the first page of the article.

First Author et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
