A Model Based Approach to Extract Health Information from Textual Data

Diego Mandelli; Congjian Wang

doi:10.36001/phmconf.2022.v14i1.3249

Published Oct 28, 2022

DOI https://doi.org/10.36001/phmconf.2022.v14i1.3249

Diego Mandelli

INL

Congjian Wang

INL

Abstract

In current nuclear power plants (NPPs), a large amount of condition-based data is generated and stored to assess and monitor component health and performance. This data can be either numeric (e.g., pump vibration data) or textual (e.g., condition reports that assess component health). While component health can be assessed from numeric data with a wide variety of methods, extracting information from textual data remains a challenge. Natural language processing (NLP) methods are starting to be deployed in current NPPs, mainly to filter out incident reports (IRs) that are not safety related by employing supervised machine learning methods. However, these methods do not extract the quantitative information that IRs might contain. This paper presents an approach to extract information from textual data (e.g., IRs, maintenance reports) that is based on NLP data analytics methods coupled with model-based systems engineering (MBSE) models. NLP methods are employed to perform syntactic and semantic analyses. Syntactic analysis examines the grammatical structure of a sentence; it includes part-of-speech (POS) tagging (i.e., identification of the grammatical role of each string, e.g., nouns, verbs), named entity recognition (i.e., identification of text entities, e.g., names, dates, events), and relation extraction (e.g., coreference resolution). Semantic analysis, on the other hand, is designed to analyze the logical structure of a sentence. Through a specific set of rules, our methods can identify whether a sentence contains health information about a component (e.g., degraded performance, anomalous behavior) or a causal relationship between two events (i.e., a cause-effect pair). An innovative element of our approach is that semantic analysis relies on MBSE models to identify links between textual elements. MBSE models are diagrams designed to represent system and component dependencies (from both a form and a functional point of view). In our approach, MBSE models emulate system engineers' knowledge of component and system architecture. This paper presents in detail how NLP methods and MBSE models are integrated. A few analysis examples focusing on centrifugal pumps are presented.
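
The idea described in the abstract, rule-based extraction of health information and cause-effect pairs checked against a system model, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the component graph `MBSE_LINKS`, the keyword lists, and all function names are hypothetical, and plain keyword matching stands in for the POS tagging and named entity recognition a real pipeline would use.

```python
import re

# Hypothetical stand-in for an MBSE model of a centrifugal pump:
# each component maps to the components it directly supports.
MBSE_LINKS = {
    "motor": ["shaft"],
    "bearing": ["shaft"],
    "shaft": ["impeller"],
    "impeller": ["pump"],
    "seal": ["pump"],
    "pump": [],
}

# Assumed keyword rules; a real system would need a far richer vocabulary
# and would have to handle negation ("no leak"), which this sketch does not.
HEALTH_TERMS = {
    "failed": ("failed", "tripped", "seized", "broken"),
    "degraded": ("degraded", "worn", "leak", "leaking", "vibration", "cracked"),
}

# Cue phrases of the form "<effect> <cue> <cause>".
CAUSAL_CUES = ("caused by", "due to", "because of")


def find_components(text):
    """Return known component names in order of first appearance."""
    found = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in MBSE_LINKS and word not in found:
            found.append(word)
    return found


def health_status(text):
    """Return 'failed', 'degraded', or None based on keyword rules."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    for status in ("failed", "degraded"):  # the more severe status wins
        if words & set(HEALTH_TERMS[status]):
            return status
    return None


def is_linked(source, target):
    """True if the model contains a dependency path from source to target."""
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(MBSE_LINKS.get(node, []))
    return False


def extract_cause_effect(sentence):
    """Return (cause, effect) component pairs that the model can explain."""
    lowered = sentence.lower()
    for cue in CAUSAL_CUES:
        if cue in lowered:
            effect_text, cause_text = lowered.split(cue, 1)
            return [
                (cause, effect)
                for cause in find_components(cause_text)
                for effect in find_components(effect_text)
                if cause != effect and is_linked(cause, effect)
            ]
    return []


report = "Pump tripped due to a seized bearing."
print(find_components(report), health_status(report), extract_cause_effect(report))
```

In this sketch the model acts as a plausibility filter: a candidate cause-effect pair is kept only when the graph contains a dependency path from the cause to the effect, so a sentence such as "Motor failed due to pump leak." yields no pair because the assumed graph has no path from the pump to the motor.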

How to Cite

Mandelli, D., & Wang, C. (2022). A Model Based Approach to Extract Health Information from Textual Data. Annual Conference of the PHM Society, 14(1). https://doi.org/10.36001/phmconf.2022.v14i1.3249

Keywords

NLP, health assessment

Issue

Vol. 14 No. 1 (2022): Proceedings of the Annual Conference of the PHM Society 2022

Section

Technical Research Papers

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

The Prognostics and Health Management Society advocates open access to scientific data and uses a Creative Commons license for publishing and distributing any papers. A Creative Commons license does not relinquish the author's copyright; rather, it allows authors to share some of their rights with any member of the public under certain conditions while enjoying full legal protection. By submitting an article to the International Conference of the Prognostics and Health Management Society, the authors agree to be bound by the associated terms and conditions, including the following:

As the author, you retain the copyright to your Work. By submitting your Work, you are granting anybody the right to copy, distribute and transmit your Work and to adapt your Work with proper attribution under the terms of the Creative Commons Attribution 3.0 United States license. You assign rights to the Prognostics and Health Management Society to publish and disseminate your Work through electronic and print media if it is accepted for publication. A license note citing the Creative Commons Attribution 3.0 United States License as shown below needs to be placed in the footnote on the first page of the article.

First Author et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
