A data-driven method for predicting structural degradation using a piezoceramic array

Carbon fiber reinforced polymers (CFRPs) are increasingly used in modern airframes, yet the in-service behavioral characteristics of these structures are still poorly understood. Structural Health Monitoring (SHM) technologies that use surface-bonded piezoceramic (PZT) transducers to generate and measure guided waves within these structures have demonstrated promising damage detection and localization results, and show potential for gathering the data needed for data-driven damage prognosis. This paper investigates the development of a data-driven, SHM-based damage prognosis system for estimating the remaining useful life (RUL) of CFRP coupons following damage initiation. A robust and realistic laboratory data gathering methodology is introduced as a building block for evaluating the feasibility of data-driven damage prognosis for in-service aerospace structures. Data are gathered using a PZT-based SHM system. From the raw guided wave signals, a number of time and frequency domain features, derived from existing damage imaging and detection algorithms, are first extracted. Then, using various combinations of the feature sets as inputs to generic data mining algorithms, the paper presents estimates of the predicted RUL against actual damage diameter progression.


INTRODUCTION
Modern operation of aircraft generates vast amounts of 'operational data' from on-board aircraft systems and 'maintenance data' from offline maintenance procedures. Prognostic model development maximizes the potential uses of these two data sources with the goal of discovering 'a priori' knowledge of component failures and abnormal system behavior (Zaluski, Létourneau, Bird, & Yang, 2010). Such prognostic tools would be invaluable to end-users of these aircraft in allowing the development of opportunistic and preventive maintenance procedures designed to cut costs and increase safety. Prognostic modelling would close the loop in the development of adequate health management systems for aircraft which are built with the combination of diagnostic, prognostic, and repair management technologies (Létourneau et al., 2005).
Kyle R Mulligan et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
In any data-driven prognostic model development approach, data are gathered using a sensory and maintenance network. In the aerospace domain, data analysis of structural damage is tracked off-line by experienced engineers through nondestructive inspection (Roemer, Ge, & Liberson, 2005). Data are therefore collected when a problem is detected and no information is gathered leading up to the event. Without a means to gather data prior to a problem, data-driven prognostic model development is impossible. The use of Structural Health Monitoring (SHM) strategies based on guided wave propagation for injecting and receiving guided waves, using lead-zirconate-titanate (PZT) piezoceramic transducers bonded to aerospace structures for damage detection, is however increasing. This allows data gathered regularly during in-service operation of these aircraft to be used in the development of data-driven prognostic models and eventual decision support systems. Currently, however, SHM systems are not being readily used on in-service aircraft fleets (commercial or military) because the technology has not yet transitioned from research to practice (Chattopadhyay et al., 2012). Efforts to develop health management systems that include damage prognosis are limited because gathered data from SHM systems for real applications are not yet available.
The growing importance in industry of decision support systems and condition based monitoring (CBM) through on-board SHM comes from a growing use of composite Carbon Fiber Reinforced Polymers (CFRPs) in both military and civilian aircraft fabrication and production. As these aircraft are commencing in-service use, there is limited historical data available for understanding their in-service behavioral characteristics (Soutis, 2005). Damage in aerospace composites under in-service loading is mainly due to fatigue from stresses that vary depending on mission loading profiles. Operational impacts also occur which are less predictable. For CFRP structures, fatigue and impact damages lead to fiber breakage, matrix cracking, delamination, and interface debonding (O'Brien, 2001; Rhymer, Kim, & Roach, 2012). Once such damage has initiated, further use of the aircraft could have detrimental effects on the overall strength of the structure. Aircraft manufacturers, owners, and users therefore want on-board SHM systems for fault detection and eventual prediction of the remaining useful life (RUL) of these components, to enhance opportunistic and preventive maintenance procedures with anticipated impacts on availability, safety, and cost (Zaluski et al., 2010).
Figure 1. Overview of the data-driven methodology.
With no commonly available operational data to build datasets for data-driven model development, much research has focused on using Material Testing Systems (MTS) in laboratory environments for data gathering and on analytical and Finite Element (FE) models of composite structural behavior (Liu, Mohanty, & Chattopadhyay, 2009). In laboratory environments, RUL estimations have been presented that use features created from data gathered using MTS fatigue systems in combination with acoustic emission or PZT-based SHM systems (Saxena, Goebel, Larrosa, Janapati, & Roy, 2011). Promising results in predicting remaining fatigue loading cycles have been demonstrated on composite coupons but little thought has been put into the types of loading patterns that should be used to mimic real applications (Liu et al., 2009; Larrosa & Chang, 2011). One must consider what loading patterns actual aircraft endure in order to develop realistic prognostic models. Furthermore, parameters measured from the MTS system cannot be used as features in the prognostic model development process because these features are not available in real applications and vary by composite structure type.
Many analytical models are available (Hashin, 1980;Chang & Chang, 1987;Roebuck, Gorley, & McCartney, 1989;Choi, 1990;Hou, Petrinic, Ruiz, & Hallett, 2000;Nairn, 2000;O'Brien, 2001;Beaumont, Diamant, & Shercliff, 2006) for understanding damage propagation and the transition from matrix micro-cracks to delamination. In practice, validation of structural components using such models is time consuming and complex and many assumptions for simplification are made when estimating fatigue (Liu et al., 2009). FE models exist to evaluate impact damage size and depth in aluminum sandwich structures (Nguyen, Jacombs, Thomson, Hachenberg, & Scott, 2005) but none, to the authors' knowledge, have been validated and used for RUL prediction. SHM allows a straight-forward data-driven implementation for damage prognosis which in turn provides a practical solution to difficulties faced in physics-based modelling (M. .
To address these issues, this work presents a novel data-driven approach for estimating RUL for CFRP aerospace grade coupons exposed to low energy drop-weight impacts. The approach relies on a PZT-based SHM system instrumented onto CFRP coupons in a pitch and catch configuration. Data are acquired from guided wave signals generated and received in the coupons and later pre-processed to define a feature set. Observations in the feature set are divided into training and testing datasets, and several generic supervised learning algorithms are applied to the data to estimate RUL. The laboratory data are gathered using a methodology designed as a foundation to build upon for simulating real world operating conditions. The purpose of this first step is to evaluate and demonstrate the feasibility of assessing structural damage using damage prognosis.

METHODOLOGY
There is an emerging use of data-driven damage prognosis in the aerospace domain with the onset of an increasing use of composite structural components. In order to demonstrate the feasibility of damage prognosis for RUL prediction in aerospace grade composite structures following impact events, a thorough, robust, and realistic laboratory data gathering procedure must be considered. This methodology consists of three major steps: data generation and processing, feature generation, and modelling and evaluation following operational data gathering. The defined data-driven methodology can be extended to experiments for developing prognostic models to predict damage degradation or RUL in aerospace structures. This section, together with Fig. 1, describes the details of each stage in the methodology along with observed challenges.

Data generation and processing
In the present project, destructive testing using a drop-weight impact system is used to investigate damage prognosis for estimating RUL. By defining a critical damage size, the RUL is predicted in remaining number of impacts (RNI) from the critical impact point which is estimated for the critical damage size (≃ 7.1 mm). Although somewhat arbitrary in this work, the critical damage size is estimated from nondestructive inspection measurements following a series of impacts over the entire sample size of CFRP coupons used in this work. Important considerations for the methodology must be made because following destructive impacts, data cannot be re-collected. Due to the high cost of destructive testing in aerospace structures, CFRP aerospace grade coupons instead of real aerospace structures are used for operational data gathering. The maintenance database however cannot be created until established maintenance procedures are defined and developed for such CFRPs. Structural fabrication and instrumentation and damage initiation procedures are first described in this section followed by a description for data gathering and processing.

Structural fabrication and instrumentation
A CFRP aerospace grade sheet is fabricated using the facilities at the National Research Council of Canada (NRC) Institute for Aerospace Research (IAR) in Ottawa, Ontario. The sheet is fabricated using pre-preg fibre layers autoclave cured inside a vacuum bag while keeping the tolerance on the ply orientation for the fabrication process below 0.5°. Following autoclave curing, the CFRP sheet is cut into 9 individual coupons of size 10.16 cm by 15.24 cm using a diamond saw. The composite layup and mechanical properties of the coupons are presented in Table 1.
Following fabrication, instrumentation of the CFRP coupons is performed. Two PZT transducers (Physik Instrumente®), 5 mm in diameter and 0.5 mm thick, are bonded 10 cm apart in a pitch and catch configuration to the surface of the CFRP coupons using epoxy. The properties of the PZT transducers and the epoxy adhesive are outlined in Table 2. Damping tape at the plate edges is used to prevent edge reflections of guided wave signals.
Table 2. Properties of the CFRP layers, PZT transducer, and bonding layer.

Damage initiation
Following instrumentation, coupons are placed individually into a drop-weight impact system for damage initiation. Two steel shims are placed on the upper and lower sides of the coupons and clamped using 5 quick clamps onto the impact system frame. Data acquisition, described in the next section, is repeated for each coupon to define the operational database. The experimental setup of the plate is presented in Fig. 2.
ASTM standard 'D7136/D7136M-05' (ASTM, 2007) provides guidelines for impacting CFRPs using a drop-weight impact system such that damage including dents/depressions, splits/cracks, combined splits/delaminations, and combined large cracks with fibre breakage can be reproduced. For the thickness of the coupons used in this work, the standard dictates that an impact energy of 16.7 J will induce all of the described failure modes. In other words, following an impact of this magnitude, the coupon would be deemed damaged and would require repair or replacement. For successful development of data-driven prognostic models, the objective is to measure and produce data features that show progressive changes far away from a failure point. This would allow more time for fleet managers to schedule and perform opportunistic maintenance. Therefore, impacting the composite coupons using the standardized energy would not produce any useful data as the coupons would be too severely damaged (Guida, Marulo, Meo, & Russo, 2012) and there would be no justification for damage prognosis.
In reality, sudden impacts do occur which can only be detected and assessed, but a more realistic application for damage prognosis in the aerospace domain would be the prediction of RUL after the onset of lower energy impacts from sources such as accidental tool drops or runway debris impacts from take-off and landing (2-3 J, 3-5 m/s) (Tomblin et al., August 1999). These impacts cause barely visible impact damage (BVID), which commonly occurs from constant maintenance and operation of the aircraft. Accumulations of BVID in similar proximities on the airframe surface can eventually lead to severe damage (Rhymer et al., 2012). Therefore, in this work, damage is applied in the form of 5 J impacts to each composite coupon to replicate BVID. Actual damage size and depth are measured following each impact using a non-destructive phased array ultrasonic testing tool (Olympus OMNIScan MX2), taken as the gold standard (Olympus, 2013). The types of damage investigated herein are excellent candidates for RUL estimation as they are pertinent and common.

Data gathering
Guided wave PZT based SHM systems are being explored in current research and show promise in damage detection, localization, and material characterization for aerospace structures and are ideal candidates for in-situ data generation and gathering (Giurgiutiu & Bao, 2004;Quaegebeur, Masson, Langlois-Demers, & Micheau, 2010;Saxena et al., 2011;Ostiguy, Quaegebeur, Mulligan, Masson, & Elkoun, 2012). Before implementing such a system for structural assessment, key considerations must be made for transducer signal generation and acquisition parameters. In this case, the acquired data must be useful for features to be developed for later use in damage prognosis. For signal generation using guided wave propagation, a burst function must be selected with an appropriate amplitude, frequency, number of cycles, and duration (Staszewski, Mahzan, & Traynor, 2009). Features may be developed by investigating time, frequency, dispersive, and correlation aspects of the acquired signals following generation. Some of these features may be sensitive to different frequency ranges which vary by material. Therefore, in this study, considerations made for parameter selection are based on damage analysis and material characterization techniques (Giurgiutiu & Bao, 2004;Quaegebeur et al., 2010;Ostiguy et al., 2012).
For guided wave pitch and catch measurements, a broadband frequency range is covered with fine steps between each frequency such that a diverse amount of data are gathered (Quaegebeur, Masson, Micheau, & Mrad, 2012) in order to develop new features not found in literature. The sub-band generation technique is used to determine the transfer function between the transmitting and receiving transducers. To do this, an impulse excitation signal which is decomposed over 11 sub-bands over a frequency range below 1 MHz is transmitted using a generating PZT and received by the acquisition PZT. The signals are amplified using a UA-8400 amplification system (Produitson). Baseline guided wave measurements are taken before the impact sequence such that pristine and damage signals can be later compared. Post-processing of the measured impulse response is performed using windowing and reconstruction filters to determine the transfer function. The input signal voltage burst is generated using an HP 33120A generator with a sampling frequency of 15 MHz. The output of the measurement circuit is acquired using a high impedance National Instruments PCI-5105 12-bit DAQ board configured through a custom LabVIEW interface. The generated signals are recorded at a fixed sampling frequency of 6 MHz, averaged 1000 times in order to increase SNR, and low-pass filtered at 1.5 MHz.
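The effect of averaging on SNR can be illustrated with a short Python sketch. It assumes uncorrelated, zero-mean noise between acquisitions (an assumption, not stated in this work), in which case averaging N acquisitions improves the SNR by about 10 log10(N) dB, or roughly 30 dB for N = 1000. The function names are hypothetical and the signal is synthetic.

```python
import math
import random


def average_acquisitions(acquisitions):
    """Average repeated acquisitions sample by sample."""
    n = len(acquisitions)
    length = len(acquisitions[0])
    return [sum(acq[i] for acq in acquisitions) / n for i in range(length)]


def expected_snr_gain_db(n_averages):
    """Theoretical SNR gain, assuming uncorrelated zero-mean noise."""
    return 10.0 * math.log10(n_averages)


if __name__ == "__main__":
    random.seed(0)
    fs = 6e6  # recording sampling frequency from the text (6 MHz)
    # Synthetic 300 kHz tone standing in for a measured burst (illustrative only).
    clean = [math.sin(2 * math.pi * 300e3 * k / fs) for k in range(200)]
    noisy = [[s + random.gauss(0.0, 1.0) for s in clean] for _ in range(1000)]
    averaged = average_acquisitions(noisy)
    residual = max(abs(a - c) for a, c in zip(averaged, clean))
    print("expected gain (dB):", round(expected_snr_gain_db(1000), 1))
    print("largest residual after averaging:", residual)
```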

Data pre-processing
Data-driven damage prognosis models require as input a dataset which is composed of instances of vectors of attribute values. The attributes and their respective values are extracted from the operational and maintenance databases obtained for specific applications. In the case of composite structures in aircraft, historical databases are not yet available. To build a historical database in this case, considerations must be made for selecting an appropriate SHM system that can provide in situ and regular monitoring of the aircraft structure during future applications for in-service operation.
In order to use supervised learning algorithms, data must be pre-processed. This involves the addition of problem identification and index attributes to the gathered operational data. Problems separated by problem identification numbers are created by applying the data gathering procedure to independent aerospace structures. With an attribute for identifying each problem, the data are split into training and testing datasets. For each of the training problems, an index attribute is added to each instance for supervised learning. The index attribute associates a number to each instance within each problem in sequential order where the failure point is generally labeled '-1' and the furthest point from the problem is '-N', where N is the total number of instances within a problem. In model training, the index attribute is used to train the predictive model for each individual observation. In model testing, testing data are input into the trained models and the index attribute for each observation is estimated. The predicted index for each instance in the testing dataset is compared to the actual index by replacing the index attribute for each instance. Depending on the quality and structure of the gathered data, data labelling can be performed at this stage or later, following feature generation.
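The index labelling and the problem-based split described above can be sketched in Python as follows. This is a minimal illustration, assuming each instance is a dictionary and that instances within a problem are ordered from first measurement to failure; the field names `problem_id` and `index` and both function names are hypothetical.

```python
def add_index_attribute(instances):
    """Label the instances of one problem, ordered oldest to newest.

    The last instance (failure point) receives -1 and the first receives -N,
    where N is the number of instances in the problem.
    """
    n = len(instances)
    return [dict(inst, index=-(n - i)) for i, inst in enumerate(instances)]


def split_by_problem(dataset, test_ids):
    """Split a labelled dataset into training and testing sets by problem id."""
    train = [row for row in dataset if row["problem_id"] not in test_ids]
    test = [row for row in dataset if row["problem_id"] in test_ids]
    return train, test


if __name__ == "__main__":
    plate = [{"problem_id": 1, "rms": v} for v in (0.10, 0.12, 0.15)]
    for row in add_index_attribute(plate):
        print(row)
```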

Feature generation
Features are generated from raw data measurements using data transformations to improve the initial, as measured, representation (Zaluski et al., 2010). This is done by augmenting the initial representation with new features created using methods from process physics, signal processing, time-domain and frequency analysis, wave dispersion and correlation, and constructive induction. Different and more complex features can then be extrapolated from these findings. Once a feature set is defined, attribute evaluation tools (Hall, 2000; Kira & Rendell, 1992) and domain knowledge can be used to remove redundant or irrelevant features to optimize model development and reduce computation time.
Figure 4. Method for extracting one sample instance of the dispersion feature from a measured pitch and catch signal.
Guided wave measurements are used to build a feature set with instances for each of the 9 coupons. A total of 151 parameters are extracted based on time and frequency domain signals. An example of guided wave signals obtained using sub-band generation is provided in Fig. 3, with (c and d) and without (a and b) coupon damage, showing the time and frequency domain representations. The difference between the undamaged and damaged signals is shown in Fig. 3 (e) for the time domain and in Fig. 3 (f) for the frequency domain. Slight differences between the signals can be observed in both representations. Therefore, both domains are considered in the feature extraction process. Once features are extracted, they are assembled into a database with the addition of problem identification and index attributes. Based on the problem identification attribute, the database is separated into training and testing datasets. Each type of feature generated for the two domains is described in the following and summarized in Tab. 3.

Time domain features
In this section, features extracted from time domain representation of signals are described. This includes a feature that uses the root mean square (RMS) of the time domain signal, a feature that exploits wave dispersion commonly found in SHM, and finally a wave correlation feature.

Root Mean Square feature
The RMS of reconstructed broadband signals initially generated and measured in a pitch and catch configuration, where a hole is increasingly grown in the path between the two transducers, has been shown to increase linearly for experiments on aluminum plates. Although numerous impact damages will not perforate the CFRP plates as a hole, a dent with increasing diameter will be produced. As the penetration depth increases, generation and reflections of guided wave modes should become more prominent and effects on the measured signal RMS should also be observed. Because the signal RMS has been reported as significant, the mean, median, standard deviation, variance, minimum, and maximum are also calculated and plotted against the respective impact number. A total of 7 parameters are therefore extracted in this step.
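The seven time domain parameters can be computed with the Python standard library, as in the sketch below. Whether the standard deviation and variance are population or sample statistics is not stated in this work, so the population form is used here as an assumption; the function name is hypothetical.

```python
import math
import statistics


def summary_statistics(signal):
    """Return the seven summary parameters for one time domain signal."""
    return {
        "rms": math.sqrt(sum(x * x for x in signal) / len(signal)),
        "mean": statistics.fmean(signal),
        "median": statistics.median(signal),
        "std": statistics.pstdev(signal),        # population form (assumption)
        "variance": statistics.pvariance(signal),  # population form (assumption)
        "min": min(signal),
        "max": max(signal),
    }


if __name__ == "__main__":
    # Toy signal for illustration only, not measured data.
    print(summary_statistics([1.0, -1.0, 1.0, -1.0]))
```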

Wave dispersion feature
Wave dispersion commonly occurs in SHM when generating and measuring guided wave bursts. When the frequencies that compose a generated wave burst do not propagate with the same velocity, the measured wave burst becomes stretched in time. This phenomenon depends on material characteristics. Most algorithms used in damage imaging are implemented in conditions such that dispersion is avoided. These algorithms depend on the Time of Flight (ToF) of a series of measured wave bursts, which is not easily measured accurately in the presence of dispersion. In this work, an attempt to exploit dispersion as a feature is presented. The approach for calculating the dispersion feature presented in Fig. 4 is similar to that of obtaining the Minimum Resolvable Distance (MRD) parameter presented in (Wilcox, Lowe, & Cawley, 2001).
Using the impulse response obtained from post-processing measured signals using the sub-band generation technique, output signals for frequencies between 25 kHz and 800 kHz in steps of 25 kHz (32 frequencies) are generated using synthetically constructed wave bursts. The maximum amplitude over the entire signal is then selected using a peak detection algorithm. Two peaks, one before and one after the maximum peak, are determined with amplitudes that are less than 20 dB of the maximum peak. The purpose of this is to isolate the first reception of the propagated wave bursts. The time difference between the outer peaks is then calculated. The time difference is plotted for each frequency against the respective impact number for a total of 32 features.
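One possible reading of this procedure is sketched below in Python. It interprets the 20 dB criterion as the nearest peaks on each side of the maximum whose amplitude lies more than 20 dB below it, and uses the rectified signal in place of an envelope; both choices are assumptions, and the function names are hypothetical rather than the peak detection algorithm used in this work.

```python
def find_peaks(values):
    """Indices of local maxima of a sequence."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] >= values[i + 1]]


def dispersion_duration(signal, fs, drop_db=20.0):
    """Time between the outer peaks that bracket the main wave packet.

    Outer peaks are the nearest peaks on each side of the maximum peak whose
    amplitude is more than `drop_db` below it (an assumed interpretation).
    Returns None if no such pair exists.
    """
    rectified = [abs(x) for x in signal]
    peaks = find_peaks(rectified)
    if not peaks:
        return None
    main = max(peaks, key=lambda i: rectified[i])
    threshold = rectified[main] * 10 ** (-drop_db / 20.0)
    before = [i for i in peaks if i < main and rectified[i] < threshold]
    after = [i for i in peaks if i > main and rectified[i] < threshold]
    if not before or not after:
        return None
    return (after[0] - before[-1]) / fs


# Centre frequencies from the text: 25 kHz to 800 kHz in 25 kHz steps.
BURST_FREQUENCIES_HZ = [25e3 * k for k in range(1, 33)]

if __name__ == "__main__":
    toy = [0, 0.05, 0, 0.5, 0, 1.0, 0, 0.5, 0, 0.05, 0]  # illustrative only
    print(dispersion_duration(toy, fs=8.0), len(BURST_FREQUENCIES_HZ))
```

Applying `dispersion_duration` to the synthesized output at each of the 32 frequencies would give one feature value per frequency and per impact.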

Wave correlation feature
Extraction of the wave correlation feature set from time domain signals is presented in Fig. 5. Time domain signals are first bandpass filtered according to 5 frequency ranges, namely: low frequency (10 kHz - 200 kHz) and below PZT resonance frequency (200 kHz - 400 kHz) constituting the low frequency range (Fig. 5), PZT resonance frequency (400 kHz - 600 kHz), and above PZT resonance frequency (600 kHz - 800 kHz) and high frequency (800 kHz - 1010 kHz) constituting the high frequency range (Fig. 5). The maximum cross-correlation of each frequency range is calculated with respect to the baseline acquired prior to destructive testing. A buffer is used to store the maximum correlation result obtained by cross-correlating all impacts with the baseline for each frequency range. The RMS, mean, median, standard deviation, variance, maximum, and minimum of all maximum correlation values for each frequency range are calculated for a total of 35 features.
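The maximum cross-correlation with the baseline can be sketched in plain Python as follows. The sketch assumes the signals have already been bandpass filtered, and normalizes the correlation by the signal energies so that identical signals give 1.0; this normalization is an assumption, since the scaling used in this work is not stated. The function name is hypothetical.

```python
import math


def max_cross_correlation(signal, baseline):
    """Maximum of the energy-normalized cross-correlation over all lags."""
    n, m = len(signal), len(baseline)
    norm = math.sqrt(sum(x * x for x in signal) * sum(y * y for y in baseline))
    if norm == 0.0:
        return 0.0
    best = -math.inf
    for lag in range(-(m - 1), n):
        acc = 0.0
        for j in range(m):
            i = lag + j
            if 0 <= i < n:
                acc += signal[i] * baseline[j]
        best = max(best, acc)
    return best / norm


if __name__ == "__main__":
    baseline = [0.0, 1.0, 0.0, -1.0, 0.0]  # toy baseline, illustrative only
    delayed = [0.0, 0.0, 1.0, 0.0, -1.0]   # same burst, delayed by one sample
    print(max_cross_correlation(delayed, baseline))
```

The direct double loop is quadratic in the signal length, which is adequate for illustration; an FFT-based correlation would be the usual choice for full-length records.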

Frequency domain features
In this section, features extracted from the frequency domain representation of signals are described. This includes a feature that uses the Power Spectral Density (PSD) of the frequency domain signal and features that exploit the amplitude and phase of the transfer function for guided wave signals propagating in the coupons.

Power Spectral Density feature
The PSD is investigated based on features used in (Larrosa & Chang, 2011). The PSD is calculated from the reconstructed broadband signals initially generated and measured in a pitch and catch configuration. Then, the RMS, mean, median, standard deviation, variance, minimum, and maximum of the PSD are calculated and plotted against the respective impact number. Another 7 parameters are therefore extracted in this step.
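The PSD estimator is not specified here, so the following Python sketch uses a one-sided periodogram computed with a direct DFT as an assumed stand-in. It is only practical for short signals, and the function name is hypothetical.

```python
import cmath
import math


def periodogram(signal, fs):
    """One-sided periodogram PSD estimate using a direct DFT."""
    n = len(signal)
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        # Double every bin except DC (and Nyquist for even n) to fold in
        # the negative frequencies.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        freqs.append(k * fs / n)
        psd.append(scale * abs(coeff) ** 2 / (fs * n))
    return freqs, psd


if __name__ == "__main__":
    n, fs = 32, 32.0
    tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]  # toy signal
    freqs, psd = periodogram(tone, fs)
    print(freqs[psd.index(max(psd))])
```

The seven summary statistics listed above would then be taken over the returned PSD values.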

Transfer function feature
The transfer function in the frequency domain can be extracted using the sub-band generation technique when used in pitch and catch. This provides an amplitude and phase relationship for frequencies below 1 MHz. Based on observation and results reported in (Mulligan, Quaegebeur, Ostiguy, Masson, & Létourneau, 2013), five frequency ranges of the transfer function are isolated for feature extraction based on transducer resonance frequency: low frequency (< 200 kHz), below PZT resonance frequency (200 kHz - 400 kHz), PZT resonance frequency (400 kHz - 600 kHz), above PZT resonance frequency (600 kHz - 800 kHz), and high frequency (> 800 kHz). For each frequency range the RMS, mean, median, standard deviation, variance, minimum, and maximum of the amplitude and phase are calculated and plotted against the respective impact number for a total of 70 parameters.
Table 3. Summary of parameters extracted from the raw guided wave signals used in the prognostic model development process.
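The count of 70 parameters follows from 5 frequency ranges, 2 quantities (amplitude and phase), and 7 statistics each. A minimal Python sketch of this grouping is given below, assuming half-open band edges and population statistics (both assumptions); the names are hypothetical.

```python
import math
import statistics

# Band edges in kHz; half-open intervals [low, high) are an assumption.
BANDS_KHZ = [
    ("low", 0.0, 200.0),
    ("below_resonance", 200.0, 400.0),
    ("resonance", 400.0, 600.0),
    ("above_resonance", 600.0, 800.0),
    ("high", 800.0, 1000.0),
]


def seven_statistics(values):
    """RMS, mean, median, std, variance, min, and max of a sequence."""
    return {
        "rms": math.sqrt(sum(v * v for v in values) / len(values)),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "std": statistics.pstdev(values),
        "variance": statistics.pvariance(values),
        "min": min(values),
        "max": max(values),
    }


def transfer_function_features(freqs_khz, amplitude, phase):
    """Compute 5 bands x 2 quantities x 7 statistics = 70 features."""
    features = {}
    for name, low, high in BANDS_KHZ:
        idx = [i for i, f in enumerate(freqs_khz) if low <= f < high]
        for quantity, data in (("amplitude", amplitude), ("phase", phase)):
            stats = seven_statistics([data[i] for i in idx])
            for stat_name, value in stats.items():
                features[f"{name}_{quantity}_{stat_name}"] = value
    return features


if __name__ == "__main__":
    freqs = [10.0 * k for k in range(100)]  # toy frequency grid, 0 to 990 kHz
    feats = transfer_function_features(freqs, [1.0] * 100, [0.01 * k for k in range(100)])
    # Feature counts from the text: RMS set, dispersion, correlation, PSD, transfer function.
    print(len(feats), 7 + 32 + 35 + 7 + len(feats))
```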

Modelling and evaluation
Following data gathering and processing, and feature generation, any supervised learning algorithm can be applied to create prognostic models. Prognostic models are created and evaluated using training and testing datasets respectively. Often an iterative process by which features are added and removed is used to optimize model accuracy and reduce the feature set to remove redundant or irrelevant features. By comparing the results of each model, statistical analysis of model performance is conducted by comparing the predicted index to the actual index. In this case the index corresponds to the remaining number of impacts (RNI). Main statistics including: mean, standard deviation, and mean square error (MSE) of the error between the predicted and actual RNI are also calculated to evaluate the robustness of each model.
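The error statistics can be computed as in the following Python sketch. It assumes the error is the absolute difference between predicted and actual RNI and that the standard deviation is the population form; neither detail is stated here, and the function name is hypothetical.

```python
import statistics


def rni_error_statistics(predicted, actual):
    """Mean, standard deviation, and mean square error of the RNI error."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return {
        "mean_error": statistics.fmean(errors),
        "std_error": statistics.pstdev(errors),  # population form (assumption)
        "mse": statistics.fmean([e * e for e in errors]),
    }


if __name__ == "__main__":
    # Toy values for illustration only, not results from this work.
    print(rni_error_statistics(predicted=[10, 8, 6], actual=[9, 8, 8]))
```

With population statistics, the MSE equals the squared mean error plus the error variance, which gives a quick consistency check on reported values.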
With 151 total features, initial reduction of the feature set is performed using the attribute selection tool in the Weka suite of machine learning algorithms to reduce the size of the feature set for model training and testing (Witten & Frank, 2005). The attribute selector requires the addition of a class variable which is labelled as '1' for observations close to the failure point and '0' otherwise. Therefore, 7 impacts closest to the failure point are selected to be labelled as the damage size becomes severe at this point (depth = 0.45-0.55 mm, diameter = 4-7 mm). A wrapper subset attribute evaluator using a 'Naive Bayes' classifier is selected with the search functions: best first, genetic algorithm, greedy stepwise, rank search, random search, and exhaustive search. These algorithms provide a starting point to define a good feature set for model training and testing. Domain knowledge is also used to add and remove features that are neglected or suggested from the attribute selectors. A number of new input feature sets can therefore be defined at this stage.
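The class variable required by the attribute selector can be derived from the index attribute, as in this small Python sketch. It assumes the index convention described earlier ('-1' at the failure point); the function name is hypothetical.

```python
def add_class_label(indices, n_critical=7):
    """Return 1 for the n_critical instances closest to failure, else 0.

    `indices` are the negative index attributes, with -1 at the failure point.
    """
    return [1 if -n_critical <= idx <= -1 else 0 for idx in indices]


if __name__ == "__main__":
    indices = list(range(-25, 0))  # one plate with 25 impacts, for illustration
    print(add_class_label(indices))
```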
After the feature sets are assembled into a database, 5 of the 9 problems (plates) are used for model training and the other 4 are used for testing. Then, a Leave One Batch Out (LOBO) cross-validation technique is applied using the best feature set, where each problem is chosen individually to test the model trained by the other problems (8 training, 1 testing) in a round-robin fashion. LOBO is used to assess the variability between the gathered datasets (training and testing) processed by each model and to test model performance when smaller datasets are used (Zaluski et al., 2010). Four generic regression based models are used in this work: sequential minimal optimization (SMO) support vector machine, multilayer perceptron artificial neural network (ANN), linear regression (LR), and least mean square (LMS). Based on the results of the attribute selection step, feature subsets can be extracted from the global feature set database and used for training and testing each model. In order to iterate the data-driven prognostic model development process and obtain accurate prognostic models using feature subsets, an automated system named EBM3 (Environment for Building Models for Machinery Maintenance), presented in (Zaluski et al., 2010), is used. In addition to speeding up execution, the EBM3 system allows the research team to maximize reuse of software components and experimental methodologies between applications (Zaluski et al., 2010). Over 200 experiments in EBM3 are performed using the two-split configuration and various feature subsets. The goal of each experiment is to obtain the best prediction results while minimizing the required number of features. In this case, good estimation results are obtained using only 11 features of the original 151. The consistency for model training and testing of the feature subsets yielding the best model is later evaluated using LOBO.
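The LOBO round-robin can be expressed as a small Python generator. This is a generic sketch of the splitting logic only, not the EBM3 implementation; the function name is hypothetical.

```python
def leave_one_batch_out(problem_ids):
    """Yield (training_ids, testing_ids), holding out one problem at a time."""
    ids = sorted(set(problem_ids))
    for held_out in ids:
        yield [p for p in ids if p != held_out], [held_out]


if __name__ == "__main__":
    for train_ids, test_ids in leave_one_batch_out(range(1, 10)):  # 9 plates
        print("train:", train_ids, "test:", test_ids)
```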

Model performance
When selecting and separating training and testing datasets, one assumes that the same model evaluation results can be obtained for any combination. In other words, any combination of problem (plate) pairs used for training and testing datasets should provide the same results. In reality, some datasets are more robust than others in that similar tendencies are found within the data. If data leading up to a problem (plate) event behaved in a similar manner in every case, structural RNI prediction would be easy. Cross-validation provides a sense of the similarities between each dataset and tests model performance when smaller datasets are used. The cross-validation results are provided in remaining number of impacts in Tab overall. In all cases where large MSE is found, the final two observations closest to the failure point have large estimation errors. This indicates that the SMO, LR, and LMS models may perform optimally at estimating the RNI farther away from the failure point. In a next step, this may be useful when performing data clustering (Chen, Han, & Yu, 1996).

Model results
In order to associate a number of impacts to a critical damage size (failure threshold), the non-destructive phased array ultrasonic testing tool is used following each destructive impact to measure damage diameter versus impact number. Fig. 6 presents an experimental damage size versus number of impacts curve obtained by taking an average of all non-destructive phased array ultrasonic measurements for the training and testing coupons. A failure threshold is defined at impact number 25, corresponding to a damage diameter of ≃ 7.1 mm. In a real application, this threshold point is defined from analytical and experimental tests on the material. Model estimates using the testing dataset in remaining number of impacts are associated with a corresponding damage size based on the experimental damage size measurements for each testing plate and plotted in Fig. 6 with the average experimental damage size versus impact number measurements. From the figure, for an impact number lower than 5, the linear regression through the damage size estimation points suggests that damage exists prior to detection by the ultrasonic testing tool. Overall, however, both the experimental damage size measurements and damage size estimation points increase linearly towards the failure threshold as the impact number increases. The damage size estimation model nevertheless underestimates damage size, as shown by the slope of the regression fit to the damage size estimation points, which increases at a slower rate than that of the experimental damage size measurements. This is attributed to estimation errors between 15 and 20 impacts, where the features extracted from the guided wave signals likely suffer from a reduced sensitivity.
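The association between an RNI estimate and a damage size can be sketched in Python as follows. The failure threshold (impact 25, about 7.1 mm) is taken from the text, but the damage curve below is a hypothetical two-point placeholder, not the measured curve of Fig. 6, and linear interpolation is an assumed lookup method.

```python
FAILURE_IMPACT = 25  # failure threshold impact number from the text


def impact_number_from_rni(rni, failure_impact=FAILURE_IMPACT):
    """Convert a remaining number of impacts into an impact number."""
    return failure_impact - rni


def interpolate_diameter(impact, curve):
    """Linearly interpolate the damage diameter (mm) at a given impact number.

    `curve` is a list of (impact_number, diameter_mm) pairs sorted by impact.
    Values outside the curve are clamped to its end points.
    """
    if impact <= curve[0][0]:
        return curve[0][1]
    if impact >= curve[-1][0]:
        return curve[-1][1]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= impact <= x1:
            return y0 + (y1 - y0) * (impact - x0) / (x1 - x0)


if __name__ == "__main__":
    # Hypothetical placeholder curve: only the end point (25, 7.1) is from the text.
    placeholder_curve = [(5, 0.0), (25, 7.1)]
    estimated_rni = 10
    impact = impact_number_from_rni(estimated_rni)
    print(impact, interpolate_diameter(impact, placeholder_curve))
```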
The model results for estimated RNI plotted against actual RNI are presented in Fig. 7. The ANN algorithm is evaluated using the testing dataset composed of a subset of 11 features from the original 151, including: time domain signal mean, low PZT frequency range mean, low PZT resonance frequency range median, PZT resonance frequency range median and standard deviation, high PZT resonance frequency range maximum, high PZT resonance frequency phase mean and variance, variance of the PSD, and the low frequency range correlation variance. The model is capable of estimating RNI within a 30 impact range for all instances in the training and testing datasets. In Fig. 7, the slope of the linear regression fit for points within a 25 impact window is 0.51 with an intercept of 9.6. A slope close to unity with an intercept at 0 in training would represent an ideal model. If the model is then tested using a testing dataset demonstrating tendencies similar to those found in the training dataset, the prediction should match the actual parameter of interest (in this case RNI) for all instances. In reality, however, it is difficult in the data-driven approach to create testing datasets without variation from the training dataset. The slope of the linear regression fitted to the model results using the testing dataset is 5% lower than in the training case. This means that the model is overestimating the RNIs and a failure is detected by the model sooner than it actually occurs. In a real application, this would increase costs, as maintenance procedures would be initiated before they are necessary. In other words, structural components would be over-maintained (Zaluski et al., 2010; Létourneau et al., 2005). For the points close to the failure point, however, the model does succeed at estimating the RNIs within range. This indicates that with more problems (plates), a larger number of points could converge in this area. This would potentially increase the slope of the linear regression fit while reducing the intercept on the ordinate. Performance assessment based on model error indicates an RNI estimation error of 3.7±3.2 for training and 8.4±6.7 for testing, and an MSE of 24 for training and 114 for testing.
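The performance measures quoted above (slope and intercept of estimated versus actual RNI, mean error with its spread, and MSE) can be computed as in the following sketch. The data are hypothetical and only loosely resemble Fig. 7; the function name `assess` is an assumption, as is the use of the population standard deviation of the absolute error for the "±" term, since the text does not state which definition was used.

```python
from math import sqrt
from statistics import mean


def assess(actual, estimated):
    """Summarise agreement between actual and estimated RNI values."""
    mx, my = mean(actual), mean(estimated)
    sxx = sum((x - mx) ** 2 for x in actual)
    sxy = sum((x - mx) * (y - my) for x, y in zip(actual, estimated))
    slope = sxy / sxx                      # ideal model: 1
    intercept = my - slope * mx            # ideal model: 0
    abs_err = [abs(e - a) for a, e in zip(actual, estimated)]
    mean_err = mean(abs_err)
    # Assumption: spread reported as population standard deviation.
    spread = sqrt(mean([(e - mean_err) ** 2 for e in abs_err]))
    mse = mean([(e - a) ** 2 for a, e in zip(actual, estimated)])
    return {"slope": slope, "intercept": intercept,
            "mean_err": mean_err, "spread": spread, "mse": mse}


# Hypothetical values chosen to give a slope near 0.5 and a large intercept.
actual_rni = [0, 5, 10, 15, 20, 25]
estimated_rni = [10, 12, 15, 17, 20, 22]
report = assess(actual_rni, estimated_rni)
ideal = assess(actual_rni, actual_rni)
```

With a slope well below unity and a positive intercept, as in this hypothetical set, the estimates are compressed towards the middle of the RNI range, which is why the slope and intercept are read together rather than separately.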
Combining previous results leads to a more useful representation of RNI versus damage diameter, presented in Fig. 8. RUL in remaining number of impacts (RNI) is obtained by subtracting each estimated diameter for each impact number shown in Fig. 6 from the failure threshold (≃ 7.1 mm) and  4. By constructing a plot of RNIs versus damage diameter, RUL can be predicted by tracing a linear regression through the points. Note that in order to plot a linear regression, at least two points are required. As the number of points increases, the prediction accuracy also increases.
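The regression-based RUL prediction described above can be sketched as follows. This is a minimal illustration under stated assumptions: a straight line is fitted to hypothetical (damage diameter, estimated RNI) points and evaluated at a diameter of interest. The data values and the function names `fit_rni_vs_diameter` and `predict_rni` are not taken from the paper.

```python
from statistics import mean


def fit_rni_vs_diameter(diameters_mm, rni_estimates):
    """Least-squares line RNI = slope * diameter + intercept."""
    if len(diameters_mm) < 2 or len(diameters_mm) != len(rni_estimates):
        raise ValueError("at least two matching points are required")
    mx, my = mean(diameters_mm), mean(rni_estimates)
    sxx = sum((x - mx) ** 2 for x in diameters_mm)
    if sxx == 0:
        raise ValueError("diameters must not all be identical")
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(diameters_mm, rni_estimates))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_rni(diameter_mm, slope, intercept):
    """Predicted RNI at a given diameter, floored at zero."""
    return max(slope * diameter_mm + intercept, 0.0)


# Hypothetical points: RNI decreases as the damage diameter grows.
slope, intercept = fit_rni_vs_diameter([1.0, 3.0, 5.0], [20, 14, 8])
rni_at_threshold = predict_rni(7.1, slope, intercept)
```

The explicit check on the number of points mirrors the requirement that at least two points are needed before a regression line can be drawn.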

DISCUSSION
To summarize, using a data-driven approach for estimating RNI following the initiation of impact damage in CFRP aerospace structures, desirable model performance is achieved based on operational data gathered in a laboratory environment. Key considerations are presented in this paper for defining a robust and realistic laboratory data acquisition and damage initiation methodology. This is an important foundation to build upon, so that new features relevant to damage prognosis can be explored before larger-scale attempts are made on real aircraft. Guided wave measurements made with SHM technologies in service are evidently subject to variations induced by temperature (Konstantinidis, Wilcox, & Drinkwater, 2007; Croxford, Moll, Wilcox, & Michael, 2010), hydrostatic pressure (Croxford et al., 2010), and the impact damage itself, among other factors. In fact, damage has been shown in (Mulligan, Ostiguy, Masson, Elkoun, & Quaegebeur, 2011; Mulligan, Masson, Létourneau, & Quaegebeur, 2011; Mulligan, Quaegebeur, Masson, & Létourneau, 2012; Mulligan, Quaegebeur, Masson, Brault, & Yang, 2013) to significantly influence guided wave signals through degradation of the bonding layer between the transducer and the structural surface. Compensation strategies for temperature have been proposed in (Croxford et al., 2010). A signal correction factor (SCF) has been proposed to compensate for bonding layer degradation, based on transducer electrical admittance measurements demonstrated in (Park, Park, Yun, & Farrar, 2009) as a potential bonding layer degradation assessment metric. With a robust foundation for a data gathering methodology in which data are acquired with consistent damage that mimics potential in-service damage, investigations into these variations are now possible by building on the current methodology.
Four types of features are presented in this work: time domain, frequency domain, wave dispersion, and correlation features. Although the best results used only a combination of time domain, frequency domain, and correlation features, the results that included wave dispersion features were close, with slopes and intercepts of the predicted versus actual RNI within 5%, while correctly estimating remaining impact numbers within range for most instances. Variations between the estimation, prediction, and measurement results are likely related to variability between the data gathered from the testing plates, as indicated by the cross-validation. These uncertainties can be attributed in part to bonding layer degradation of the transducers caused by the impacts. Damage initiation using drop-weight impacts has been shown to induce transducer bonding layer degradation (Mulligan, Quaegebeur, Ostiguy, et al., 2013). Bonding layer degradation in any amount influences the amplitude and phase of guided wave bursts measured in a pitch and catch configuration. Changes to the amplitude and phase of a guided wave burst can drastically influence the extracted amplitude, phase, dispersion, and correlation features. Variability is introduced through comparison of these features with baseline measurements made prior to damage initiation, which are not influenced by impact damage. Another possible cause of reduced model performance is that data clustering is not used in this work. Clustering is commonly used in data-driven damage prognosis, where instances of the feature set can be grouped into similar categories. In this work, based on the loading pattern, instances could be separated into 4 sections based on damage severity. Clustering has demonstrated that some features are more sensitive in some categories than in others (Chen et al., 1996). Perhaps the wave dispersion features could perform better in assessing certain severities of damage than others. Furthermore, model evaluation has shown that some models may be more appropriate for estimating RUL further from the critical damage size. Using clustering, these models could be applied only to such observations. With these considerations in mind, a number of questions remain to be answered in this domain. However, promising results demonstrating that impact damage is predictable are shown in this work. To overcome the aleatory uncertainty in the training process, more and different problems and features, together with clustering and data correction techniques, are required.
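The idea of separating instances into 4 sections by damage severity can be illustrated with a simple sketch. Note that this is fixed, equal-width binning on impact number, not a clustering algorithm, and the equal-width boundaries are an assumption: the text does not specify how the sections would be defined. The function names are hypothetical.

```python
def severity_section(impact_number, failure_impact=25, n_sections=4):
    """Return a section index in 0..n_sections-1 for an impact number >= 1.

    Assumption: sections are equal-width intervals of impact number
    between the first impact and the failure threshold.
    """
    width = failure_impact / n_sections
    index = int((impact_number - 1) // width)
    return min(index, n_sections - 1)


def group_by_section(instances):
    """Group (impact_number, feature_vector) tuples by severity section."""
    groups = {}
    for impact_number, features in instances:
        section = severity_section(impact_number)
        groups.setdefault(section, []).append(features)
    return groups


# Hypothetical instances: (impact number, feature vector).
instances = [(1, [0.1]), (8, [0.3]), (14, [0.5]), (25, [0.9]), (2, [0.2])]
groups = group_by_section(instances)
```

A separate model, or a separate feature subset, could then be evaluated per section, for example to check whether the wave dispersion features are more informative at particular damage severities.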

CONCLUSION
In this work, a novel data-driven approach for estimating RNI and damage size for aerospace-grade CFRP coupons exposed to drop-weight impact damage is presented. The approach relies on a PZT-based SHM system instrumented onto the CFRP coupons in a pitch and catch configuration for data acquisition. Time and frequency domain data are acquired for training and testing several generic regression-based machine learning models on several CFRP coupons over a specified loading pattern. The laboratory data are gathered using a methodology designed as a foundation to build upon for simulating real-world operating conditions. The purpose of this first step is to evaluate the feasibility of assessing structural damage using damage prognosis.
In future work, signal correction will be applied to the gathered data to compensate for bonding layer degradation induced by the impact damage. This will ensure that only structural damage is assessed, excluding damage to the PZT SHM system. The acquisition of data for new problems (plates) is expected to reduce the variability observed in the cross-validation step of this work and, hopefully, to improve the agreement between predicted and actual RUL, adding confidence for eventual testing on in-service aircraft.
The ultimate goal of building upon this work is to determine if there exists a feature set capable of revealing damage tendencies prior to detection using a non-destructive phased array ultrasonic testing system and to determine if data mining and machine learning combined with SHM provide equivalent or more accurate measures of damage compared to ultrasonic systems.
Nicolas Quaegebeur obtained his bachelor's degree in engineering physics from Ecole Nationale de Techniques Avancées (ENSTA ParisTech, France) and his master's degree from ATIAM (IRCAM, University of Paris and Telecom ParisTech) with specialization in acoustics, signal processing, and informatics applied to music in 2004. He obtained his Ph.D. degree on nonlinear vibrations and acoustic radiation of electrodynamic transducers from Ecole Polytechnique (ParisTech, France) in 2007. He joined the GAUS group at the Université de Sherbrooke for a postdoctoral fellowship on active vibration control in 2008 and on structural health monitoring using guided waves in 2009, before becoming a research assistant in the same group in 2011. His research interests include nonlinear vibrations, advanced signal processing, active vibration control, piezoelectric transducers, guided wave propagation modelling, and structural health monitoring.
Patrice Masson received his bachelor's degree in engineering physics and his master's degree in mechanical engineering from École Polytechnique de Montréal, Canada, in 1989 and 1991, respectively. He obtained his Ph.D. degree in mechanical engineering from the Université de Sherbrooke, Canada, in 1997. In 2000, he became a faculty member in the Mechanical Engineering department of the Université de Sherbrooke, where he is now the director of the Acoustics and Vibration Group of the Université de Sherbrooke (GAUS). He contributed to establishing the new curriculum and teaches the mechatronics courses. Prof. Masson leads or co-leads multi-institutional projects within AUTO21 and the CRIAQ. His research interests include structural health monitoring, signal processing, smart materials and structures, and active noise and vibration control. Prof. Masson is a member of the Acoustical Society of America (ASA), the Society of Automotive Engineers (SAE), the International Society for Optical Engineering (SPIE), and the Ordre des ingénieurs du Québec (OIQ).