Carbide Coated Insert Health Monitoring Using Machine Learning Approach through Vibration Analysis

Growth in the manufacturing sector demands high-volume production with precision, accuracy, tight tolerances, and quality. These factors must be ensured for every job, and they depend on the condition of the tool used for manufacturing. Many methods have been proposed for tool condition monitoring, based on data acquired through various acquisition techniques. Despite intensive scientific research for more than a decade, the development of tool condition monitoring is still ongoing. The proposed method monitors the health condition of carbide inserts using vibration analysis. The statistical information extracted from the vibration signals was analyzed using a machine learning approach in order to predict the tool condition.


INTRODUCTION
In manufacturing industries, single point cutting tool inserts and drill bits are widely used as tools for machining components. The machining operation is carried out continuously when demand is high. Each time, an operator has to monitor the finish of the material to ensure that quality meets the standards. Any deviation can lead to a huge loss for the manufacturing firm. In such a scenario, the machine idle time increases and the production process is also affected. Hence, an approach is needed that helps the operator identify the condition of the tool, saving time and reducing the losses caused by machine downtime. Tool condition monitoring is one such approach, which can help minimize these losses in order to achieve better production rates. Kurada & Bradley (1997) indicated that tool breakage is responsible for losses not only in terms of time but also in terms of invested capital, due to unscheduled stoppages which are about 7-20%.
Even when the tool does not break, tool wear can add to the losses, since the use of a dull or worn tool can add strain to the machine system and can reduce the quality of the finished workpiece. Karandikar, Ali Abbas and Schmitz (2013) proposed a Bayesian model for finding the remaining lifetime of a turning tool by measuring the tool wear.
Use of the same tool for a prolonged period will lead to wear. Tool wear is directly related to vibration: the use of a worn-out tool often leads to an increase in vibrations. These vibrations have been considered an important aspect of tool condition monitoring methods. By analyzing these vibration signals, the condition of the tool can be identified. In many tool condition monitoring studies, vibration signals were used for predicting the tool condition (Das, Chattopadhyay & Murthy, 1996); (Dimla, 2000). The vibrations can be captured using sensors such as accelerometers, and the resulting signals can be processed using signal processing techniques (Chen & Chen, 1999). Bernhard Sick (2002) reported vibration-based online tool wear monitoring techniques for tool condition monitoring. In various industrial applications, vibration signals have been used for condition monitoring (Anil Kumar, Gurmeet Singh & Naikan, 2015); (Chao Jin, Ompusunggu, Zongchang Liu, Ardakani, Fredrik Petre & Jay Lee, 2015). Hence, in this study, vibration signals have been considered for the condition monitoring process.
The vibration signals acquired under good and faulty conditions need to be analyzed to obtain the tool condition. There are many approaches for monitoring the condition through vibration analysis. In recent years, machine learning has been adopted for monitoring in many applications (Witten, Eibe & Ian, 2002). Machine learning is an area of artificial intelligence concerned with developing models that can learn from data. More specifically, machine learning is a method for creating an algorithmic model through data analysis. The machine learning approach develops a model to train, test, and classify the data based on its information (Elangovan, Sakthivel, Saravanamurugan, Nair & Sugumaran, 2015). The machine learning approach consists of three steps: feature extraction, feature selection, and feature classification.
The vibration signal carries information in the form of features such as statistical features (Lu & Jain, 2006); (Jegadeeshwaran & Sugumaran, 2015), histogram features (Yang Bai, Lihua Guo, Lianwen Jin, & Qinghua Huang, 2009), and wavelet features (Yen, Gary & Lin, 2000); (Abdulhamit, 2007). The statistical features are the basis for the histogram and wavelet features. In one study, statistical learning was suggested for identifying faults in a multi-stage manufacturing process (Xiaorui Tong, Ardakani, David Siegel, Ellen Gamel & Jay Lee, 2017). In this study, statistical features extracted from the vibration signals were used for monitoring the tool condition.
Feature selection is the next step after feature extraction. Feature selection is the process of removing insignificant features from the data set. Many techniques, such as the decision tree and principal component analysis (PCA), have been reported for feature selection. Song, Guo, and Mei (2010) proposed principal component analysis for feature selection. However, PCA always relies on linear assumptions, and PCA models have trouble with large numbers of data points. Dash and Liu (1997) proposed the decision tree for selecting good features. A novel hybrid classification system based on the J48 algorithm was proposed for both feature selection and feature classification of multi-class problems. Sugumaran, Muralidharan, and Ramachandran (2007) used a decision tree for selecting good features from the extracted features of a roller bearing.
Elangovan, Babudevasenapathy, Sakthivel, and Ramachandran (2011) used a decision tree for feature selection in a tool condition monitoring study. A decision tree can be represented more compactly as an influence diagram. Hence, in the present study, the decision tree was used for feature selection.
The final step in the machine learning approach is feature classification. An expert system was developed using ANN to predict the tool wear in turning and milling tools (Silva, Reuben & Wilcox, 1998); (Ghosh, Ravi, Patra, Mukhopadhyay, Paul, Mohanty & Chattopadhyay, 2005). Chen and Jen (2000) suggested a data fusion neural network model for monitoring the condition of a milling cutter. Elangovan, Babudevasenapati, Sakthivel, and Ramachandran (2011) developed an expert system for condition monitoring of a single point cutting tool using the decision tree algorithm. In another study, the K-Star algorithm was proposed for tool condition monitoring using statistical features (Sanidhya, Elangovan & Sugumaran, 2014). A support vector machine algorithm using statistical features was studied for monitoring the condition of a single point cutting tool (Elangovan, Babudevasenapati, & Ramachandran, 2009). Several machine learning algorithms, such as the best first tree (Jegadeeshwaran & Sugumaran, 2013) and proximal support vector machines (Saimurugan, Ramachandran, Sugumaran, and Sakthivel, 2011), were reported to achieve better results in various condition monitoring studies. However, there are only limited studies on the condition monitoring of insert-fitted tools. Das, Roy, and Chattopadhyay (1996) used an ANN model for predicting wear on carbide inserts. An experimental study was conducted to measure tool wear and the cutting force variations in the end milling of Inconel 718 with a coated carbide insert (Li, Zeng & Chen, 2006). However, the literature on carbide insert fitted tool health monitoring using machine learning is almost nil. Hence, in this study, carbide insert condition monitoring has been performed using machine learning algorithms, namely the decision tree and the random tree. The randomization can be improved by a random tree, in which the base learner randomly chooses both the feature on which to split and the split itself. Since it does not optimize over either the feature or the location of the split, it is very easy to code and very fast to fit. Buntine and Niblett (1992) studied the possibilities of using the random tree for fault diagnosis. The time taken to build the classifier model is more compared to the decision tree algorithm. Hence, there are limited studies using a random tree. However, the random tree produced better classification results than the decision tree algorithm.
Contributions in the present work are the following:
1. The procedure for acquiring the vibration signals under various fault conditions has been explained.
2. From the vibration signals, a set of statistical features was extracted.
3. The contributing features were selected using a decision tree.
4. J48 and random tree were used as classifiers. All the algorithms were trained and the results were compared.
The results show the effectiveness of the features that were extracted from the acquired vibration signals.

EXPERIMENTAL STUDY
In this paper, an attempt was made to apply a machine learning technique to predict the tool health using vibration signals. Figure 1 shows the experimental setup used for acquiring the vibration signals. The number of samples was 67 (arbitrarily chosen). The experiment was conducted in two phases. In the first phase, the insert was in good condition. The vibration signals for each parameter were acquired while the other two parameters were held constant, and the corresponding vibration signals were recorded. Figure 3 shows the experimental procedure for acquiring the vibration signals. Under each set of parameters, the predictability of the classifier model was tested. The parameters under which the maximum accuracy was obtained were selected for the fault diagnosis study. In the fault diagnosis study, under each fault condition, the relevant vibration signals were acquired with the parameters selected in phase 1. The acquired vibration signals were processed using the machine learning approach.

MACHINE LEARNING APPROACH
As discussed earlier, the machine learning approach consists of three basic steps: (i) feature extraction; (ii) feature selection; (iii) feature classification.

Feature Extraction
Feature extraction is the process of extracting informative and non-redundant data from a large set of measured values. These features represent the measured data in a more informative way and are helpful in further analysis of the required information. The information contained in the signals may be in the form of statistical features and/or histogram features. Statistical information such as sample variance, standard error, kurtosis, skewness, minimum, standard deviation, maximum, count, mean, median, mode, and sum was extracted from the raw vibration signals under each condition using a suitable feature extraction technique. These features were extracted using Visual Basic code in Excel.
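As an illustration of this step, the following is a minimal Python sketch that computes the listed statistics for one vibration sample. It is not the Visual Basic code used in the study. The function name `extract_features`, the sample values, and the moment-based (population) definitions of skewness and kurtosis are assumptions; a spreadsheet may use bias-corrected variants that give slightly different numbers. Range is included because it appears among the selected features later in the paper.

```python
import math
import statistics

def extract_features(signal):
    """Return descriptive statistics for one vibration sample (list of floats)."""
    n = len(signal)
    mean = statistics.fmean(signal)
    std = statistics.stdev(signal)  # sample standard deviation (n - 1 divisor)
    # Central moments; population form is an assumption made for this sketch.
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m3 = sum((x - mean) ** 3 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    return {
        "mean": mean,
        "standard_error": std / math.sqrt(n),
        "median": statistics.median(signal),
        "mode": statistics.mode(signal),
        "standard_deviation": std,
        "sample_variance": statistics.variance(signal),
        "kurtosis": m4 / m2 ** 2 - 3.0,   # excess kurtosis
        "skewness": m3 / m2 ** 1.5,
        "range": max(signal) - min(signal),
        "minimum": min(signal),
        "maximum": max(signal),
        "sum": math.fsum(signal),
        "count": n,
    }

# Hypothetical amplitudes, for illustration only.
features = extract_features([0.1, -0.2, 0.4, 0.1, -0.3, 0.1])
```

Each acquired signal would yield one such feature vector, and the vectors for all conditions together form the data set passed to feature selection.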

Feature Selection
Feature selection was done in two ways: (i) decision tree; (ii) effect of number of features study. In this study, the decision trees generated by both algorithms were used for feature selection. All the extracted features were fed as input to the algorithm, and the output was a decision tree, as shown in Figure 6 and Figure 7. The contributing features were identified from the decision tree using a top-down approach. Referring to Figure 6, only six features contributed to classification using J48: minimum, mean, range, kurtosis, sample variance, and standard error. Referring to Figure 7, the top eight features contributed to classification using the random tree: kurtosis, minimum, median, standard deviation, skewness, sample variance, maximum, and range. In both cases, the features were ordered for classification according to their contribution.
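The top-down reading of a tree can be sketched as a breadth-first walk that records each tested feature the first time it appears. The nested-dictionary tree format, the function name `contributing_features`, and the toy tree below (its shape, thresholds, and labels) are hypothetical and do not reproduce Figure 6 or Figure 7.

```python
from collections import deque

def contributing_features(root):
    """List the features tested in a tree, ordered top-down (breadth-first)."""
    ordered = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if "label" in node:  # leaf node: carries a class label, tests nothing
            continue
        if node["feature"] not in ordered:
            ordered.append(node["feature"])
        queue.append(node["left"])
        queue.append(node["right"])
    return ordered

# Hypothetical toy tree; thresholds and structure are illustrative only.
toy_tree = {
    "feature": "minimum", "threshold": -0.5,
    "left": {"feature": "mean", "threshold": 0.0,
             "left": {"label": "good"},
             "right": {"label": "flank wear"}},
    "right": {"feature": "kurtosis", "threshold": 3.0,
              "left": {"label": "thermal wear"},
              "right": {"feature": "mean", "threshold": 0.2,
                        "left": {"label": "good"},
                        "right": {"label": "flank wear"}}},
}

selected = contributing_features(toy_tree)
```

Features that never appear in any internal node are the ones dropped by this kind of selection.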

Feature Classification
Classification is the task of assigning a category to a new set of observations by comparing them with an established data set whose category membership is known. An algorithm that implements classification is called a classifier. The classifiers used in this study are the J48 decision tree and the random tree.

Feature classification using J48 algorithm
A decision tree is a tree-based knowledge representation methodology used to represent classification rules. Decision tree learning is one of the most popular learning approaches in classification because it is fast and produces models with good performance. Generally, decision tree algorithms are especially good for classification learning if the training instances have errors (i.e., noisy data) and attributes have missing values. A decision tree is an arrangement of tests on attributes in internal nodes, and each test leads to the split of a node. Each terminal node is then assigned a classification.
A standard tree consists of a number of branches, one root, nodes, and leaves. One branch is a chain of nodes from the root to a leaf, and each node involves one attribute. The occurrence of an attribute in a tree provides information about the importance of the associated attribute. The J48 decision tree algorithm is widely used to construct decision trees (Figure 6). The procedure of forming the decision tree and exploiting it for feature selection is characterized by the following:
1. The selected set of statistical features was given as input to the algorithm; the output from the algorithm is the decision tree.
2. The decision tree has leaf nodes, which represent class labels, and other nodes associated with the classes being classified.
3. The branches of the tree represent each possible value of the feature node from which they originate.
4. The decision tree can be used to classify feature vectors by starting at the root of the tree and moving through it until a leaf node, which provides a classification of the instance, is identified.
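The root-to-leaf classification described in the last step can be sketched as follows. The nested-dictionary tree, the binary "less than or equal to threshold" test at each node, and all thresholds and feature values are assumptions made for illustration; they are not taken from the tree in Figure 6.

```python
def classify(node, feature_vector):
    """Walk from the root to a leaf and return the leaf's class label."""
    while "label" not in node:
        # Assumed test form: go left when the feature value is <= threshold.
        value = feature_vector[node["feature"]]
        node = node["left"] if value <= node["threshold"] else node["right"]
    return node["label"]

# Hypothetical two-level tree, for illustration only.
toy_tree = {
    "feature": "minimum", "threshold": -0.5,
    "left": {"label": "flank wear"},
    "right": {"feature": "kurtosis", "threshold": 3.0,
              "left": {"label": "good"},
              "right": {"label": "thermal wear"}},
}

label_a = classify(toy_tree, {"minimum": -0.8, "kurtosis": 2.1})
label_b = classify(toy_tree, {"minimum": -0.1, "kurtosis": 2.1})
label_c = classify(toy_tree, {"minimum": -0.1, "kurtosis": 4.5})
```

Only the features on the path actually taken are consulted, which is why a feature absent from the tree has no influence on the result.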

Random tree algorithm
The random tree algorithm can deal with both classification and regression problems. The random tree is a collection (ensemble) of tree predictors called a forest. The classification works as follows: the random trees classifier takes the input feature vector, classifies it with every tree in the forest, and outputs the class label that received the majority of "votes". In the case of regression, the classifier response is the average of the responses of all the trees in the forest. All the trees were trained with the same parameters but on different training sets. These sets were generated from the original training set using the following bootstrap procedure:
1. The same number of vectors was chosen randomly with replacement for each training set. Some vectors will occur more than once and some will be absent.
2. At each node of each trained tree, a random subset of the variables was used to find the best split instead of using all the features. With each node, a new subset of fixed size was generated.
In random trees, there is no need for any accuracy estimation procedure, such as cross-validation or bootstrap, or for a separate test set to get an estimate of the training error. The error is estimated internally during training. When the training set for the current tree is drawn by sampling with replacement, some vectors are left out (out-of-bag (OOB) data). The classification error is estimated using this OOB data.
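The three ingredients of this procedure (bootstrap sampling, a random feature subset per node, and majority voting) can be sketched with the Python standard library. The function names, the seed, and the example values are hypothetical; tree construction itself is omitted.

```python
import random
from collections import Counter

def bootstrap_indices(n, rng):
    """Draw n indices with replacement; some repeat and some are absent."""
    return [rng.randrange(n) for _ in range(n)]

def random_feature_subset(feature_names, k, rng):
    """Pick a fixed-size subset of features to consider at one node."""
    return rng.sample(feature_names, k)

def majority_vote(labels):
    """Return the class label predicted by the most trees."""
    return Counter(labels).most_common(1)[0][0]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = bootstrap_indices(10, rng)
subset = random_feature_subset(["kurtosis", "minimum", "median", "range"], 2, rng)
winner = majority_vote(["good", "flank wear", "good"])
```

In a full implementation, each tree would be grown on its own `sample`, drawing a fresh `subset` at every node, and `majority_vote` would combine the per-tree predictions.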
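A minimal sketch of the out-of-bag idea for a single tree is given below. The helper names and the stand-in predictor are hypothetical, and a full forest would aggregate, for each vector, only the votes of the trees that did not see it during training.

```python
def out_of_bag(n, sample_indices):
    """Indices of the n training vectors that the bootstrap sample left out."""
    return sorted(set(range(n)) - set(sample_indices))

def oob_error(oob_indices, labels, predict):
    """Fraction of out-of-bag vectors that a tree misclassifies."""
    if not oob_indices:
        return 0.0
    wrong = sum(1 for i in oob_indices if predict(i) != labels[i])
    return wrong / len(oob_indices)

# Hypothetical bootstrap sample over 6 vectors; indices 2 and 5 are left out.
oob = out_of_bag(6, [0, 1, 1, 3, 4, 4])
labels = ["good", "good", "flank wear", "good", "flank wear", "good"]
# Stand-in predictor that always answers "good", for illustration only.
error = oob_error(oob, labels, lambda i: "good")
```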

RESULTS AND DISCUSSION
Condition monitoring of the carbide insert was studied using a machine learning technique. The selected classifier models were tested to find the prediction accuracy.

Parameter prediction
The experiment was conducted under five different speed conditions, five feeds, and three depths of cut. The insert was kept in good condition, and the vibration signals under each parameter combination were recorded. From the acquired vibration signals, the statistical features were extracted. The extracted features were classified using the J48 decision tree algorithm and the random tree classifier. The input to the algorithms is the set of statistical features, and the output is the classification accuracy, as shown in Table 1.

Effect of number of features study
The results were first obtained using all the features (Table 2).
Not all the extracted features may be required for classification. Hence, the feature selection process was carried out using the decision tree and the effect of number of features study. Initially, the decision tree was generated under the predicted parameter combinations. Based on the feature order obtained from the decision tree, the classification accuracy was found. First, the top feature alone was selected and classified using both the decision tree and the random tree. The corresponding classification accuracy was noted, as shown in Table 2.

The second feature from the decision tree was combined with the first feature, and the combination was classified using the decision tree algorithm. The third feature was then added to the previous combination and classified using the decision tree algorithm. The same procedure was repeated until all the feature combinations were classified. Table 2 shows the effect of number of features study. Referring to Table 2, J48 produces the maximum classification accuracy with both 6 and 11 features, whereas the random tree produces the maximum classification accuracy with 11 features.
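The incremental procedure can be sketched as a loop over growing prefixes of the ranked feature list. The function names are hypothetical, and `evaluate` is a placeholder for training and scoring a classifier on a given feature subset; the dummy scoring function below exists only to make the sketch runnable and does not represent the accuracies in Table 2.

```python
def feature_count_study(ranked_features, evaluate):
    """Score the top-1, top-2, ... top-n feature subsets in ranked order."""
    results = []
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        results.append((k, evaluate(subset)))
    return results

def best_feature_counts(results):
    """Return every feature count that reaches the maximum score."""
    best = max(score for _, score in results)
    return [k for k, score in results if score == best]

# Placeholder scorer: saturates after three features (illustrative only).
def dummy_evaluate(subset):
    return min(len(subset), 3) / 3

ranked = ["minimum", "mean", "range", "kurtosis"]
study = feature_count_study(ranked, dummy_evaluate)
best_counts = best_feature_counts(study)
```

When several counts tie for the maximum, as reported for J48, the smallest count gives the most compact model with the same accuracy.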

Feature classification using J48 decision tree algorithm
From the decision tree (Figure 6), the top six features were selected for classification. The same was verified with the effect of number of features study (Table 2). The selected features are minimum, mean, range, kurtosis, sample variance, and standard error. The selected features alone were classified using the J48 decision tree algorithm. The classification accuracy is presented in the form of a confusion matrix, as shown in Table 3.

Table 3. Confusion matrix - J48 Decision Tree algorithm
The confusion matrix is a square matrix that summarizes the classification accuracy. The diagonal elements of the confusion matrix are the correctly classified data points, and the off-diagonal elements are the misclassified data points. In the confusion matrix, the first row represents the data points belonging to the good condition. The first element of that row is the number of data points correctly classified as good. Out of 67 data points, 66 were correctly classified as good, and one data point was misclassified as thermal wear.
The second row represents flank wear. The second element of that row is the number of data points that were correctly classified. Out of 67 data points, 64 were correctly classified, and three data points were misclassified as thermal wear. The classification and misclassification details of the remaining classes can be read from the confusion matrix in the same way. The accuracy of each individual class can be studied using the detailed accuracy by class, shown in Table 4. Among the 268 data points belonging to all conditions, 14 data points were misclassified. The overall accuracy of the J48 decision tree classifier was found to be 94.78 %. This classification result was obtained through a 10-fold cross-validation process.
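How overall accuracy and per-class accuracy are read from a confusion matrix can be sketched as below. The 3 x 3 matrix is a made-up toy example with rows as actual classes and columns as predicted classes; it is not the matrix of Table 3, and the function names are hypothetical.

```python
def overall_accuracy(matrix):
    """Sum of the diagonal divided by the total number of data points."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_recall(matrix):
    """For each actual class (row), the fraction classified correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

# Toy matrix for illustration only (rows: actual, columns: predicted).
toy_matrix = [
    [9, 0, 1],
    [0, 8, 2],
    [1, 0, 9],
]

accuracy = overall_accuracy(toy_matrix)
recall = per_class_recall(toy_matrix)
```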
In machine learning, the classification accuracy of a model is strongly affected by over-fitting. Over-fitting normally occurs when the data set is too small and the number of parameters in the model is large. Over-fitting can be partially reduced by methods such as leave-one-out cross-validation (LOOCV), early stopping (Qi et al., 2004), regularisation (Cawley and Talbot, 2007), and hyper-parameter averaging (Hall and Robinson, 2009).
A recent study suggested k-fold cross-validation for reducing over-fitting problems (2010). In this study also, 10-fold cross-validation has been used to overcome over-fitting problems.
In 10-fold cross-validation, the original sample is randomly partitioned into 10 equal-size subsamples. Out of the 10 subsamples, a single subsample is retained as the validation data for testing the model, and the remaining 9 subsamples are used as training data. The cross-validation process is then repeated 10 times (the folds), with each of the 10 subsamples used exactly once as the validation data. The 10 results from the folds can then be averaged to produce a single estimate. The advantage of this method is that all observations are used for both training and validation, and each observation is used for validation exactly once. The summary of the classification study is given in Table 5.
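The partitioning can be sketched in a few lines of Python. The function name `k_fold_splits` and the seed are hypothetical, and n = 268 is taken as the data set size implied elsewhere in the paper (200 training plus 68 test points). Because 268 is not divisible by 10, the folds here differ in size by at most one.

```python
import random

def k_fold_splits(n, k=10, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)        # random partition
    folds = [indices[i::k] for i in range(k)]   # k nearly equal subsamples
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, validation

splits = list(k_fold_splits(268, k=10))
```

A model would be trained and scored once per pair, and the ten scores averaged to give the single cross-validated estimate.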
Referring to the confusion matrix, none of the fault conditions were misclassified as the good condition. Hence, J48 can be used for the fault diagnosis study. The built decision tree model can also be tested using unseen data. In this process, in each condition, out of 67 data points, 50 were used for training and 17 were used as unseen data (test data for validation). Including all conditions, 200 data points were used for training and 68 were used for testing. Table 6 shows the confusion matrix obtained for the test data. Among the 68 unseen data points, 64 were correctly classified. Hence, the overall classification accuracy was found to be 94.11 %. Even though the process is the same, the 10-fold cross-validation gives a better result.
The features extracted from the vibration signals were used to generate a decision tree from the random tree algorithm.
From the decision tree, only the following features, which contribute to classification, were selected: kurtosis, minimum, median, standard deviation, skewness, sample variance, maximum, and range. The same has been verified using the effect of number of features study (Table 2). The selected features were classified using the random tree algorithm. The classification accuracy is presented as a confusion matrix, as shown in Table 7. The classification accuracy was found using the 10-fold cross-validation process. The summary of the classification accuracy is given in Table 9. The trained model was tested using the test data set. Out of 68 data points, 66 were correctly classified, and the overall classification accuracy was found to be 97.06 %.
Table 10 shows the confusion matrix obtained for the test data. Comparatively, the 10-fold cross-validation produced a better result.

Comparative study
For monitoring the insert condition, two algorithm models were selected for diagnosing the faults (Table 11). The decision tree algorithm produced 94.78 % accuracy, whereas the random tree produced 98.13 %. The tool inserts must be monitored continuously to obtain better and more reliable products. In this scenario, the random tree can be used to obtain better classification accuracy than the decision tree.
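The reported percentages follow directly from the counts quoted in the text, as the short check below illustrates. The helper name `accuracy_percent` is hypothetical, and the total of 268 points is an assumption based on the 200 training and 68 test points stated for the unseen-data experiment. Only counts that appear in the text are used.

```python
def accuracy_percent(correct, total):
    """Classification accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * correct / total, 2)

# J48, 10-fold cross-validation: 14 of 268 points misclassified.
j48_cross_validation = accuracy_percent(268 - 14, 268)
# J48, unseen data: 64 of 68 points correct.
j48_unseen = accuracy_percent(64, 68)
# Random tree, unseen data: 66 of 68 points correct.
random_tree_unseen = accuracy_percent(66, 68)
```

With rounding, 64 of 68 gives 94.12 %, while the text reports 94.11 %, which corresponds to truncating the same ratio.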

CONCLUSION
In this study, the J48 decision tree algorithm and the random tree classifier were used to study the condition monitoring of a carbide insert tool with the help of acquired vibration signals. From the vibration signals, a set of twelve statistical features was extracted. Using the decision tree, the contributing features were selected. The selected features were classified using the J48 decision tree algorithm and the random tree classifier. The accuracy of the J48 algorithm was found to be 94.78 %, while the random tree classifier achieved an accuracy of 98.13 %. The features, unless the data is in abundance, will not cause a problem. Hence, considering the above study, the random tree classifier can be used for carbide insert tool condition monitoring.
The above study can be extended to uncoated inserts to make a comparison, which will help in understanding the behavior of the algorithm.

Figure 3. Experimental procedure
Figure 4 and Figure 5 show the sample vibration signals acquired from the setup.

Table 1. The effect of number of features
From Table 1, it is seen that the J48 classifier shows the most correctly classified instances for a speed of 770 and a feed of 0.06. Hence, these parameters (speed 770 rpm; feed 0.06 mm/rev) were selected for the condition monitoring study.

Table 2. Effect of number of features

Table 4. Detailed accuracy by class - J48 Decision Tree

Table 8. Detailed accuracy by class - Random Tree

Table 9. Summary of the classification accuracy - Random tree algorithm
The random tree model was also tested with unseen data. The model was trained using the training data, and 68 new data points were considered for the model testing.

Table 10. Confusion matrix - Random tree algorithm

Table 11. Comparative Results