Voltage-Based Physical Layer Fault Diagnosis for Controller Area Network

Controller Area Network (CAN) is the most prevalent communication protocol used in the automotive industry. This in-vehicle network provides a means of communication between Electronic Control Units (ECUs) and components within the vehicle. The recent rapid development of connected, electric, and autonomous vehicles increases the complexity and volume of information exchanged over CAN and demands greater network reliability. Efficient system-level diagnosis functions need to be integrated into the network to ensure reliability and ease troubleshooting.
This paper presents a method to identify physical CAN faults such as loss of electrical connections and shorted wires. Fault signatures of predefined physical CAN faults are used to detect and identify the failure modes. The method can identify both permanent and intermittent faults caused by, for instance, damaged connectors and vibrations, respectively. 
Diagnosis tasks are implemented on an in-vehicle module by measuring and processing the physical layer voltages of all CAN buses. A real-time data buffer of a predefined size is used to calculate health indicators from the physical layer CAN voltages. The health indicators are then compared to predefined thresholds to determine the presence and type of the fault. Compared to ground truth data, the results show that the presented method can identify physical CAN faults, including open electrical connections and shorted wires, with high accuracy.


INTRODUCTION
Controller Area Network (CAN) communication networks are widely used for in-vehicle communications (ISO, 1993). Different types of physical faults can occur in CAN communication networks, such as single or dual CAN wire open and CAN wire short (i.e., a short between CAN Hi and CAN Lo, or a short to power or ground). The correct detection and diagnosis of CAN bus faults is critical for the successful control of the vehicle. Different approaches have been developed for CAN bus fault detection and diagnosis (for example, see Xiao and Lei (2013), Mary, Alex, and Jenkins (2013), Asaduzzaman, Bhowmick, and Moniruzzaman (2014), Farsi, Ratcliff, and Barbosa (1999), Robertson (2014), Hu and Qin (2011), Kelkar and Kamal (2014), Lei, Yuan, and Zhao (2014), Wheeler, Timucin, Twombly, Goebel, and Wysocki (2007), Furse, Smith, Safavi, and Lo (2005), Furse, Chung, Lo, and Pendalaya (2006)).
One of the main CAN bus fault detection and diagnostic approaches is based on CAN message monitoring, such as the typical method called signal supervision that is used in several production vehicles (Furse, Smith, Safavi, and Lo, 2005). To detect a communication fault, signal supervision is usually applied at the receiver side. For example, suppose a signal A is sent out periodically by some sender ECU X every T time units. If the receiver ECU Y does not receive an updated signal A for N*T time units, then ECU Y can declare a loss of signal A from ECU X, in which case a Diagnostic Trouble Code (DTC) U-code could be set by the receiver Y to indicate the loss of communication with the sender ECU X. Here, N is a calibratable number (normally N=2.5, but it can be set higher for robustness) that is used to exclude transient faults caused by communication jitter or random noise.
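The receiver-side timeout check described above can be illustrated with a minimal sketch. The class name SignalSupervisor and its methods are hypothetical helpers introduced only for illustration, and the 100 ms period is an assumed example value; only the N*T rule and the default N=2.5 come from the text.

```python
from dataclasses import dataclass


@dataclass
class SignalSupervisor:
    """Receiver-side timeout monitor for one periodic signal (hypothetical helper)."""

    period_s: float          # T: nominal transmit period of the signal
    n_factor: float = 2.5    # N: calibratable multiplier from the text
    last_rx_s: float = 0.0   # timestamp of the most recent reception

    def on_receive(self, now_s: float) -> None:
        """Record that an updated copy of the signal arrived at time now_s."""
        self.last_rx_s = now_s

    def is_lost(self, now_s: float) -> bool:
        """Return True once no update has arrived for at least N*T seconds."""
        return (now_s - self.last_rx_s) >= self.n_factor * self.period_s


# Assumed example: signal A is sent every 100 ms, so N*T = 0.25 s.
supervisor = SignalSupervisor(period_s=0.1)
supervisor.on_receive(now_s=1.0)
print(supervisor.is_lost(now_s=1.2))  # about 0.2 s of silence, shorter than N*T
print(supervisor.is_lost(now_s=1.3))  # about 0.3 s of silence, longer than N*T
```

In a real ECU the receiver would set the corresponding DTC U-code when is_lost first becomes true; that step is omitted here because the DTC handling interface is not described in this paper.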
The above DTC U-codes have some limitations. First, a DTC U-code only indicates the loss of communication between ECUs that are designed to exchange messages directly. It cannot tell where the fault is, nor what type of physical fault it is. Second, for intermittent faults, when a fault occurs, the corresponding DTC U-code is set with the status "current". After the fault disappears and the system recovers, the status of the DTC U-code changes to "history". However, current DTC U-codes carry no timing information to indicate when the fault occurred and when the system recovered. Third, a fault in the communication network may result in multiple DTC U-codes being set by multiple ECUs at separate times, with different ECUs pointing to communication faults at different ECUs. Fourth, for the dual-wire high-speed CAN bus, a single wire open fault (either CAN Hi or CAN Lo) that separates the network into two disconnected segments may result in a DTC U-code for a sender ECU that is on the same segment as the receiver ECU, as described in Furse et al. (2006).
To overcome the above limitations of DTC U-codes, an integrated software-based approach was developed in Furse et al. (2006) based on both message monitoring and system topology for the detection and localization of CAN communication faults. However, due to the limitations of message monitoring, the approach developed in Furse et al. (2006) cannot isolate different fault types such as CAN Hi/Lo single wire open, shorted to power/ground, or wire short between CAN Hi and Lo. This is because those faults produce the same symptoms under message monitoring and are therefore indistinguishable by message monitoring alone.
Besides the message monitoring approaches, there are also physical signal measurement-based approaches that use voltage, current, and/or bit-time measurements (see Farsi et al. (1999), Robertson (2014), Hu and Qin (2011), Kelkar and Kamal (2014), Lei et al. (2014), Wheeler et al. (2007)). Some of those approaches monitor the bus passively, while others actively send out inquiries. Although those approaches have had varying success in different cases, none of them provides a cost-effective and accurate method for the diagnosis of different CAN bus wire faults.
In this paper, a new voltage-based approach for the diagnosis of CAN communication faults is presented. The main idea is to monitor both CAN Hi and Lo voltages, and diagnose bus wire faults based on the voltage patterns. The proposed approach has been implemented in an on-board ECU and tested using actual vehicle data. The test results show that the approach can successfully detect and diagnose different CAN bus wire faults.

PHYSICAL CAN BUS WIRE
The physical layer characteristics for the CAN bus are specified in ISO-11898-2. The physical layer of a high-speed CAN bus consists of a pair of wires, terminators at both ends, and ECUs equipped with CAN transceivers to receive and transmit messages. The schematic of a typical high-speed CAN bus is shown in Figure 1, where the bus has n ECUs and split termination. Parallel wires with a nominal impedance of 120 Ω (95 Ω minimum and 140 Ω maximum) are normally used. A maximum length of 40 meters is specified for CAN at a data rate of 1 Mbit/s (ISO, 1993). At lower data rates, longer wires are possible as well. The two signal lines of the bus are called CAN Hi and CAN Lo. In the recessive state, both bus voltages are nominally 2.5 V. The dominant state on the bus typically drives CAN Hi up to 3.5 V and CAN Lo down to 1.5 V, creating a 2 V differential signal. Figure 2 shows the normal CAN voltage profile sampled every 10 msec. The bus terminations are placed at each end of the bus. Each terminator includes two resistors of approximately 60 Ω each and a coupling capacitor that couples high-frequency noise to a solid ground potential.
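The nominal levels above can be illustrated with a short sketch that computes the differential voltage and labels a sample as recessive or dominant. The function names are hypothetical, and the 0.5 V and 0.9 V decision levels are assumptions based on commonly cited ISO 11898-2 receiver thresholds; they are not values used by the method in this paper.

```python
def differential_voltage(can_hi_v: float, can_lo_v: float) -> float:
    """Differential bus voltage: CAN Hi minus CAN Lo, in volts."""
    return can_hi_v - can_lo_v


def bus_state(can_hi_v: float, can_lo_v: float,
              dominant_min_v: float = 0.9,    # assumed receiver threshold
              recessive_max_v: float = 0.5    # assumed receiver threshold
              ) -> str:
    """Label one voltage pair as 'dominant', 'recessive', or 'indeterminate'."""
    diff = differential_voltage(can_hi_v, can_lo_v)
    if diff >= dominant_min_v:
        return "dominant"
    if diff <= recessive_max_v:
        return "recessive"
    return "indeterminate"


print(bus_state(2.5, 2.5))  # nominal recessive levels, 0 V differential
print(bus_state(3.5, 1.5))  # nominal dominant levels, 2 V differential
```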
With the recent expansion of CAN use for vehicular communication between devices, CAN bus bandwidth usage has increased, which has led to the migration of the network to CAN with flexible data rate (CAN FD). A classical CAN network allows only 8 data bytes per frame at a data transfer speed of up to 1 Mbit/s. CAN FD can support 5 Mbit/s with up to 64 data bytes in a single frame (Zago and Freitas (2018)). (Jiang, Du, and Wienckowski (2015), Jiang, Du, and Nagose (2015)).

CAN BUS WIRE FAULTS
CAN Hi and CAN Lo voltages are measured and processed on-board using the Central Gateway Module as the monitoring ECU. The voltage range of the measurements is between 0 V (lower limit) and 5 V (upper limit). If the measured CAN bus voltage is greater than the upper limit, a ceiling value of 5 V is applied by the monitoring ECU. The measurements are taken at a sampling rate of 10 Hz and stored in buffers of 250 samples each for processing. Once the buffer is filled, health indicators are calculated to determine whether a physical layer fault is present on the CAN bus.
CAN Hi and CAN Lo voltages in the dominant state are slightly higher than 3.5 V and lower than 1.5 V, respectively. However, the overall measured voltage profile is similar to that without faults. As a result, identification of this type of fault is challenging from voltage measurements alone.
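The acquisition step described at the start of this section (ceiling at 5 V, 10 Hz sampling, 250-sample buffers) can be sketched as follows. The class VoltageBuffer and the function clamp_voltage are hypothetical names, and clamping at the 0 V lower limit is an assumption; the text only states that a ceiling is applied at the upper limit. With these values, one buffer spans 250 / 10 = 25 seconds of data.

```python
UPPER_LIMIT_V = 5.0    # measurement ceiling from the text
LOWER_LIMIT_V = 0.0    # lower limit from the text (clamping here is assumed)
SAMPLE_RATE_HZ = 10    # sampling rate from the text
BUFFER_SIZE = 250      # samples per buffer from the text


def clamp_voltage(voltage_v: float) -> float:
    """Limit a raw reading to the measurable range of the monitoring ECU."""
    return min(max(voltage_v, LOWER_LIMIT_V), UPPER_LIMIT_V)


class VoltageBuffer:
    """Fixed-size store of (CAN Hi, CAN Lo) voltage pairs (hypothetical helper)."""

    def __init__(self, size: int = BUFFER_SIZE) -> None:
        self.size = size
        self.samples: list[tuple[float, float]] = []

    def add(self, can_hi_v: float, can_lo_v: float) -> bool:
        """Store one clamped pair; return True when the buffer is full."""
        self.samples.append((clamp_voltage(can_hi_v), clamp_voltage(can_lo_v)))
        return len(self.samples) >= self.size


buffer_duration_s = BUFFER_SIZE / SAMPLE_RATE_HZ
print(buffer_duration_s)  # 25.0 seconds of data per buffer
```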

VOLTAGE-BASED PHYSICAL CAN WIRE FAULT DIAGNOSTICS APPROACH
This section presents the proposed approach for diagnosing physical CAN wire faults using voltage measurements. Voltage data are collected continuously, even while earlier data are being processed. Therefore, the approach requires two sets of buffers: one for the data being processed and one for the data being collected. The flowchart of the presented method is shown in Figure 11. A sequential evaluation style is selected to isolate the fault state because multiple fault state conditions can be met at the same time; priority is therefore assigned to the state estimated with the highest confidence. The sequential style also saves memory, since the remaining fault states do not need to be evaluated once the highest-priority fault state is determined.
Figure 11. CAN fault state determination logic.
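The two-buffer arrangement described above can be sketched as a simple swap scheme: samples are appended to a collection buffer, and when it fills, it is handed over for processing while a fresh buffer takes over collection. The class DoubleBuffer is a hypothetical illustration, not the on-board implementation, which would more likely use statically allocated arrays.

```python
from typing import Optional


class DoubleBuffer:
    """Collect samples in one buffer while the previous one is processed."""

    def __init__(self, size: int) -> None:
        self.size = size
        self._collecting: list = []   # buffer currently being filled
        self._processing: list = []   # last full buffer, available for analysis

    def add_sample(self, sample) -> Optional[list]:
        """Append a sample; return the full buffer when a swap occurs, else None."""
        self._collecting.append(sample)
        if len(self._collecting) < self.size:
            return None
        # Swap roles: the full buffer goes to processing, collection restarts.
        self._processing, self._collecting = self._collecting, []
        return self._processing


double_buffer = DoubleBuffer(size=3)
for index in range(7):
    full = double_buffer.add_sample(index)
    if full is not None:
        print("process", full)
```

Collection is never blocked by processing in this arrangement, provided that processing one buffer finishes before the next one fills.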

CAN Hi Shorted to Ground
The algorithm is triggered a short time (e.g., 5 seconds) after the vehicle is powered on, to allow ECU communications to stabilize. The presented logic is described in detail as follows.
For each CAN voltage pair within the buffer, perform the following: A. Count the number of times CAN Lo is greater than the recessive voltage level by a predefined threshold (k_single_open_threshold).
After the counters have been calculated for each buffer, the fault state is estimated based on the logic in Figure 11. The decision is then made according to the state priority shown in Figure 11: the highest-priority state is reported, and the remaining states do not need to be evaluated. If none of the fault states is detected, the logic reports the CAN bus state as healthy.
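Step A above can be sketched as a single pass over the buffer. The function name count_can_lo_above_recessive is hypothetical; the 2.5 V recessive level and the 324 mV default for k_single_open_threshold are taken from this paper. How the resulting counter feeds into the decision logic of Figure 11 is not reproduced here.

```python
RECESSIVE_V = 2.5                    # nominal recessive level from the text
K_SINGLE_OPEN_THRESHOLD_V = 0.324    # k_single_open_threshold default (324 mV)


def count_can_lo_above_recessive(
    buffer: list[tuple[float, float]],
    recessive_v: float = RECESSIVE_V,
    threshold_v: float = K_SINGLE_OPEN_THRESHOLD_V,
) -> int:
    """Count pairs whose CAN Lo exceeds the recessive level by more than the threshold."""
    limit_v = recessive_v + threshold_v
    return sum(1 for _can_hi_v, can_lo_v in buffer if can_lo_v > limit_v)


# Illustrative (CAN Hi, CAN Lo) pairs in volts; not measured data.
example_buffer = [(2.5, 2.5), (3.5, 1.5), (3.5, 2.9), (2.5, 2.8), (3.6, 3.0)]
print(count_can_lo_above_recessive(example_buffer))  # 2 (the 2.9 V and 3.0 V samples)
```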
The CAN physical layer fault diagnostic algorithm was calibrated using CAN bus voltage data from several production vehicles. The data were collected under normal as well as fault-injected conditions. The default calibration values are listed in the following table.

Calibration                      Value   Unit
k_single_open_threshold          324     mV
k_short_ground_threshold         646     mV
k_short_power_threshold          4500    mV
k_double_open_delta_threshold    2100    mV
k_dom_threshold                  1474    mV
k_imt_short_percentage           5       Percentage
k_short_percentage               90      Percentage
k_open_count_threshold           2       Counter
k_dual_open_count_threshold      7       Counter
buffer_size                      250     Counter

Table 2. Validation results of the proposed method

CONCLUSIONS
This paper presents a method to identify CAN physical fault types. The method relies on raw CAN voltage measurements to classify the CAN voltage profiles. Performance of the method was evaluated using actual vehicle data. The results showed that the proposed approach can identify physical CAN fault types with high accuracy. The method was calibrated and validated using over 8000 sets of vehicle data. It is recommended to expand this approach to cover other types of physical CAN faults and to investigate the potential use of the proposed method for other communication protocols such as Ethernet.