A NanoDet Model with Adaptively Weighted Loss for Real-time Railroad Inspection
Abstract
Monitoring railroad components is crucial to maintaining the safety of railway operations. This article proposes a novel, compact computer vision system that runs on edge devices and is designed to provide accurate, real-time assessments of rail tracks. The model reconfigures the teacher-student training scheme inherent in NanoDet by incorporating a novel adaptively weighted loss (AWL) during training. The AWL evaluates the quality of the teacher and student models, sets the weight of the student's loss accordingly, and dynamically balances their loss contributions, steering the training process toward effective knowledge transfer and guidance. Compared with state-of-the-art models, our AWL-NanoDet has a compact model size of less than 2 MB and a computational cost of 1.52 GFLOPs, delivering a processing time of less than 14 ms per frame (evaluated on NVIDIA's AGX Orin). Compared to the original NanoDet, it also improves accuracy by nearly 6.2%, enabling highly accurate, real-time recognition of rail track components.
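To make the adaptive weighting idea concrete, the following is a minimal PyTorch sketch of one way such a loss could balance the student's own detection loss against a distillation loss. The weighting rule (a ratio of detached teacher and student losses) and all names here are assumptions for illustration only, not the paper's exact formulation.

```python
import torch

def adaptively_weighted_loss(student_loss: torch.Tensor,
                             teacher_loss: torch.Tensor,
                             distill_loss: torch.Tensor,
                             eps: float = 1e-8) -> torch.Tensor:
    """Hypothetical sketch: blend the student's supervised loss with a
    distillation loss, weighted by the relative quality of the two models.

    A relatively low teacher loss (a strong teacher) shifts emphasis toward
    the distillation term; a relatively weak teacher shifts emphasis back to
    the student's own supervised loss.
    """
    # Detach so the weight acts as a scalar coefficient, not a gradient path.
    t = teacher_loss.detach()
    s = student_loss.detach()
    w_student = t / (t + s + eps)  # grows as the teacher gets worse
    return w_student * student_loss + (1.0 - w_student) * distill_loss


# Toy usage with scalar loss values:
student_loss = torch.tensor(0.9)
teacher_loss = torch.tensor(0.3)
distill_loss = torch.tensor(0.5)
total = adaptively_weighted_loss(student_loss, teacher_loss, distill_loss)
print(total)  # weighted toward the distillation term, since the teacher is stronger
```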
Keywords: Deep Learning, Rail Inspection, Computer Vision, Edge Computing