A novel architecture of YOLOv5 network using attention mechanism to improve accuracy-speed trade-off of apple fruit detection

Document Type : Research Paper

Authors

1 Graduate in Electrical Engineering, Dept. of Engineering, Imam Khomeini International University

2 Electrical Engineering Department, Faculty of Engineering and Technology, Imam Khomeini International University, Qazvin, Iran

Abstract

Fruit detection is a challenging task for machine-vision-based harvesting robots because of varying lighting conditions, occlusion, and overlapping fruit. This article aims to improve the accuracy-speed trade-off of apple detection in the vision system of agricultural harvesting robots. Motivated by recent applications of attention modules in object detection, we propose a new YOLOv5 architecture in which the C3 module in the network backbone is replaced by the ECA channel attention module. The ECA module reduces the number of network parameters without significantly affecting detection performance, and it increases speed by 22% compared to the YOLOv5 Nano version, which yields a better trade-off between accuracy and speed. The proposed architecture is evaluated on three datasets, KFuji, MinneApple, and ACFR, which are used in the training and testing phases; when the training and testing datasets differ, transfer learning is used to improve the test results. When the training and test data come from the same dataset, the proposed architecture gives a relative improvement of 21.2% in the trade-off compared to the C3 module. With transfer learning, where the training and test data come from different datasets, a relative improvement of 18% in the trade-off is achieved.
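
To make the channel attention idea concrete, the following is a minimal, framework-free sketch of what an ECA (Efficient Channel Attention) block computes: global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid gate, and channel-wise rescaling. It is an illustration only, not the authors' implementation. The function names `eca_kernel_size` and `eca`, the nested-list tensor layout, and the defaults `gamma=2` and `b=1` (taken from the original ECA-Net formulation) are assumptions; the abstract does not state which kernel size or settings were used.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size: nearest odd value of (log2(C) + b) / gamma.

    gamma and b are the ECA-Net defaults, assumed here for illustration.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca(feature_map, weights):
    """Apply ECA-style channel attention.

    feature_map: list of C channels, each an H x W nested list of floats.
    weights: the k weights of the 1D convolution (k odd); these k numbers
             are the only learnable parameters of the block.
    Returns (rescaled feature map, per-channel gates).
    """
    # 1) Squeeze: global average pooling gives one descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]

    # 2) Local cross-channel interaction: 1D convolution with zero padding.
    k = len(weights)
    pad = k // 2
    c = len(pooled)
    conv = []
    for i in range(c):
        s = 0.0
        for j in range(k):
            idx = i + j - pad
            if 0 <= idx < c:
                s += weights[j] * pooled[idx]
        conv.append(s)

    # 3) Sigmoid turns each response into a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-v)) for v in conv]

    # 4) Rescale every channel by its gate.
    out = [
        [[g * x for x in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return out, gates
```

Because the block learns only the `k` convolution weights, it adds very few parameters, which is consistent with the parameter reduction reported above when it takes the place of a heavier module.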

Keywords