Improvement of the R-FCN's deep network in object detection and annotation

Document Type: Research Paper


1 PhD Student of Computer Engineering and IT, Shahrood University of Technology

2 Faculty of Computer Eng., Shahrood University of Technology


The detection and annotation of objects in images remains one of the major challenges in some machine vision applications, and in recent years deep learning has attracted considerable attention from researchers for this task. This paper first reviews recent deep networks for object detection and analyzes their strengths and weaknesses. It then presents an improved version of the R-FCN network. The proposed method is based on the ResNet architecture and the fully convolutional network. It introduces a new architecture built on a deep region proposal network, together with a combined method based on the binary fuzzy SVM and SVR for the final detection and categorization of objects. In addition, a new loss function, the Cauchy-Schwarz divergence loss, is used; this function showed better performance in terms of speed and accuracy. The proposed ResNet-101 architecture was tested on the SUN dataset for the detection and annotation of 36 objects, and the results indicate improved performance compared to the basic R-FCN network. The proposed method achieves a mean average precision (mAP) of 48.38%, with an average processing time of 0.13 seconds per image. Compared to the best existing method in this area, it is about 2% better in mAP and about 0.04 seconds faster per image.
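
The abstract names a Cauchy-Schwarz divergence loss but does not give its formula. As a rough illustration only, the sketch below uses the standard discrete form of the Cauchy-Schwarz divergence, D_CS(p, q) = -log( <p, q> / (||p|| ||q||) ), applied to a softmax output and a one-hot target. The function names, the `eps` stabilizer, and the three-class example scores are assumptions made for this sketch; the formulation actually used in the paper may differ.

```python
import math


def softmax(scores):
    """Turn raw class scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cauchy_schwarz_divergence(p, q, eps=1e-12):
    """Standard discrete Cauchy-Schwarz divergence between p and q.

    D_CS(p, q) = -log( <p, q> / (||p|| * ||q||) )
    It is symmetric, non-negative, and zero when p and q coincide.
    eps is an assumed stabilizer that keeps the log finite when the
    two distributions do not overlap.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    dot = sum(pi * qi for pi, qi in zip(p, q))
    norm_p = math.sqrt(sum(pi * pi for pi in p))
    norm_q = math.sqrt(sum(qi * qi for qi in q))
    ratio = (dot + eps) / (norm_p * norm_q + eps)
    # Clamp at 1.0 so rounding error cannot produce a negative loss.
    return -math.log(min(ratio, 1.0))


# Hypothetical three-class example: the true class is index 1.
target = [0.0, 1.0, 0.0]
good = softmax([0.1, 4.0, 0.2])  # confident and correct prediction
bad = softmax([4.0, 0.1, 0.2])   # confident but wrong prediction

loss_good = cauchy_schwarz_divergence(good, target)
loss_bad = cauchy_schwarz_divergence(bad, target)
```

In this toy case the correct prediction yields a much smaller loss than the wrong one, which is the behavior a classification loss needs. Unlike the Kullback-Leibler divergence, this measure is symmetric in its two arguments.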