Journal of Systems Engineering and Electronics, 2019, Vol. 30, Issue (1): 1-12. doi: 10.21629/JSEE.2019.01.01
• Electronics Technology •
Baojun ZHAO1,2, Boya ZHAO1,2, Linbo TANG1,2,*, Wenzheng WANG1,2, Chen WU1,2
Received: 2018-05-08
Online: 2019-02-27
Published: 2019-02-26
Contact: Linbo TANG
E-mail: zbj@bit.edu.cn; zhaoboya@bit.edu.cn; tanglinbo@bit.edu.cn; wwz@bit.edu.cn; wuchen@gmail.com
About author:
ZHAO Baojun was born in 1960. He received his Ph.D. degree in electromagnetic measurement technology and equipment from Harbin Institute of Technology (HIT), Harbin, China, in 1996. From 1996 to 1998, he was a postdoctoral fellow at Beijing Institute of Technology (BIT), Beijing, China. Since 1998, he has been engaged in teaching and research at the Radar Research Laboratory, BIT. His main research interests include image/video coding, image recognition, infrared/laser signal processing, and parallel signal processing.
Baojun ZHAO, Boya ZHAO, Linbo TANG, Wenzheng WANG, Chen WU. Multi-scale object detection by top-down and bottom-up feature pyramid network[J]. Journal of Systems Engineering and Electronics, 2019, 30(1): 1-12.
Table 1
Anchor details for each feature map (six anchors per location; heights and widths are listed in matching order)
Feature | Height | Width | Number |
Conv3_3 | 0.0354, 0.0289, 0.0707, 0.0866, 0.0500, 0.0707 | 0.0707, 0.0866, 0.0354, 0.0289, 0.0500, 0.0707 | 33750 |
Conv4_3 | 0.0707, 0.0577, 0.1414, 0.1732, 0.1000, 0.1414 | 0.1414, 0.1732, 0.0707, 0.0577, 0.1000, 0.1414 | 8864 |
Conv6_2 | 0.1414, 0.1155, 0.2828, 0.3464, 0.2000, 0.2784 | 0.2828, 0.3464, 0.1414, 0.1155, 0.2000, 0.2784 | 2166 |
Conv7_2 | 0.2740, 0.2237, 0.5480, 0.6712, 0.3875, 0.4720 | 0.5480, 0.6712, 0.2740, 0.2237, 0.3875, 0.4720 | 600 |
Conv8_2 | 0.4066, 0.3320, 0.8132, 0.9959, 0.5750, 0.6621 | 0.8132, 0.9959, 0.4066, 0.3320, 0.5750, 0.6621 | 150 |
Conv9_2 | 0.5392, 0.4402, 1.0783, 1.3207, 0.7625, 0.8511 | 1.0783, 1.3207, 0.5392, 0.4402, 0.7625, 0.8511 | 54 |
Conv10_2 | 0.6718, 0.5485, 1.3435, 1.6454, 0.9500, 1.0395 | 1.3435, 1.6454, 0.6718, 0.5485, 0.9500, 1.0395 | 6 |
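The anchor sizes in Table 1 appear consistent with the SSD-style rule in which an anchor of scale s and aspect ratio a has height s/sqrt(a) and width s*sqrt(a), plus one extra square anchor of side sqrt(s * s_next). The following is a minimal sketch under that assumption; the function name, the ratio set, and the choice of scales 0.1 (Conv4_3) and 0.2 (Conv6_2) are inferred from the table values and are not stated on this page.

```python
import math

def ssd_style_anchors(scale, next_scale, ratios=(2.0, 3.0, 1 / 2, 1 / 3, 1.0)):
    """Return (height, width) pairs, normalized to the image size.

    Assumption: SSD-style anchors, h = s / sqrt(a) and w = s * sqrt(a),
    followed by one extra square anchor of side sqrt(s * s_next).
    """
    anchors = [(scale / math.sqrt(a), scale * math.sqrt(a)) for a in ratios]
    extra = math.sqrt(scale * next_scale)  # extra square anchor
    anchors.append((extra, extra))
    return anchors

# Scales assumed from the table: 0.1 for Conv4_3, 0.2 for the next feature map.
conv4_3 = ssd_style_anchors(0.1, 0.2)
for h, w in conv4_3:
    print(f"h={h:.4f}  w={w:.4f}")
```

With these assumed scales, the six printed pairs round to the Conv4_3 heights and widths given in the table.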
Table 2
Details of confidence and location kernels
Feature | Confidence kernel | Location kernel %number |
Conv3_3 | ||
Conv4_3 | ||
Conv6_2 | ||
Conv7_2 | ||
Conv8_2 | ||
Conv9_2 | ||
Conv10_2 |
Table 3
Detection results for each class (%)
Class | Faster R-CNN | ION | RFCN | SSD 300 | MR-CNN | TDBU-FPN |
Aeroplane | 76.5 | 79.2 | 79.0 | 81.0 | 80.3 | 82.6 |
Bicycle | 79.0 | 83.1 | 80.3 | 84.2 | 84.1 | 84.5 |
Bird | 70.9 | 77.6 | 76.6 | 76.7 | 78.5 | 78.6 |
Boat | 65.5 | 65.6 | 67.0 | 72.1 | 70.8 | 75.9 |
Bottle | 52.1 | 54.9 | 63.7 | 51.7 | 68.5 | 61.5 |
Bus | 83.1 | 85.4 | 84.8 | 86.1 | 88.0 | 85.5 |
Car | 84.7 | 85.1 | 85.6 | 86.1 | 85.9 | 86.9 |
Cat | 86.4 | 87.0 | 89.1 | 85.0 | 87.8 | 85.5 |
Chair | 52.0 | 54.4 | 62.2 | 63.0 | 60.3 | 64.1 |
Cow | 81.9 | 80.6 | 85.3 | 82.0 | 85.2 | 81.6 |
Dining_table | 65.7 | 73.8 | 67.9 | 76.9 | 73.7 | 78.1 |
Dog | 84.8 | 85.3 | 87.3 | 85.5 | 87.2 | 86.7 |
Horse | 84.6 | 82.2 | 86.6 | 87.3 | 86.5 | 88.5 |
Motorbike | 77.5 | 82.2 | 82.8 | 84.8 | 85.0 | 85.2 |
Person | 76.7 | 74.4 | 79.0 | 78.8 | 76.4 | 50.1 |
Potted_plant | 38.8 | 47.1 | 51.0 | 50.4 | 48.5 | 58.1 |
Sheep | 73.6 | 75.8 | 77.6 | 77.2 | 76.3 | 78.8 |
Sofa | 73.9 | 72.7 | 75.2 | 80.2 | 75.5 | 78.7 |
Tvmonitor | 83.0 | 84.2 | 83.5 | 87.6 | 85.0 | 88.5 |
Train | 72.6 | 80.4 | 76.5 | 76.2 | 81.0 | 77.8 |
mAP | 73.2 | 75.6 | 77.0 | 77.2 | 78.2 | 79.0 |
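The mAP row is conventionally the unweighted mean of the per-class average precision values. As a small illustrative check, assuming that convention, the sketch below averages the Faster R-CNN column of Table 3; the variable and function names are hypothetical.

```python
# Per-class AP (%) for Faster R-CNN, copied from Table 3 in row order.
faster_rcnn_ap = [
    76.5, 79.0, 70.9, 65.5, 52.1, 83.1, 84.7, 86.4, 52.0, 81.9,
    65.7, 84.8, 84.6, 77.5, 76.7, 38.8, 73.6, 73.9, 83.0, 72.6,
]

def mean_average_precision(per_class_ap):
    """Unweighted mean of per-class AP values (assumed mAP convention)."""
    return sum(per_class_ap) / len(per_class_ap)

print(f"mAP = {mean_average_precision(faster_rcnn_ap):.3f}")
```

The result is about 73.2, in line with the mAP reported for Faster R-CNN in the table.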