
Journal of Systems Engineering and Electronics, 2025, Vol. 36, Issue (6): 1389-1397. doi: 10.23919/JSEE.2025.000051
• ELECTRONICS TECHNOLOGY •
Jie YAN1, Yingmei WEI1, Yuxiang XIE1,*, Quanzhi GONG1, Shiwei ZOU1, Xidao LUAN2
Received: 2024-08-26
Accepted: 2024-08-26
Online: 2025-12-18
Published: 2026-01-07
Contact: Yuxiang XIE
E-mail: yanjie@nudt.edu.cn; weiyingmei@nudt.edu.cn; xyx89@163.com; charles_g27@qq.com; 1530531454@qq.com; xidaoluan@ccsu.cn
Cite this article: Jie YAN, Yingmei WEI, Yuxiang XIE, Quanzhi GONG, Shiwei ZOU, Xidao LUAN. The brief self-attention module for lightweight convolution neural networks [J]. Journal of Systems Engineering and Electronics, 2025, 36(6): 1389-1397.
Table 1
Comparison of experimental results on Food-101 with MobileNetV2
| Model setup | Params/×10⁶ | Top-1 accuracy/% | Top-5 accuracy/% |
| Baseline | 2.3 | | |
| +SE | 3.48 | | |
| +ECA | 2.3 | | |
| +CBAM | 4.6 | | |
| +CA | 2.8 | | |
| +BSA | 2.3 | | |
| +BSA&SE | 3.48 | | |
| +BSA&ECA | 2.3 | | |
Table 2
Comparison of experimental results on Caltech-256 with MobileNetV2
| Model setup | Params/×10⁶ | Top-1 accuracy/% | Top-5 accuracy/% |
| Baseline | 2.55 | | |
| +SE | 3.7 | | |
| +ECA | 2.55 | | |
| +CBAM | 4.8 | | |
| +CA | 3.0 | | |
| +BSA | 2.55 | | |
| +BSA&SE | 3.7 | | |
| +BSA&ECA | 2.55 | | |
Table 3
Comparison of experimental results on Mini-ImageNet with MobileNetV2
| Model setup | Params/×10⁶ | Top-1 accuracy/% | Top-5 accuracy/% |
| Baseline | 2.3 | | |
| +SE | 3.48 | | |
| +ECA | 2.3 | | |
| +CBAM | 4.6 | | |
| +CA | 2.8 | | |
| +BSA | 2.3 | | |
| +BSA&SE | 3.48 | | |
| +BSA&ECA | 2.3 | | |
Table 4
Comparison of experimental results of several attention methods with different weight coefficients on Mini-ImageNet
| Model setup | Params/×10⁶ | Top-1 accuracy/% | Top-5 accuracy/% |
| Baseline-1.0 | 2.30 | | |
| +CBAM | 4.60 | | |
| +CA | 2.79 | | |
| +BSA | 2.30 | | |
| +BSA+SE | 3.48 | | |
| Baseline-0.75 | 1.48 | | |
| +CBAM | 2.76 | | |
| +CA | 1.74 | | |
| +BSA | 1.48 | | |
| +BSA+SE | 2.12 | | |
| Baseline-0.5 | 0.82 | | |
| +CBAM | 1.39 | | |
| +CA | 0.94 | | |
| +BSA | 0.82 | | |
| +BSA+SE | 1.10 | | |
Table 5
Comparison of the results on Food-101 between the recent advanced methods and our proposed one
| Model setup | Backbone | Params of the backbone/×10⁶ | Top-1 accuracy/% |
| Inception V3 | Inception V3 | 27.2 | 71.7 |
| SimCLR | ResNet-50 | 25.5 | 72.8 |
| BYOL | ResNet-50 | 25.5 | 75.3 |
| SEER | RegNet-8gf | 42 | 76.2 |
| SwAV | ResNet-50 | 25.5 | 76.4 |
| NNCLR | ResNet-50 | 25.5 | 76.7 |
| NAT-M2 | NAT-M2 | 4.1 | 88.5 |
| Grafit | ResNet-50 | 25.5 | 89.5 |
| EfficientNet-B7 | EfficientNet-B7 | 66 | 93 |
| EfficientNet-B0+BSA+SE | EfficientNet-B0 | 5.3 | 83.4 |
| 1 | HU J, SHEN L, ALBANIE S. Squeeze-and-excitation networks. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 7132–7141. |
| 2 | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module. Proc. of the European Conference on Computer Vision, 2018: 3–19. |
| 3 | FU J, LIU J, TIAN H J, et al. Dual attention network for scene segmentation. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 3146–3154. |
| 4 | ZHANG H, GOODFELLOW I, METAXAS D, et al. Self-attention generative adversarial networks. Proc. of the 36th International Conference on Machine Learning, 2019: 7354–7363. |
| 5 | PARK J, WOO S, LEE J Y, et al. BAM: bottleneck attention module. https://arxiv.org/abs/1807.06514v1. |
| 6 | WANG Q L, WU B G, ZHU P H, et al. ECA-Net: efficient channel attention for deep convolutional neural networks. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 11531–11539. |
| 7 | SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 4510–4520. |
| 8 | ZHOU D Q, HOU Q B, CHEN Y P, et al. Rethinking bottleneck structure for efficient mobile network design. Proc. of the European Conference on Computer Vision, 2020: 680–697. |
| 9 | BOSSARD L, GUILLAUMIN M, VAN GOOL L. Food-101–mining discriminative components with random forests. Proc. of the European Conference on Computer Vision, 2014: 446–461. |
| 10 | GRIFFIN G, HOLUB A, PERONA P. Caltech-256 object category dataset. California Institute of Technology, 2007: 1–20. |
| 11 | VINYALS O, BLUNDELL C, LILLICRAP T, et al. Matching networks for one shot learning. Advances in Neural Information Processing Systems, 2016, 29: 8909022. |
| 12 | IANDOLA F N, HAN S, MOSKEWICZ M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. https://arxiv.org/abs/1602.07360v4. |
| 13 | GHOLAMI A, KWON K, WU B, et al. SqueezeNext: hardware-aware neural network design. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 1638–1647. |
| 14 | ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 6848–6856. |
| 15 | HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2017: 4700–4708. |
| 16 | HOWARD A G, ZHU M, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications. https://arxiv.org/abs/1704.04861. |
| 17 | HOWARD A, SANDLER M, CHU G, et al. Searching for MobileNetV3. Proc. of the IEEE/CVF International Conference on Computer Vision, 2019: 1314–1324. |
| 18 | MEHTA S, RASTEGARI M, CASPI A, et al. ESPNet: efficient spatial pyramid of dilated convolutions for semantic segmentation. Proc. of the European Conference on Computer Vision, 2018: 552–568. |
| 19 | HUANG G, LIU S, VAN DER MAATEN L, et al. CondenseNet: an efficient DenseNet using learned group convolutions. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 2752–2761. |
| 20 | WANG R J, LI X, LING C X. Pelee: a real-time object detection system on mobile devices. Advances in Neural Information Processing Systems, 2018, 31: 1967–1976. |
| 21 | ZOPH B, LE Q V. Neural architecture search with reinforcement learning. https://arxiv.org/abs/1611.01578. |
| 22 | TAN M X, LE Q V. EfficientNet: rethinking model scaling for convolutional neural networks. Proc. of the 36th International Conference on Machine Learning, 2019: 6105–6114. |
| 23 | CAI H, ZHU L G, HAN S. ProxylessNAS: direct neural architecture search on target task and hardware. https://arxiv.org/abs/1812.00332. |
| 24 | WU B C, DAI X L, ZHANG P Z, et al. FBNet: hardware-aware efficient convnet design via differentiable neural architecture search. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 10734–10742. |
| 25 | MA N N, ZHANG X Y, HUANG J W, et al. WeightNet: revisiting the design space of weight networks. Proc. of the European Conference on Computer Vision, 2020: 776–792. |
| 26 | MNIH V, HEESS N, GRAVES A, et al. Recurrent models of visual attention. Advances in Neural Information Processing Systems, 2014: 2204–2212. |
| 27 | HUANG Z L, WANG X G, WEI Y C, et al. CCNet: criss-cross attention for semantic segmentation. Proc. of the IEEE/CVF International Conference on Computer Vision, 2019: 603–612. |
| 28 | LI X, ZHONG Z S, WU J L, et al. Expectation-maximization attention networks for semantic segmentation. Proc. of the IEEE/CVF International Conference on Computer Vision, 2019: 9167–9176. |
| 29 | ROBBINS H, MONRO S. A stochastic approximation method. The Annals of Mathematical Statistics, 1951: 400–407. |
| 30 | DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2009: 248–255. |
| 31 | GALLO I, RIA G, LANDRO N, et al. Image and text fusion for UPMC Food-101 using BERT and CNNs. Proc. of the 35th International Conference on Image and Vision Computing New Zealand, 2020. DOI: 10.1109/IVCNZ51579.2020.9290622. |
| 32 | CHEN T, KORNBLITH S, NOROUZI M, et al. A simple framework for contrastive learning of visual representations. Proc. of the International Conference on Machine Learning, 2020: 1597–1607. |
| 33 | GRILL J B, STRUB F, ALTCHÉ F, et al. Bootstrap your own latent - a new approach to self-supervised learning. Advances in Neural Information Processing Systems, 2020: 21271–21284. |
| 34 | GOYAL P, DUVAL Q, SEESSEL I, et al. Vision models are more robust and fair when pretrained on uncurated images without supervision. https://arxiv.org/abs/2202.08360. |
| 35 | CARON M, MISRA I, MAIRAL J, et al. Unsupervised learning of visual features by contrasting cluster assignments. Advances in Neural Information Processing Systems, 2020: 9912–9924. |
| 36 | DWIBEDI D, AYTAR Y, TOMPSON J, et al. With a little help from my friends: nearest-neighbor contrastive learning of visual representations. Proc. of the IEEE/CVF International Conference on Computer Vision, 2021: 9588–9597. |
| 37 | LU Z, SREEKUMAR G, GOODMAN E, et al. Neural architecture transfer. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2021, 43(9): 2971–2989. doi: 10.1109/TPAMI.2021.3052758 |
| 38 | TOUVRON H, SABLAYROLLES A, DOUZE M, et al. Grafit: learning fine-grained image representations with coarse labels. Proc. of the IEEE/CVF International Conference on Computer Vision, 2021: 874–884. |