Article title

A few-shot fine-grained image recognition method

Full text
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Deep learning methods benefit from datasets with comprehensive coverage (e.g., ImageNet, COCO), which can be regarded as descriptions of the distribution of real-world data. Models trained on such datasets are considered able to extract general features and transfer to downstream domains not seen during training. In open-world settings, however, labeled data for the target dataset are often insufficient, and deep models trained on so few samples generalize poorly. Recognizing new categories, or categories with very few labeled samples, therefore remains a challenging task. This paper proposes a few-shot fine-grained image recognition method. Feature maps are extracted by a CNN module with an embedded attention network that emphasizes discriminative features. A channel-based feature expression is applied to both the base and novel classes, followed by an improved cosine-similarity-based measurement that produces the similarity score used for classification. Experiments on the main few-shot benchmark datasets, including Stanford Dogs and CUB-200, verify the efficiency and generality of our model. The results show that our method achieves superior performance on fine-grained datasets.
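The pipeline described in the abstract (attention-weighted CNN feature maps, a channel-based feature expression, and an improved cosine-similarity metric between support-class prototypes and queries) can be illustrated in code. The record does not specify the attention module or the exact metric improvement, so the following is a minimal sketch with stand-ins: global average pooling plays the role of the channel-based expression, and a temperature-scaled cosine similarity stands in for the improved metric. All names, shapes, and the temperature value are illustrative assumptions, not the authors' implementation.

# Minimal few-shot episode sketch (PyTorch); NOT the paper's code.
import torch
import torch.nn.functional as F

def channel_expression(feature_maps: torch.Tensor) -> torch.Tensor:
    """Express each image by one value per channel via global average
    pooling over the spatial dimensions: (B, C, H, W) -> (B, C)."""
    return feature_maps.mean(dim=(2, 3))

def cosine_scores(support: torch.Tensor, support_labels: torch.Tensor,
                  query: torch.Tensor, n_way: int,
                  temperature: float = 10.0) -> torch.Tensor:
    """Score queries against per-class support prototypes with a
    temperature-scaled cosine similarity (the scaling is an assumed
    'improvement'; the paper's exact variant is not given here)."""
    prototypes = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_way)])
    query = F.normalize(query, dim=1)            # unit-norm rows
    prototypes = F.normalize(prototypes, dim=1)
    return temperature * query @ prototypes.t()  # (n_query, n_way)

# Example 5-way 1-shot episode with random stand-in CNN feature maps.
n_way, c, h, w = 5, 64, 10, 10
support_maps = torch.randn(n_way, c, h, w)       # one support image per class
query_maps = torch.randn(15, c, h, w)            # 15 query images
labels = torch.arange(n_way)
scores = cosine_scores(channel_expression(support_maps), labels,
                       channel_expression(query_maps), n_way)
predictions = scores.argmax(dim=1)               # predicted class per query

In a real episode the random tensors would be replaced by the attention-weighted feature maps from the backbone, and the temperature (or whatever refinement of the cosine metric is used) would be chosen or learned on the base classes.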
Year
Pages
art. no. e144584
Physical description
Bibliography: 42 items, figures, tables
Authors
author
  • College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
  • College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin 150050, China
author
  • College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
Bibliography
  • [1] J. Howard and S. Ruder, “Universal language model fine-tuning for text classification,” in ACL 2018 – 56th Annual Meeting of the Association for Computational Linguistics, 2018, doi: 10.18653/v1/p18-1031.
  • [2] S. Kornblith, J. Shlens, and Q.V. Le, “Do better ImageNet models transfer better?” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 2661–2671, doi: 10.48550/arXiv.1805.08974.
  • [3] K. Cao, M. Brbic, and J. Leskovec, “Concept learners for few-shot learning,” in ICLR 2021, 2021, doi: 10.48550/arXiv.2007.07375.
  • [4] G. Koch, R. Zemel, and R. Salakhutdinov, “Siamese neural networks for one-shot image recognition,” in ICML Deep Learning Workshop, vol. 2, 2015. [Online]. Available: http://www.cs.toronto.edu/~gkoch.
  • [5] J. Snell, K. Swersky, and R. Zemel, “Prototypical networks for few-shot learning,” in Advances in Neural Information Processing Systems, vol. 30, 2017, doi: 10.48550/arXiv.1703.05175.
  • [6] T. Yu et al., “One-shot imitation from observing humans via domain-adaptive meta-learning,” arXiv preprint, 2018, doi: 10.48550/arXiv.1802.01557.
  • [7] H.S. Behl et al., “Meta-Learning Deep Visual Words for Fast Video Object Segmentation,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, pp. 8484–8491, doi: 10.48550/arXiv.1812.01397.
  • [8] K. Hsu, S. Levine, and C. Finn, “Unsupervised learning via meta-learning,” arXiv preprint, 2018, doi: 10.48550/arXiv.1810.02334.
  • [9] J. Lu et al., “Learning from very few samples: A survey,” arXiv preprint, 2020, doi: 10.48550/arXiv.2009.02653.
  • [10] H. Chen et al., “Sparse spatial transformers for few-shot learning,” arXiv preprint, 2021, doi: 10.48550/arXiv.2109.12932.
  • [11] Z. Peng et al., “Few-shot image recognition with knowledge transfer,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 441–449, doi: 10.1109/ICCV.2019.00053.
  • [12] F. Hao et al., “Collect and select: Semantic alignment metric learning for few-shot learning,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 8460–8469, doi: 10.1109/ICCV.2019.00855.
  • [13] F. Wu et al., “Attentive prototype few-shot learning with capsule network-based embedding,” in European Conference on Computer Vision, 2020, pp. 237–253, doi: 10.1007/978-3-030-58604-1_15.
  • [14] D. Kang et al., “Relational Embedding for Few-Shot Classification,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 8822–8833, doi: 10.48550/arXiv.2108.09666.
  • [15] H. Tang et al., “Learning Attention-Guided Pyramidal Features for Few-shot Fine-grained Recognition,” Pattern Recognit., vol. 130, p. 108792, 2022, doi: 10.1016/j.patcog.2022.108792.
  • [16] S. Tian, H. Tang, and L. Dai, “Coupled Patch Similarity Network for One-Shot Fine-Grained Image Recognition,” in 2021 IEEE International Conference on Image Processing (ICIP), 2021, pp. 2478–2482, doi: 10.1109/ICIP42928.2021.9506685.
  • [17] B. Oreshkin, P. Rodríguez López, and A. Lacoste, “TADAM: Task dependent adaptive metric for improved few-shot learning,” in Advances in Neural Information Processing Systems, vol. 31, 2018, doi: 10.48550/arXiv.1805.10123.
  • [18] P.C. Ng and S. Henikoff, “SIFT: Predicting amino acid changes that affect protein function,” Nucleic Acids Res., vol. 31, no. 13, pp. 3812–3814, 2003, doi: 10.1093/nar/gkg509.
  • [19] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 2005, pp. 886–893, doi: 10.1109/CVPR.2005.177.
  • [20] V. Devisurya, R. Devi Priya, and N. Anitha, “Early detection of major diseases in turmeric plant using improved deep learning algorithm,” Bull. Pol. Acad. Sci. Tech. Sci., vol. 70, no. 2, p. e140689, 2022, doi: 10.24425/bpasts.2022.140689.
  • [21] Z. Jiang et al., “Few-shot classification via adaptive attention,” arXiv preprint, 2020, doi: 10.48550/arXiv.2008.02465.
  • [22] D. Wang et al., “Learning a tree-structured channel-wise refinement network for efficient image deraining,” in 2021 IEEE International Conference on Multimedia and Expo (ICME), 2021, pp. 1–6, doi: 10.1109/ICME51207.2021.9428187.
  • [23] J.S. Lim et al., “Small object detection using context and attention,” in 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), 2021, pp. 181–186, doi: 10.48550/arXiv.1912.06319.
  • [24] X. Wang et al., “Non-local neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7794–7803, doi: 10.1109/CVPR.2018.00813.
  • [25] J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132–7141, doi: 10.48550/arXiv.1709.01507.
  • [26] S. Woo et al., “CBAM: Convolutional block attention module,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 3–19, doi: 10.48550/arXiv.1807.06521.
  • [27] J. Fu et al., “Dual attention network for scene segmentation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 3146–3154, doi: 10.1109/CVPR.2019.00326.
  • [28] H. Tang et al., “BlockMix: meta regularization and self-calibrated inference for metric-based meta-learning,” in Proceedings of the 28th ACM International Conference on Multimedia, 2020, pp. 610–618, doi: 10.1145/3394171.3413884.
  • [29] L. Yang and R. Jin, “Distance metric learning: A comprehensive survey,” Department of Computer Science and Engineering, Michigan State University, vol. 2, 2006.
  • [30] O. Vinyals et al., “Matching networks for one shot learning,” in Advances in Neural Information Processing Systems, vol. 29, 2016, doi: 10.48550/arXiv.1606.04080.
  • [31] C. Zhang, Y. Cai, G. Lin, and C. Shen, “DeepEMD: Differentiable Earth Mover’s Distance for Few-Shot Learning,” IEEE Trans. Pattern Anal. Mach. Intell., 2022, doi: 10.1109/TPAMI.2022.3217373.
  • [32] H. Huang et al., “Local descriptor-based multi-prototype network for few-shot Learning,” Pattern Recognit., vol. 116, p. 107935, 2021, doi: 10.1016/j.patcog.2021.107935.
  • [33] W. Li et al., “Revisiting local descriptor based image-to-class measure for few-shot learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 7260–7268, doi: 10.48550/arXiv.1903.12290.
  • [34] C. Wah et al., “The Caltech-UCSD Birds-200-2011 Dataset.” [Online]. Available: http://www.vision.caltech.edu/visipedia/CUB-200.html.
  • [35] A. Khosla et al., “Novel dataset for fine-grained image categorization: Stanford dogs,” in Proc. CVPR Workshop on Fine-Grained Visual Categorization (FGVC), 2011, vol. 2, no. 1.
  • [36] J. Krause, M. Stark, J. Deng, and L. Fei-Fei, “3D object representations for fine-grained categorization,” in Proceedings of the IEEE International Conference on Computer Vision Workshops, 2013, pp. 554–561, doi: 10.1109/ICCVW.2013.77.
  • [37] C. Simon, P. Koniusz, R. Nock, and M. Harandi, “Adaptive Subspaces for Few-Shot Learning,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), USA, 2020, pp. 4135–4144, doi: 10.1109/CVPR42600.2020.00419.
  • [38] W. Xu et al., “Attentional constellation nets for few-shot learning,” in International Conference on Learning Representations, 2021. [Online]. Available: https://par.nsf.gov/servlets/purl/10278170.
  • [39] Z. Chen et al., “Pareto self-supervised training for few-shot learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 13663–13672, doi: 10.1109/CVPR46437.2021.01345.
  • [40] D. Wertheimer, L. Tang, and B. Hariharan, “Few-shot classification with feature map reconstruction networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 8012–8021, doi: 10.1109/CVPR46437.2021.00792.
  • [41] Z. Zhou et al., “Binocular mutual learning for improving few-shot classification,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 8402–8411, doi: 10.1109/ICCV48922.2021.00829.
  • [42] J. Xie et al., “Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 7972–7981, doi: 10.48550/arXiv.2204.04567.
Notes
Record created with funds from the Ministry of Education and Science (MEiN), agreement no. SONP/SP/546092/2022, under the "Social Responsibility of Science" programme, module: Popularisation of Science and Promotion of Sport (2022-2023).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-b0e03282-2738-420b-a137-ad5bac6d5e54