Leveraging Transfer Learning to Identify Food Categories

Kolla, J. V. V.; Vemula, P. Ch.; Chakravarthy, S.; Naidu, B. S.; Patibandla, D.

doi:10.12913/22998624/142738

Article details

Abstract

Automatic recognition of food dishes in photographs has become increasingly important. During the COVID-19 pandemic, fewer people visited restaurants to meet their dietary needs, so many restaurants began offering their services online. This created demand, among the companies that facilitate these services, for better large-scale categorization of food items. Assembling a large dataset of food categories is challenging, which makes it difficult to build a generalized architecture. To address this issue, this paper uses domain-specific transfer learning to build the model from standard architectures such as the VGGNet, ResNet, and EfficientNet families, pre-trained on popular benchmark datasets such as ImageNet and COCO. The similarity between each candidate source dataset and the target dataset is calculated, and the source dataset with the highest similarity is chosen for transfer learning. The proposed solution outperforms some existing works on food categorization.
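The source-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure the paper uses (the bibliography cites the Earth Mover's Distance and the domain-similarity work of Cui et al.), so the sketch below swaps in a much simpler proxy: cosine similarity between the mean feature vectors of two datasets. The function names, the dataset names `source_a` and `source_b`, and the toy feature values are all hypothetical and only serve to show the "score every source, keep the best" logic.

```python
import math


def mean_vector(features):
    """Average a list of equal-length feature vectors component-wise."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors (assumed non-zero here)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pick_source(target_features, source_features_by_name):
    """Score each candidate source dataset against the target and
    return the name of the most similar one together with all scores.

    Assumption: similarity is measured between dataset-level mean
    features, a simplification and not the paper's stated measure.
    """
    target_mean = mean_vector(target_features)
    scores = {
        name: cosine_similarity(mean_vector(features), target_mean)
        for name, features in source_features_by_name.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy, made-up feature vectors standing in for features extracted
# from images of the target (food) dataset and two candidate sources.
target = [[1.0, 0.0, 1.0], [0.8, 0.2, 1.0]]
sources = {
    "source_a": [[1.0, 0.1, 0.9], [0.9, 0.1, 1.1]],
    "source_b": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
}

best, scores = pick_source(target, sources)
print(best, scores)
```

In practice the feature vectors would come from a pre-trained network, and a distribution-aware measure such as the Earth Mover's Distance over class-level features could replace the cosine proxy without changing the selection logic.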

Keywords

convolutional neural networks; transfer learning; domain similarity; fine-tuning

Publisher

Lublin University of Technology
Polish Society of Ecological Engineering (PTIE), Branch of PTIE in Lublin

Journal

Advances in Science and Technology. Research Journal

Year

2021

Volume

Vol. 15, no. 4

Pages

101-109

Physical description

Bibliography: 30 items, figures.

Authors

author

Kolla J. V. V.

kjvvnat@gmail.com

Department of Computer Science, Gitam University, Vishakapatnam, India

author

Vemula P. Ch.

poorna883@gmail.com

Department of Computer Science, Vellore Institute of Technology, Vellore, India

https://orcid.org/0000-0002-4857-8245

author

Chakravarthy S.

Department of Computer Science, Gitam University, Vishakapatnam, India

author

Naidu B. S.

Department of Computer Science, Gitam University, Vishakapatnam, India

author

Patibandla D.

Department of Computer Science, Gitam University, Vishakapatnam, India

Bibliography

1. Cui Y., Song Y., Sun C., Howard A., Belongie S. Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. 2018;4109-4118. DOI: 10.1109/CVPR.2018.00432.
2. Simonyan K., Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition. 2014;1.
3. He K., Zhang X., Ren S., Sun J. Deep Residual Learning for Image Recognition. 2016;770-778. DOI: 10.1109/CVPR.2016.90.
4. Tan M., Le Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks; 2019.
5. Martin C.K., Correa J., Han H., Allen H., Rood J., Champagne C., Gunturk B., Bray G. Validity of the Remote Food Photography Method (RFPM) for Estimating Energy and Nutrient Intake in Near Real‐Time. Obesity. 2012;20.
6. Noronha J., Hysen E., Zhang H., Gajos K.Z. Platemate: crowdsourcing nutritional analysis from food photographs. Proceedings of the 24th annual ACM symposium on User interface software and technology. 2011.
7. Breiman L. Random Forests. Machine Learning. 2001;45:5–32. https://doi.org/10.1023/A:1010933404324
8. Ho T.K. Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition. 1995;1:278-282. DOI: 10.1109/ICDAR.1995.598994.
9. Doersch C., Gupta A., Efros A.A. Mid-Level Visual Element Discovery as Discriminative Mode Seeking. NIPS. 2013.
10. Sun J., Ponce J. Learning Discriminative Part Detectors for Image Classification and Cosegmentation. IEEE International Conference on Computer Vision 2013, 3400-3407. DOI: 10.1109/ICCV.2013.422.
11. Wang X., Wang B., Bai X., Liu W., Tu Z. Max-margin multiple-instance dictionary learning. NIPS. 2013.
12. Li Q., Wu J., Tu Z. Harvesting Mid-level Visual Concepts from Large-Scale Internet Images. IEEE Conference on Computer Vision and Pattern Recognition 2013, 851-858.
13. Endres I., Shih K., Jiaa J., Hoiem D. Learning Collections of Part Models for Object Recognition. CVPR. 2013.
14. Juneja M., Vedaldi A., Jawahar C.V., Zisserman A. Blocks That Shout: Distinctive Parts for Scene Classification. IEEE Conference on Computer Vision and Pattern Recognition 2013, 923-930.
15. Singh S., Gupta A., Efros A.A. Unsupervised Discovery of Mid-Level Discriminative Patches. ECCV, 2012.
16. Yao B., Khosla A., Fei-Fei L. Combining randomization and discrimination for fine-grained image categorization. CVPR. 2011;1577-1584.
17. Deng J., Dong W., Socher R., Li L., Li K., Fei-Fei L. ImageNet: A large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition 2009, 248-255. DOI: 10.1109/CVPR.2009.5206848.
18. Lin T., Maire M., Belongie S.J., Hays J., Perona P., Ramanan D., Dollár P., Zitnick C.L. Microsoft COCO: Common Objects in Context. ECCV. 2014.
19. Razavian A.S., Azizpour H., Sullivan J., Carlsson S. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition. IEEE Conference on Computer Vision and Pattern Recognition Workshops 2014, 512-519. DOI: 10.1109/CVPRW.2014.131.
20. Donahue J., Jia Y., Vinyals O., Hoffman J., Zhang N., Tzeng E., Darrell T. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. ICML. 2014.
21. Zhou B., Lapedriza A., Khosla A., Oliva A., Torralba A. Places: A 10 Million Image Database for Scene Recognition. IEEE transactions on pattern analysis and machine intelligence. 2018;40(6):1452–1464. DOI: 10.1109/TPAMI.2017.2723009.
22. Girshick R., Donahue J., Darrell T., Malik J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition 2014, 580-587. DOI: 10.1109/CVPR.2014.81.
23. Oquab M., Bottou L., Laptev I., Sivic J. Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks. IEEE Conference on Computer Vision and Pattern Recognition 2014, 1717-1724. DOI: 10.1109/CVPR.2014.222.
24. Rachev S.T. The Monge–Kantorovich mass transference problem and its stochastic applications. Theory of Probability & Its Applications. 1985.
25. Rubner Y., Tomasi C., Guibas L. The Earth Mover's Distance as a Metric for Image Retrieval. International Journal of Computer Vision. 2000;40:99-121.
26. Bossard L., Guillaumin M., Van Gool L. Food-101 – Mining Discriminative Components with Random Forests. In: Fleet D., Pajdla T., Schiele B., Tuytelaars T. (eds) Computer Vision – ECCV 2014. Lecture Notes in Computer Science, vol. 8694. Springer, Cham; 2014. DOI: 10.1007/978-3-319-10599-4_29.
27. Kingma D.P., Ba J. Adam: A Method for Stochastic Optimization. 2015.
28. Bengio Y. Deep learning of representations for unsupervised and transfer learning. In: ICML Workshop on Unsupervised and Transfer Learning. 2012;1.
29. Guo Y., Shi H., Kumar A., Grauman K., Simunic T., Feris R. SpotTune: Transfer Learning Through Adaptive Fine-Tuning. IEEE/CVF Conference on Computer Vision and Pattern Recognition 2019, 4800-4809.
30. Liu C., Cao Y., Luo Y., Chen G., Vokkarane V., Ma Y. DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment. 2016.

Notes

Record prepared with funds from MNiSW (the Polish Ministry of Science and Higher Education), agreement no. 461252, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: Popularization of science and promotion of sport (2021).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-92cd3fa0-c7f7-4efa-8764-eaf0d835e385