Systematic analysis and review of video object retrieval techniques

Ghuge, C. A.; Prakash, V. Chandra; Ruikar, Sachin D.

Article - details

Article title

Systematic analysis and review of video object retrieval techniques

Authors

Ghuge C. A., Prakash V. Chandra, Ruikar Sachin D.

Abstract

Video object retrieval is a promising research direction that has developed in recent years. Current video object retrieval strategies are used for visualizing, digitizing, modeling, and retrieving objects, especially in graphics and architectural design. Research to date has led to the design of proficient video object retrieval techniques. Yet, although a number of algorithms have been devised for tracking objects, problems persist in enhancing performance, for instance with regard to non-rigid objects. In this review article we provide a detailed survey of 50 research papers presenting video object retrieval methodologies based on approaches such as deep learning, graph-based, query-based, feature-based, fuzzy-based, machine learning-based, and distance metric learning-based techniques, among others. Moreover, analysis and discussion are presented concerning the year of publication, employed methodology, evaluation metrics, accuracy range, adopted framework, datasets utilized, and implementation tool. Finally, the research gaps and issues related to the various proposed video object retrieval schemes are presented to guide researchers towards improved contributions to video object retrieval methods.

Keywords

video object retrieval; computer vision; deep learning; fuzzy-based techniques; machine learning; query-based techniques; graph-based techniques

Publisher

Systems Research Institute, Polish Academy of Sciences

Journal

Control and Cybernetics

Year

2020

Volume

Vol. 49, No. 4

Pages

471-498

Physical description

Bibliography: 99 items, figures, tables.

Contributors

author

Ghuge C. A.

caghugeklu@gmail.com

Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur district, AP, India

author

Prakash V. Chandra

Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur district, AP, India

author

Ruikar Sachin D.

Department of Electronics Engineering, Walchand College of Engineering, Sangli, Maharashtra, India

Bibliography

Ahmed, S.A., Dogra, D.P., Kar, S. and Roy, P.P. (2019) Trajectory-based surveillance analysis: A survey. IEEE Transactions on Circuits and Systems for Video Technology, 29(7): 1985-1997.
Anjulan, A. and Canagarajah, N. (2007) Object-based video retrieval with local region tracking. Signal Processing: Image Communication, 22(7-8): 607-621.
Arandjelović, R. and Zisserman, A. (2011) Smooth object retrieval using a bag of boundaries. In: Proceedings of International Conference on Computer Vision, IEEE, 375-382.
Araujo, A. and Girod, B. (2018) Large-scale video retrieval using image queries. IEEE Transactions on Circuits and Systems for Video Technology, 28(6): 1406-1420.
Ardizzone, E. and La Cascia, M. (1997) Automatic video database indexing and retrieval. Multimedia Tools and Applications, 4: 29-56.
Arman, F., Depommier, R., Hsu, A. and Chiu, M.Y. (1994) Content-based browsing of video sequences. In: Proceedings of the Second ACM International Conference on Multimedia, 97-103.
Arroyo, R., Yebes, J.J., Bergasa, L.M., Daza, I.G. and Almazán, J. (2015) Expert video-surveillance system for real-time detection of suspicious behaviors in shopping malls. Expert Systems with Applications, 42(21): 7991-8005.
Aslandogan, Y.A. and Yu, C.T. (1999) Techniques and systems for image and video retrieval. IEEE Transactions on Knowledge and Data Engineering, 11(1): 56-63.
Bency, A.J., Karthikeyan, S., De Leo, C., Sunderrajan, S. and Manjunath, B.S. (2017) Search tracker: human-derived object tracking in the wild through large-scale search and retrieval. IEEE Transactions on Circuits and Systems for Video Technology, 27(8): 1803-1814.
Birkas, D., Birkas, K. and Popa, T. (2016) A mobile system for scene monitoring and object retrieval. In: Proceedings of the 29th International Conference on Computer Animation and Social Agents, ACM, 83-88.
Broilo, M., Piotto, N., Boato, G., Conci, N. and De Natale, F.G. (2010) Object trajectory analysis in video indexing and retrieval application. In: Video Search and Mining, Springer, Berlin, Heidelberg, 3-32.
Cai, Z., Liang, Y., Hu, H. and Luo, W. (2015) Offline video object retrieval method based on color features. In: Proceedings of International Symposium on Computational Intelligence and Intelligent Systems, Springer, 495-505.
Cao, X., Wang, F., Zhang, B., Fu, H. and Li, C. (2016) Unsupervised pixel-level video foreground object segmentation via shortest path algorithm. Neurocomputing, 172, 235-243.
Castanon, G., Elgharib, M., Saligrama, V. and Jodoin, P.M. (2016) Retrieval in long-surveillance videos using user-described motion and object attributes. IEEE Transactions on Circuits and Systems for Video Technology, 26(12): 2313-2327.
Chen, Y., Li, X., Dick, A. and Hill, R. (2014) Ranking consistency for image matching and object retrieval. Pattern Recognition, 47(3): 1349-1360.
Cheng, H.Y. and Hwang, J.N. (2011) Integrated video object tracking with applications in trajectory-based event detection. Journal of Visual Communication and Image Representation, 22(7): 673-685.
Chou, C.L., Chen, H.T. and Lee, S.Y. (2015) Pattern-based near-duplicate video retrieval and localization on web-scale videos. IEEE Transactions on Multimedia, 17(3): 382-395.
Chuang, C.H., Cheng, S.C., Chang, C.C. and Chen, P.P. (2014) Model based approach to spatial-temporal sampling of video clips for video object detection by classification. Journal of Visual Communication and Image Representation, 25(5): 1018-1030.
Czúni, L. and Rashad, M. (2017) The use of IMUs for video object retrieval in lightweight devices. Journal of Visual Communication and Image Representation, 48, 30-42.
Czúni, L. and Rashad, M. (2018) Lightweight Active Object Retrieval with Weak Classifiers. Sensors, 18(3): 801.
DeMenthon, D. and Doermann, D. (2003) Video retrieval using spatiotemporal descriptors. In: Proceedings of the Eleventh ACM International Conference on Multimedia, 508-517.
Dimitrova, N. and Golshani, F. (1995) Motion recovery for video content classification. ACM Transactions on Information Systems (TOIS), 13(4): 408-439.
Ding, S., Li, G., Li, Y., Li, X., Zhai, Q., Champion, A.C., Zhu, J., Xuan, D. and Zheng, Y.F. (2017) Survsurf: human retrieval on large surveillance video data. Multimedia Tools and Applications, 76(5): 6521-6549.
Eidenberger, H., Breiteneder, C. and Hitz, M. (2002) A framework for visual information retrieval. In: International Conference on Advances in Visual Information Systems, 105-116, Springer.
Fan, C.T., Wang, Y.K. and Huang, C.R. (2017) Heterogeneous information fusion and visualization for a large-scale intelligent video surveillance system. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 47(4): 1-12.
Farag, W.E. and Abdel-Wahab, H. (2003) A human-based technique for measuring video data similarity. In: Proceedings of the Eighth IEEE Symposium on Computers and Communications (ISCC), 769-774.
Fernandez-Beltran, R. and Pla, F. (2015) Incremental probabilistic latent semantic analysis for video retrieval. Image and Vision Computing, 38, 1-12.
Gao, Y., Tang, J., Hong, R., Yan, S., Dai, Q., Zhang, N. and Chua, T.S. (2012) Camera constraint-free view-based 3-D object retrieval. IEEE Transactions on Image Processing, 21(4): 2269-2281.
Ghuge, C.A., Prakash, V.C. and Ruikar, S.D. (2018a) Weighed query specific distance and hybrid NARX neural network for video object retrieval. The Computer Journal, 63(11): 1738-1755.
Ghuge, C.A., Prakash, V.C. and Ruikar, S.D. (2018b) Query-specific distance and hybrid tracking model for video object retrieval. Journal of Intelligent Systems, 27(2): 195-212.
Ghuge, C.A., Prakash, V.C. and Ruikar, S.D. (2018c) Support vector regression and extended nearest neighbor for video object retrieval. Evolutionary Intelligence. DOI: https://doi.org/10.1007/s12065-018-0176-y
Gomez-Conde, I. and Olivieri, D.N. (2015) A KPCA spatio-temporal differential geometric trajectory cloud classifier for recognizing human actions in a CBVR system. Expert Systems with Applications, 42(13): 5472-5490.
Gomez-Romero, J., Patricio, M.A., Garcia, J. and Molina, J.M. (2011) Ontology-based context representation and reasoning for object tracking and scene interpretation in video. Expert Systems with Applications, 38(6): 7494-7510.
Gong, B., Liu, J., Wang, X. and Tang, X. (2013) Learning semantic signatures for 3D object retrieval. IEEE Transactions on Multimedia, 15(2): 369-377.
Gong, J. and Caldas, C.H. (2011) An object recognition, tracking, and contextual reasoning-based video interpretation method for rapid productivity analysis of construction operations. Automation in Construction, 20(8): 1211-1226.
Guo, H., Wang, J. and Lu, H. (2016) Multiple deep features learning for object retrieval in surveillance videos. IET Computer Vision, 10(4): 268-272.
Guo, H., Wang, J., Xu, M., Zha, Z.J. and Lu, H. (2015) Learning multiview deep features for small object retrieval in surveillance scenarios. In: Proceedings of the 23rd ACM International Conference on Multimedia, 859-862.
Hao, Y., Mu, T., Hong, R., Wang, M., An, N. and Goulermas, J.Y. (2017) Stochastic multiview hashing for large-scale near-duplicate video retrieval. IEEE Transactions on Multimedia, 19(1): 1-14.
Haseyama, M., Ogawa, T. and Yagi, N. (2013) A Review of Video Retrieval Based on Image and Video Semantic Understanding. ITE Transactions on Media Technology and Applications (MTA), 1(1).
Hong, C., Li, N., Song, M., Bu, J. and Chen, C. (2011) An efficient approach to content-based object retrieval in videos. Neurocomputing, 74(17): 3565-3575.
Hong, R., Hu, Z., Wang, R., Wang, M. and Tao, D. (2016) Multi-view object retrieval via multi-scale topic models. IEEE Transactions on Image Processing, 25(12): 5814-5827.
Hou, S., Zhou, S. and Siddique, M.A. (2014) A compressed sensing approach for query by example video retrieval. Multimedia Tools and Applications, 72(3): 3031-3044.
Hu, W., Tan, T., Wang, L. and Maybank, S. (2004) A survey on visual surveillance of object motion and behaviours. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 34(3): 334-352.
Hu, W., Xie, N., Li, L., Zeng, X. and Maybank, S. (2011) A survey on visual content-based video indexing and retrieval. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 41(6): 797-819.
Hu, X., Tang, Y. and Zhang, Z. (2008) Video object matching based on SIFT algorithm. In: Proceedings of International Conference on Neural Networks and Signal Processing, IEEE, 412-415.
Inbavalli, M., Sathya, G. and Manjula, R. (2017) A Study of Multimedia Visuals and Information Retrieval and Techniques. International Journal of Scientific & Engineering Research, 8(4): 118-121.
Ji, X. and Liu, H. (2009) Advances in view-invariant human motion analysis: a review. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 50, 456-465.
Jin, R. and Kim, J. (2017) Tracking feature extraction techniques with improved SIFT for video identification. Multimedia Tools and Applications, 76(4): 5927-5936.
Joy, E. and Peter, J.D. (2018) Visual tracking with conditionally adaptive multiple template update scheme for intricate videos. Multimedia Systems, 24(2): 175-194.
Kanagamalliga, S. and Vasuki, S. (2018) Contour-based object tracking in video scenes through optical flow and Gabor features. Optik, 157, 787-797.
Kim, C. and Hwang, J.N. (2000) An integrated scheme for object based video abstraction. In: Proceedings of the Eighth ACM International Conference on Multimedia, 303-311.
Kuo, Y.H., Cheng, W.H., Lin, H.T. and Hsu, W.H. (2012) Unsupervised semantic feature discovery for image object retrieval and tag refinement. IEEE Transactions on Multimedia, 14(4): 1079-1090.
Kuo, Y.H., Lin, H.T., Cheng, W.H., Yang, Y.H. and Hsu, W.H. (2011) Unsupervised auxiliary visual words discovery for large-scale image object retrieval. In: Computer Vision and Pattern Recognition (CVPR 2011), IEEE, 905-912.
Lai, Y.H. and Yang, C.K. (2014) Video object retrieval by trajectory and appearance. IEEE Transactions on Circuits and Systems for Video Technology, 25(6): 1026-1037.
Li, X., Larson, M. and Hanjalic, A. (2015) Pairwise geometric matching for large-scale object retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5153-5161.
Li, Z.N., Zaïane, O.R. and Tauber, Z. (1999) Illumination invariance and object model in content-based image and video retrieval. Journal of Visual Communication and Image Representation, 10(3): 219-244.
Liang-qun, L., Xi-yang, Z., Zong-Xiang, L. and Wei-Xin, X. (2018) Fuzzy logic approach to visual multi-object tracking. Neurocomputing, 281, 139-151.
Lin, Z. and Brandt, J. (2010) A local bag-of-features model for large-scale object retrieval. In: Proceedings of European Conference on Computer vision, Springer, 294-308.
Lou, Y., Bai, Y., Lin, J., Wang, S., Chen, J., Chandrasekhar, V., Duan, L.Y., Huang, T., Kot, A.C. and Gao, W. (2017) Compact deep invariant descriptors for video retrieval. In: Proceedings of Data Compression Conference, IEEE, 420-429.
Ma, X., Huang, H., Cai, Z., Wang, C. and Zou, Y. (2010) Video Object Retrieval Based on Color Feature Modeling. In: Proceedings of International Conference on Machine Vision and Human-machine Interface, IEEE, 101-104.
Mezaris, V., Kompatsiaris, I. and Strintzis, M.G. (2004) Region-based image retrieval using an object ontology and relevance feedback. EURASIP Journal on Advances in Signal Processing, 6, 231-946.
Mitrea, C.A., Mironică, I., Ionescu, B. and Dogaru, R. (2014) Multiple instance-based object retrieval in video surveillance: Dataset and evaluation. In: Proceedings of IEEE 10th International Conference on Intelligent Computer Communication and Processing, 171-178.
Padmakala, S., AnandhaMala, G.S. and Shalini, M. (2011) An effective content based video retrieval utilizing texture, color and optimal key frame features. In: 2011 IEEE International Conference on Image Information Processing, Shimla, India, 1-6.
Pang, S., Ma, J., Zhu, J., Xue, J. and Tian, Q. (2019) Improving object retrieval quality by integration of similarity propagation and query expansion. IEEE Transactions on Multimedia, 21(3): 760-770.
Petkovic, M. and Jonker, W. (2001) Content-based video retrieval by integrating spatiotemporal and stochastic recognition of events. In: Proceedings of IEEE Workshop on Detection and Recognition of Events in Video, IEEE, 75-82.
Phalke, D.A. and Jahirbadkar, S. (2018) Systematic Review of Near Duplicate Video Retrieval Techniques. International Journal of Pure and Applied Mathematics, 118(24): 1-11.
Priyaa, D.S. and Karthikeyan, S. (2013) An Innovative Approach for Video Object Retrieval based on Color and Shape Using Intuitionistic Fuzzy Hausdorff Distance. International Journal of Computer Technology and Applications, 4(4): 710.
Qin, D., Gammeter, S., Bossard, L., Quack, T. and Van Gool, L. (2011) Hello neighbor: Accurate object retrieval with k-reciprocal nearest neighbors. In: Proceedings of IEEE CVPR, 777-784.
Qin, D., Wengert, C. and Van Gool, L. (2013) Query adaptive similarity for large scale object retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1610-1617.
Ren, W., Singh, S., Singh, M. and Zhu, Y.S. (2009) State-of-the art on spatio-temporal information-based video retrieval. Pattern Recognition, 42(2): 267-282.
Sasithradevi, A., Roomi, S.M.M. and Maragatham, G. (2017) Content based video retrieval via object based approach. In: TENCON 2017 - 2017 IEEE Region 10 Conference, 781-787.
Shan, C. (2010) Face recognition and retrieval in video. In: Video Search and Mining, 287, 235-260.
Sivic, J. and Zisserman, A. (2003) Video Google: A text retrieval approach to object matching in videos. In: Proceedings of Ninth IEEE International Conference on Computer Vision, IEEE, 1470.
Sivic, J. and Zisserman, A. (2004) Efficient visual content retrieval and mining in videos. In: Pacific-Rim Conference on Multimedia, Springer, 471-478.
Sivic, J. and Zisserman, A. (2008) Efficient visual search for objects in videos. Proceedings of the IEEE, 96(4): 548-566.
Sivic, J., Everingham, M. and Zisserman, A. (2005) Person spotting: video shot retrieval for face sets. In: International Conference on Image and Video Retrieval, Springer, 226-236.
Sivic, J., Schaffalitzky, F. and Zisserman, A. (2004) Object level grouping for video shots. In: European Conference on Computer Vision, Springer, 85-98.
Sivic, J., Schaffalitzky, F. and Zisserman, A. (2006) Object level grouping for video shots. International Journal of Computer Vision, 67(2): 189-210.
Smeaton, A.F. and Browne, P. (2006) A usage study of retrieval modalities for video shot retrieval. Information Processing & Management, 42(5): 1330-1344.
Song, J., Yang, Y., Huang, Z., Shen, H.T. and Hong, R. (2011) Multiple feature hashing for real-time large scale near-duplicate video retrieval. In: Proceedings of the 19th ACM International Conference on Multimedia, 423-432.
Tang, W., Cai, R., Li, Z. and Zhang, L. (2011) Contextual synonym dictionary for visual object retrieval. In: Proceedings of the 19th ACM International Conference on Multimedia, ACM, 503-512.
Tolias, G., Sicre, R. and Jégou, H. (2015) Particular object retrieval with integral max-pooling of CNN activations. arXiv preprint arXiv:1511.05879
Valdés, V. and Martínez, J.M. (2011) Efficient video summarization and retrieval tools. In: 9th International Workshop on Content-Based Multimedia Indexing (CBMI), IEEE, 43-48.
Van Den Hengel, A., Dick, A., Thormaehlen, A., Ward, T.B. and Torr, P.H. (2007) VideoTrace: rapid interactive scene modelling from video. ACM Transactions on Graphics (ToG), 26(3).
Visser, R., Sebe, N. and Bakker, E. (2002) Object recognition for video retrieval. In: International Conference on Image and Video Retrieval, Springer, 262-270.
Yan, R. and Hauptmann, A.G. (2007) A review of text and image retrieval approaches for broadcast news video. Information Retrieval, 10(4-5): 445-484.
Yang, H., Qu, S., Zhu, F. and Zheng, Z. (2018) Robust objectness tracking with weighted multiple instance learning algorithm. Neurocomputing, 288: 43-53.
Yang, L., Geng, B., Cai, Y., Hanjalic, A. and Hua, X.S. (2011) Object retrieval using visual query context. IEEE Transactions on Multimedia, 13(6): 1295-1307.
Yang, Y., Fleites, F.C., Wang, H. and Chen, S.C. (2013) An automatic object retrieval framework for complex background. In: Proceedings of IEEE International Symposium on Multimedia, 374-377.
Yeo, B.L. and Yeung, M.M. (1997) Retrieving and visualizing video. Communications of the ACM, 40(12): 43-53.
Zhang, Y., Jiang, F., Rho, S., Liu, S., Zhao, D. and Ji, R. (2016a) 3D object retrieval with multi-feature collaboration and bipartite graph matching. Neurocomputing, 195, 40-49.
Zhang, H., Cao, X., Ho, J.K.L. and Chow, T.W.S. (2016b) Object Level Video Advertising: An Optimization Framework. IEEE Transactions on Industrial Informatics, 13(6): 1295-1307.
Zhang, D., Han, J., Jiang, L., Ye, S. and Chang, X. (2017) Revealing event saliency in unconstrained video collection. IEEE Transactions on Image Processing, 26(4): 1746-1758.
Zhang, H., Ji, Y., Huang, W. and Liu, L. (2018) Sitcom-star-based clothing retrieval for video advertising: a deep learning framework. Neural Computing and Applications, 13(6): 1295-1307.
Zhang, N. and Jeong, H.Y. (2017) A retrieval algorithm for specific face images in airport surveillance multimedia videos on cloud computing platform. Multimedia Tools and Applications, 76(16): 17129-17143.
Zhao, F., Huang, Y., Wang, L. and Tan, T. (2015) Deep semantic ranking based hashing for multi-label image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1556-1564.
Zhao, S., Precioso, F. and Cord, M. (2011) Spatio-Temporal Tube data representation and Kernel design for SVM-based video object retrieval system. Multimedia Tools and Applications, 55(1): 105-125.
Zhao, S., Yao, H., Zhang, Y., Wang, Y. and Liu, S. (2015) View based 3D objects retrieval via multi-modal graph learning. Signal Processing, 112, 110-118.
Zhu, L., Shen, J., Xie, L. and Cheng, Z. (2016) Unsupervised visual hashing with semantic assistant for content-based image retrieval. IEEE Transactions on Knowledge and Data Engineering, 29(2): 472-486.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-1b496700-f51f-4d46-95df-b4b4fb799a07