A strong and efficient baseline for vehicle re-identification using deep triplet embedding

Kumar, Ratnesh; Weill, Edwin; Aghdasi, Farzin; Sriram, Parthasarathy

doi:10.2478/jaiscr-2020-0003

Abstract

In this paper we tackle the problem of vehicle re-identification in a camera network using triplet embeddings. Re-identification is the problem of matching appearances of objects across different cameras. With the proliferation of surveillance cameras enabling smart and safer cities, there is an ever-increasing need to re-identify vehicles across cameras. Typical challenges arising in smart city scenarios include variations in viewpoint and illumination, and self-occlusions. Most successful approaches to re-identification use deep learning to learn an embedding space in which vehicles of the same identity are projected closer to one another than vehicles of different identities. Popular loss functions for learning such an embedding space include contrastive loss and triplet loss. In this paper we provide an extensive evaluation of triplet loss applied to vehicle re-identification and demonstrate that using recently proposed sampling approaches for mining informative data points outperforms most existing state-of-the-art approaches for vehicle re-identification. Compared to most existing state-of-the-art approaches, our approach is simpler and more straightforward to train, using only identity-level annotations, and it has one of the smallest published embedding dimensions, which allows efficient inference. Furthermore, in this work we introduce a formal evaluation of a triplet sampling variant (batch sample) into the re-identification literature. In addition to the conference version [24], this submission adds extensive experiments on newly released datasets, cross-domain evaluations and ablation studies.
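
To illustrate the triplet loss with in-batch mining that the abstract refers to, the following Python sketch computes a batch-hard variant in the style of Hermans et al. [12]: for each anchor it uses the farthest sample with the same identity and the closest sample with a different identity. The function names, the margin of 0.3 and the toy two-dimensional embeddings are assumptions made for this example. It is not the authors' implementation and it does not implement the batch sample variant evaluated in the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, ids, margin=0.3):
    """Mean hinge loss over anchors, using the hardest positive and the
    hardest negative found inside the batch (illustrative sketch only)."""
    losses = []
    for i, (anchor, anchor_id) in enumerate(zip(embeddings, ids)):
        # Distances to other samples that share the anchor's identity.
        pos = [euclidean(anchor, e)
               for j, (e, k) in enumerate(zip(embeddings, ids))
               if k == anchor_id and j != i]
        # Distances to samples with a different identity.
        neg = [euclidean(anchor, e)
               for e, k in zip(embeddings, ids)
               if k != anchor_id]
        if not pos or not neg:
            continue  # this anchor cannot form a triplet in this batch
        # Hardest positive is the farthest one, hardest negative the closest.
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    # Toy batch: two identities, two images each (assumed values).
    batch = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
    labels = [0, 0, 1, 1]
    print(batch_hard_triplet_loss(batch, labels))
```

Only identity labels are needed to form the triplets, which matches the identity-level annotation setting described in the abstract. In practice the embeddings would come from a trained network rather than from hand-written vectors.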

Keywords

convolutional neural networks; re-identification; triplet networks; siamese networks; embedding; hard data mining; contrastive loss

convolutional neural networks; triplet networks; siamese networks; embedding; data mining

Publisher

University of Social Sciences

Journal

Journal of Artificial Intelligence and Soft Computing Research

Year

2020

Volume

Vol. 10, No. 1

Pages

27-45

Physical description

Bibliography: 62 items, figures.

Contributors

author

Kumar Ratnesh

ratneshk@nvidia.com

NVIDIA

author

Weill Edwin

eweill@nvidia.com

NVIDIA

author

Aghdasi Farzin

NVIDIA

author

Sriram Parthasarathy

NVIDIA

References

[1] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
[2] Y. Bai, Y. Lou, F. Gao, S. Wang, Y. Wu, and L. Duan. Group Sensitive Triplet Embedding for Vehicle Reidentification. IEEE Transactions on Multimedia, 2018.
[3] S. Bak, M. S. Biagio, R. Kumar, V. Murino, and F. Bremond. Exploiting Feature Correlations by Brownian Statistics for People Detection and Recognition. IEEE Transactions on Systems, Man, and Cybernetics, 2017.
[4] J. Bromley, J. W. Bentz, L. Bottou, I. Guyon, Y. LeCun, C. Moore, E. Säckinger, and R. Shah. Signature Verification Using a “Siamese” Time Delay Neural Network. International Journal of Pattern Recognition and Artificial Intelligence, 1993.
[5] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. In CVPR, 2005.
[6] M. Farenzena, L. Bazzani, A. Perina, V. Murino, and M. Cristani. Person Re-Identification by Symmetry-Driven Accumulation of Local Features. In CVPR, 2010.
[7] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative Adversarial Networks. In NIPS, 2014.
[8] H. Z. Gu and S. Y. Lee. Car model recognition by utilizing symmetric property to overcome severe pose variation. Machine Vision and Applications, 2013.
[9] H. Guo, C. Zhao, Z. Liu, J. Wang, and H. Lu. Learning Coarse-to-Fine Structured Feature Embedding for Vehicle Re-Identification. In AAAI, 2018.
[10] K. He, R. B. Girshick, and P. Dollár. Rethinking ImageNet pre-training. CoRR, 2018.
[11] K. He, X. Zhang, S. Ren, and J. Sun. Deep Residual Learning for Image Recognition. In CVPR, 2016.
[12] A. Hermans, L. Beyer, and B. Leibe. In Defense of the Triplet Loss for Person Re-Identification. In CoRR, 2017.
[13] S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 1997.
[14] E. Hoffer and N. Ailon. Deep metric learning using triplet network. In ICLR Workshops, 2015.
[15] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. In CVPR, 2017.
[16] Q. Hu, H. Wang, T. Li, and C. Shen. Deep CNNs with Spatially Weighted Pooling for Fine-Grained Car Recognition. IEEE Transactions on Intelligent Transportation Systems, 2017.
[17] V. Jain, Z. Sasindran, A. Rajagopal, S. Biswas, H. S. Bharadwaj, and K. R. Ramakrishnan. Deep automatic license plate recognition system. ICVGIP, 2016.
[18] Jia Deng, Wei Dong, R. Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
[19] A. Kanaci, X. Zhu, and S. Gong. Vehicle Re-Identification by Fine-Grained Cross-Level Deep Learning. In BMVC, 2017.
[20] A. Kanaci, X. Zhu, and S. Gong. Vehicle re-identification in context. In Pattern Recognition - 40th German Conference, GCPR 2018, Stuttgart, Germany, September 10-12, 2018, Proceedings, 2018.
[21] P. Khorramshahi, A. Kumar, N. Peri, S. S. Rambhatla, J.-C. Chen, and R. Chellappa. A dual-path model with adaptive attention for vehicle re-identification. In ICCV, 2019.
[22] D. P. Kingma and J. Ba. Adam: A Method for Stochastic Optimization. In ICLR, 2015.
[23] R. Kumar, G. Charpiat, and M. Thonnat. Multiple Object Tracking by Efficient Graph Partitioning. In ACCV, 2014.
[24] R. Kumar, E. Weill, F. Aghdasi, and P. Sriram. Vehicle re-identification: an efficient baseline using triplet embedding. In IJCNN, 2019.
[25] L. Liao, R. Hu, J. Xiao, Q. Wang, J. Xiao, and J. Chen. Exploiting effects of parts in fine-grained categorization of vehicles. In ICIP, 2015.
[26] S. Liao, Y. Hu, X. Zhu, and S. Z. Li. Person re-identification by Local Maximal Occurrence representation and metric learning. In CVPR, 2015.
[27] Y. L. Lin, V. I. Morariu, W. Hsu, and L. S. Davis. Jointly optimizing 3D model fitting and fine-grained classification. In ECCV, 2014.
[28] H. Liu, Y. Tian, Y. Wang, L. Pang, and T. Huang. Deep Relative Distance Learning: Tell the Difference Between Similar Vehicles. In CVPR, 2016.
[29] X. Liu, W. Liu, H. Ma, and H. Fu. Large-scale vehicle re-identification in urban surveillance videos. ICME, 2016.
[30] X. Liu, H. Ma, H. Fu, and M. Zhou. Vehicle Retrieval and Trajectory Inference in Urban Traffic Surveillance Scene. In ICDSC, 2014.
[31] Y. Lou, Y. Bai, J. Liu, S. Wang, and L.-Y. Duan. A large-scale dataset for vehicle re-identification in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
[32] L. Yang, P. Luo, C. C. Loy, and X. Tang. A Large-Scale Car Dataset for Fine-Grained Categorization and Verification. In CVPR, 2015.
[33] B. Ma, Y. Su, and F. Jurie. Local Descriptors Encoded by Fisher Vectors for Person Re-identification. In ECCV Workshops, 2012.
[34] B. P. Ma, Y. Su, and F. Jurie. BiCov: a novel image representation for person re-identification and face verification. In BMVC, 2012.
[35] R. Manmatha, C. Y. Wu, A. J. Smola, and P. Krahenbuhl. Sampling Matters in Deep Embedding Learning. In CVPR, 2017.
[36] A. Mishchuk, D. Mishkin, F. Radenovic, and J. Matas. Working hard to know your neighbor’s margins: Local descriptor learning loss. In NIPS, 2017.
[37] V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu. Recurrent Models of Visual Attention. In NIPS, 2014.
[38] M. Naphade, M.-C. Chang, A. Sharma, D. C. Anastasiu, V. Jagarlamudi, P. Chakraborty, T. Huang, S. Wang, M. Y. Liu, R. Chellappa, J.-N. Hwang, and S. Lyu. The 2018 NVIDIA AI City Challenge. CVPR Workshops, 2018.
[39] O. Rippel, M. Paluri, P. Dollár, and L. Bourdev. Metric Learning with Adaptive Density Discrimination. In ICLR, 2016.
[40] E. Ristani and C. Tomasi. Features for Multi-Target Multi-Camera Tracking and Re-Identification. In CVPR, 2018.
[41] F. Schroff, D. Kalenichenko, and J. Philbin. FaceNet: A unified embedding for face recognition and clustering. In CVPR, 2015.
[42] L. Shen, Z. Lin, and Q. Huang. Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks. In ECCV, 2016.
[43] Y. Shen, T. Xiao, H. Li, S. Yi, and X. Wang. Learning Deep Neural Networks for Vehicle Re-ID with Visual-spatio-Temporal Path Proposals. ICCV, 2017.
[44] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, 2014.
[45] K. Simonyan and A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. In ICLR, 2015.
[46] J. Sochor, A. Herout, and J. Havel. BoxCars: 3D Boxes as CNN Input for Improved Fine-Grained Vehicle Recognition. In CVPR, 2016.
[47] J. Spanhel, J. Sochor, R. Juranek, A. Herout, L. Marsik, and P. Zemcik. Holistic recognition of low quality license plates by CNN using track annotated data. AVSS, 2017.
[48] S. Tang, M. Andriluka, B. Andres, and B. Schiele. Multiple people tracking by lifted multicut and person re-identification. In CVPR, 2017.
[49] Z. Tang, M. Naphade, S. Birchfield, J. Tremblay, W. Hodge, R. Kumar, S. Wang, and X. Yang. PAMTRI: Pose-aware multi-task learning for vehicle re-identification using highly randomized synthetic data. In ICCV, 2019.
[50] Z. Tang, M. Naphade, M.-Y. Liu, X. Yang, S. Birchfield, S. Wang, R. Kumar, D. Anastasiu, and J.-N. Hwang. CityFlow: A city-scale benchmark for multi-target multi-camera vehicle tracking and re-identification. In CVPR, 2019.
[51] O. Tuzel, F. Porikli, and P. Meer. Region covariance: A fast descriptor for detection and classification. In European Conference on Computer Vision, pages 589–600, 2006.
[52] Y. Wang, L. Xie, S. Qiao, Y. Zhang, W. Zhang, and A. L. Yuille. A Deep Learning-Based Approach to Progressive Vehicle Re-identification for Urban Surveillance. In ECCV, 2016.
[53] Z. Wang, L. Tang, X. Liu, Z. Yao, S. Yi, J. Shao, J. Yan, S. Wang, H. Li, and X. Wang. Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification. In ICCV, 2017.
[54] K. Q. Weinberger and L. K. Saul. Distance Metric Learning for Large Margin Nearest Neighbor Classification. The Journal of Machine Learning Research, 10:207–244, 2009.
[55] L. Wen, D. Du, Z. Cai, Z. Lei, M. Chang, H. Qi, J. Lim, M. Yang, and S. Lyu. UA-DETRAC: A new benchmark and protocol for multi-object detection and tracking.
[56] N. Wojke and A. Bewley. Deep Cosine Metric Learning for Person Re-identification. In WACV, 2018.
[57] T. Xiao, H. Li, W. Ouyang, and X. Wang. Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification. In CVPR, 2016.
[58] K. Yan, Y. Tian, Y. Wang, W. Zeng, and T. Huang. Exploiting Multi-grain Ranking Constraints for Precisely Searching Visually-similar Vehicles. In ICCV, 2017.
[59] D. Zapletal and A. Herout. Vehicle Re-Identification for Automatic Video Traffic Surveillance. In CVPR Workshops, 2016.
[60] L. Zhang, T. Xiang, and S. Gong. Learning a Discriminative Null Space for Person Re-identification. In CVPR, 2016.
[61] Y. Zhou and L. Shao. Vehicle Re-Identification by Adversarial Bi-Directional LSTM Network. In WACV, 2018.
[62] Y. Zhou and L. Shao. Viewpoint-aware Attentive Multi-view Inference for Vehicle Re-identification. In CVPR, 2018.

Notes

Record prepared with funds from the Polish Ministry of Science and Higher Education (MNiSW), agreement No. 461252, under the programme "Social Responsibility of Science", module: Popularisation of science and promotion of sport (2020).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-ba9555af-75ed-4d81-9b5d-2ff72f4e648e