HTTNet: hybrid transformer-based approaches for trajectory prediction

Ge, Xianlei; Sen, Xiaobo; Zhou, Xuanxin; Li, Xiaoyan

doi:10.24425/bpasts.2024.150811

Abstract

Forecasting future trajectories of intelligent agents presents a formidable challenge, necessitating the analysis of intricate scenarios and uncertainties arising from agent interactions. Consequently, it is judicious to contemplate the establishment of inter-agent relationships and the assimilation of contextual semantic information. In this manuscript, we introduce HTTNet, a comprehensive framework that spans three dimensions of information modeling: (1) the temporal dimension, where HTTNet employs a time encoder to articulate time sequences, comprehending the influences of past and future trajectories; (2) the social dimension, where the trajectory encoder facilitates the input of trajectories from multiple agents, thereby streamlining the modeling of interaction information among intelligent agents; (3) the contextual dimension, where the TF-map encoder integrates semantic scene input, amplifying HTTNet's cognitive grasp of scene information. Furthermore, HTTNet integrates a hybrid modeling paradigm combining a CNN and a transformer, in which the CNN transmutes map scenes into feature information for the transformer. Qualitative and quantitative analyses on the nuScenes and INTERACTION datasets highlight the exceptional performance of HTTNet, achieving a minADE_10 of 1.03 and a miss rate of 0.31 on nuScenes, underscoring its effectiveness in multi-agent trajectory prediction in complex scenarios.
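The two figures quoted in the abstract can be read with a small sketch of how minADE_k and miss rate are commonly computed for multimodal predictions. This is not the authors' evaluation code. The function names, the trajectory layout (lists of (x, y) points in meters), and the 2 m miss threshold are assumptions, the last one following the usual nuScenes prediction benchmark convention.

```python
import math


def displacement_errors(pred, truth):
    """Per-timestep Euclidean distance between two (x, y) trajectories."""
    return [math.dist(p, t) for p, t in zip(pred, truth)]


def min_ade(modes, truth):
    """Lowest average displacement error among the candidate modes."""
    ades = []
    for mode in modes:
        errors = displacement_errors(mode, truth)
        ades.append(sum(errors) / len(errors))
    return min(ades)


def is_miss(modes, truth, threshold=2.0):
    """True when every mode is farther than `threshold` from truth at some step.

    The 2.0 m default is an assumed benchmark convention, not a value
    stated in the abstract.
    """
    return all(max(displacement_errors(m, truth)) > threshold for m in modes)


def evaluate(samples, k=10, threshold=2.0):
    """Average minADE_k and miss rate over (modes, truth) pairs.

    `modes` is assumed to be sorted by descending confidence, so the
    first k entries are the k most likely predictions.
    """
    ades = [min_ade(modes[:k], truth) for modes, truth in samples]
    misses = [is_miss(modes[:k], truth, threshold) for modes, truth in samples]
    return sum(ades) / len(ades), sum(misses) / len(misses)
```

Under these assumptions, a minADE_10 of 1.03 means that the best of the ten most likely predicted trajectories lies on average 1.03 m from the ground truth, and a miss rate of 0.31 means that for 31% of the agents none of the evaluated predictions stayed within the threshold at every timestep.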

Keywords

trajectory prediction; transformer; convolutional neural network; multimodal data

Publisher

Polska Akademia Nauk, Wydział IV Nauk Technicznych

Journal

Bulletin of the Polish Academy of Sciences. Technical Sciences

Year

2024

Volume

Vol. 72, no. 5

Pages

art. no. e150811

Physical description

Bibliography: 38 items, figures, tables.

Contributors

author

Ge Xianlei

School of Electronic Engineering, Huainan Normal University, China
College of Computing and Information Technologies, National University, Philippines

https://orcid.org/0000-0002-9353-5199

author

Sen Xiaobo

shenxb@hnnu.edu.cn

School of Electronic Engineering, Huainan Normal University, China
College of Industrial Education, Technological University of the Philippines, Philippines

author

Zhou Xuanxin

School of Electronic Engineering, Huainan Normal University, China

https://orcid.org/0009-0004-9818-537X

author

Li Xiaoyan

School of Computer, Huainan Normal University, China
College of Computing and Information Technologies, National University, Philippines

https://orcid.org/0000-0001-5286-671X

References

[1] H. Cui et al., “Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks,” in 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 2090–2096, doi: 10.1109/ICRA.2019.8793868.
[2] F. Leon and M. Gavrilescu, “A review of tracking and trajectory prediction methods for autonomous driving,” Mathematics, vol. 9, no. 6, p. 660, 2021, doi: 10.3390/math9060660.
[3] Z. Sheng, Y. Xu, S. Xue, and D. Li, “Graph-Based Spatial-Temporal Convolutional Network for Vehicle Trajectory Prediction in Autonomous Driving,” IEEE Trans. Intell. Transp. Syst., vol. 23, no. 10, pp. 17654–17665, Oct. 2022, doi: 10.1109/TITS.2022.3155749.
[4] Y. Zhang, W. Wang, R. Bonatti, D. Maturana, and S. Scherer, “Integrating kinematics and environment context into deep inverse reinforcement learning for predicting off-road vehicle trajectories,” in Proceedings of Machine Learning Research, ML Research Press, 2018, pp. 894–905.
[5] S.J. Qiao, N. Han, X.W. Zhu, H.P. Shu, J.L. Zheng, and C.A. Yuan, “A Dynamic Trajectory Prediction Algorithm Based on Kalman Filter,” Tien Tzu Hsueh Pao/Acta Electronica Sinica, vol. 46, no. 2, pp. 418–423, 2018, doi: 10.3969/j.issn.0372-2112.2018.02.022.
[6] H. Rong, A.P. Teixeira, and C. Guedes Soares, “Ship trajectory uncertainty prediction based on a Gaussian Process model,” Ocean Eng., vol. 182, pp. 499–511, 2019, doi: 10.1016/j.oceaneng.2019.04.024.
[7] S. Becker, R. Hug, W. Hübner, and M. Arens, “An Evaluation of Trajectory Prediction Approaches and Notes on the TrajNet Benchmark.” arXiv:1805.07663, 2018, doi: 10.48550/arXiv.1805.07663.
[8] M.M. Kordmahalleh, M.G. Sefidmazgi, and A. Homaifar, “A sparse recurrent neural network for trajectory prediction of Atlantic hurricanes,” GECCO 2016 – 2016 Genetic and Evolutionary Computation Conference, 2016, pp. 957–964, doi: 10.1145/2908812.2908834.
[9] E. Lukasik et al., “Recognition of handwritten Latin characters with diacritics using CNN,” Bull. Pol. Acad. Sci. Tech. Sci., vol. 69, no. 1, art. no. e136210, 2021, doi: 10.24425/bpasts.2020.136210.
[10] J. Wróbel and A. Kulawik, “Influence of modelling phase transformations with the use of LSTM network on the accuracy of computations of residual stresses for the hardening process,” Bull. Pol. Acad. Sci. Tech. Sci., vol. 71, no. 4, art. no. e145681, 2023, doi: 10.24425/bpasts.2023.145681.
[11] A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, and S. Savarese, “Social LSTM: Human trajectory prediction in crowded spaces,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, 2016, pp. 961–971, doi: 10.1109/CVPR.2016.110.
[12] A. Vaswani et al., “Attention is all you need,” in Advances in Neural Information Processing Systems, 2017.
[13] Y. Yuan, X. Weng, Y. Ou, and K. Kitani, “AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting,” 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 2021, pp. 9793–9803, doi: 10.1109/ICCV48922.2021.00967.
[14] A. Graves, “Supervised Sequence Labelling,” in Supervised Sequence Labelling with Recurrent Neural Networks. Studies in Computational Intelligence, vol. 385. Springer, Berlin, Heidelberg, 2012, doi: 10.1007/978-3-642-24797-2_2.
[15] M. Abebe, Y. Noh, Y.J. Kang, C. Seo, D. Kim, and J. Seo, “Ship trajectory planning for collision avoidance using hybrid ARIMA-LSTM models,” Ocean Eng., vol. 256, p. 111527, 2022, doi: 10.1016/j.oceaneng.2022.111527.
[16] J. Yin, C. Ning, and T. Tang, “Data-driven models for train control dynamics in high-speed railways: LAG-LSTM for train trajectory prediction,” Inf. Sci., vol. 600, pp. 377–400, 2022, doi: 10.1016/j.ins.2022.04.004.
[17] N. Zhang, N. Zhang, Q. Zheng, and Y.S. Xu, “Real-time prediction of shield moving trajectory during tunnelling using GRU deep neural network,” Acta Geotech., vol. 17, no. 4, pp. 1167–1182, 2022, doi: 10.1007/s11440-021-01319-1.
[18] H. Xue, D.Q. Huynh, and M. Reynolds, “PoPPL: Pedestrian Trajectory Prediction by LSTM with Automatic Route Class Clustering,” IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 1, pp. 77–90, 2021, doi: 10.1109/TNNLS.2020.2975837.
[19] J. Gao et al., “VectorNet: Encoding HD maps and agent dynamics from vectorized representation,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020, pp. 11522–11530, doi: 10.1109/CVPR42600.2020.01154.
[20] M. Liang et al., “Learning Lane Graph Representations for Motion Forecasting,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2020, doi: 10.1007/978-3-030-58536-5_32.
[21] Y. Zhu, D. Qian, D. Ren, and H. Xia, “StarNet: Pedestrian Trajectory Prediction using Deep Neural Network in Star Topology,” 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 2019, pp. 8075–8080, doi: 10.1109/IROS40897.2019.8967811.
[22] Y. Liu, J. Zhang, L. Fang, Q. Jiang, and B. Zhou, “Multimodal Motion Prediction with Stacked Transformers,” 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, USA, 2021, pp. 7573–7582, doi: 10.1109/CVPR46437.2021.00749.
[23] L. Li, M. Pagnucco, and Y. Song, “Graph-based Spatial Transformer with Memory Replay for Multi-future Pedestrian Trajectory Prediction,” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, USA, 2022, pp. 2221–2231, doi: 10.1109/CVPR52688.2022.00227.
[24] X. Jia, P. Wu, L. Chen, Y. Liu, H. Li, and J. Yan, “HDGT: Heterogeneous Driving Graph Transformer for Multi-Agent Trajectory Prediction via Scene Encoding,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 11, pp. 13860–13875, 2023, doi: 10.1109/TPAMI.2023.3298301.
[25] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted Residuals and Linear Bottlenecks,” in Convolutional Neural Networks with Swift for Tensorflow, 2019.
[26] W. Zhan et al., “INTERACTION Dataset: An INTERnational, Adversarial and Cooperative moTION Dataset in Interactive Driving Scenarios with Semantic Maps,” CoRR arXiv: 1910.03088, 2019, doi: 10.48550/arXiv.1910.03088.
[27] H. Caesar et al., “nuScenes: A multimodal dataset for autonomous driving,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020, pp. 11618–11628, doi: 10.1109/CVPR42600.2020.01164.
[28] K. He, X. Zhang, S. Ren, and J. Sun, “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification,” 2015 IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1026–1034, doi: 10.1109/ICCV.2015.123.
[29] T. Salzmann, B. Ivanovic, P. Chakravarty, and M. Pavone, “Trajectron++: Dynamically-Feasible Trajectory Forecasting with Heterogeneous Data,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2020, doi: 10.1007/978-3-030-58523-5_40.
[30] N. Deo and M.M. Trivedi, “Trajectory Forecasts in Unknown Environments Conditioned on Grid-Based Plans,” CoRR arXiv: 2001.00735, 2020, doi: 10.48550/arXiv.2001.00735.
[31] B. Do Kim et al., “Lapred: Lane-aware prediction of multi-modal future trajectories of dynamic agents,” 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, USA, 2021, pp. 14631–14640, doi: 10.1109/CVPR46437.2021.01440.
[32] Y. Chai, B. Sapp, M. Bansal, and D. Anguelov, “MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction,” in Proceedings of the Conference on Robot Learning, 2020, vol. 100, pp. 86–99.
[33] T. Gilles, S. Sabatini, D. Tsishkou, B. Stanciulescu, and F. Moutarde, “GOHOME: Graph-Oriented Heatmap Output for future Motion Estimation,” 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, USA, 2022, pp. 9107–9114, doi: 10.1109/ICRA46639.2022.9812253.
[34] T. Gilles, S. Sabatini, D. Tsishkou, B. Stanciulescu, and F. Moutarde, “THOMAS: Trajectory Heatmap Output with learned Multi-Agent Sampling,” CoRR arXiv:2110.06607, 2021, doi: 10.48550/arXiv.2110.06607.
[35] N. Deo, E. Wolff, and O. Beijbom, “Multimodal Trajectory Prediction Conditioned on Lane-Graph Traversals,” 5th Annual Conference on Robot Learning, 2021. [Online]. Available: https://openreview.net/forum?id=hu7b7MPCqiC
[36] N. Lee, W. Choi, P. Vernaza, C.B. Choy, P.H.S. Torr, and M. Chandraker, “DESIRE: Distant future prediction in dynamic scenes with interacting agents,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, 2017, pp. 2165–2174, doi: 10.1109/CVPR.2017.233.
[37] H. Zhao et al., “TNT: Target-driveN Trajectory Prediction.” 2020.
[38] A. Scibior, V. Lioutas, D. Reda, P. Bateni, and F. Wood, “Imagining the Road Ahead: Multi-Agent Trajectory Prediction via Differentiable Simulation,” 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, USA, 2021, pp. 720–725, doi: 10.1109/ITSC48978.2021.9565113.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-04695aa6-31fc-4149-b414-fd23b4c240e2