Article title

Object pose estimation in monocular image using modified FDCM

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
In this paper, a new method (MFDCM) for 3D multi-object detection and pose estimation in a monocular image is proposed. It is based on the FDCM method, with major improvements in accuracy and running time: it can detect an object quickly even when the object is partially occluded or poorly illuminated, and it requires only a single template without any training process. These improvements were achieved by using the LSD method instead of a simple edge detector (the Canny detector) and by using an angular Voronoi diagram instead of computing a 3D distance transform, i.e., a distance transform image and an integral distance transform image at each orientation. In addition, the search process in the proposed method relies on a line segment-based search instead of the sliding-window search used in the FDCM. As a result, the proposed method is more robust and much faster than the FDCM and is invariant to position, scale, and rotation. The proposed method was also evaluated and compared against other methods (COF, HALCON, LINE2D, and BOLD) on the D-Textureless dataset. The comparison results show that the MFDCM achieves the highest score among all of the tested methods (with a slight advantage over the COF and BOLD methods), while being somewhat slower than LINE2D, the fastest of the compared methods. Furthermore, it was at least 14 times faster than the FDCM in the tested scenarios. These results show that the MFDCM can detect objects and estimate their 3D poses against a clean or cluttered background in a monocular image at high speed, even when the objects are partially occluded; this makes it robust and reliable for real-time applications.
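The abstract's first change, replacing the Canny edge detector with the LSD line-segment detector, determines the inputs that the rest of the pipeline (orientation-indexed matching instead of a sliding-window search) operates on. The following Python sketch is only a rough illustration of that front end under stated assumptions: it relies on OpenCV's createLineSegmentDetector (absent from some OpenCV builds for licensing reasons), uses a hypothetical image file name, and is not the authors' implementation.

```python
# Illustration only: extract line segments with LSD (the detector the abstract says
# replaces Canny in MFDCM) and record their orientations. Not the authors' code;
# "scene.png" is a hypothetical input file.
import cv2
import numpy as np

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
if img is None:
    raise SystemExit("scene.png not found")

# The "simple edge detector" baseline mentioned in the abstract.
edges = cv2.Canny(img, 50, 150)

# LSD line-segment detection (requires an OpenCV build that ships this detector).
lsd = cv2.createLineSegmentDetector()
segments, _, _, _ = lsd.detect(img)  # shape (N, 1, 4): (x1, y1, x2, y2) per segment

oriented = []
if segments is not None:
    for x1, y1, x2, y2 in segments.reshape(-1, 4):
        angle = np.arctan2(y2 - y1, x2 - x1) % np.pi   # segment orientation in [0, pi)
        length = np.hypot(x2 - x1, y2 - y1)
        oriented.append((angle, length, (x1, y1, x2, y2)))

# A segment-based matcher such as MFDCM would index these segments by orientation
# (e.g., via an angular Voronoi structure) rather than sliding a window over the image.
print(f"{len(oriented)} line segments, {int(np.count_nonzero(edges))} Canny edge pixels")
```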
Publisher
Journal
Year
Volume
Pages
97–112
Physical description
Bibliography: 43 items, figures, tables
Authors
  • Tishreen University, Department of Mechatronics Engineering, Lattakia, Syria
author
  • Tishreen University, Department of Mechatronics Engineering, Lattakia, Syria
author
  • Tishreen University, Department of Computer and Automatic Control Engineering, Lattakia, Syria
Bibliography
  • [1] Barrois B., Wöhler C.: 3D pose estimation based on multiple monocular cues. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007. https://doi.org/10.1109/CVPR.2007.383352.
  • [2] Brachmann E., Michel F., Krull A., Yang M.Y., Gumhold S., Rother C.: Uncertainty-Driven 6D Pose Estimation of Objects and Scenes from a Single RGB Image. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 49, pp. 3364–3372. IEEE, 2016. https://doi.org/10.1109/CVPR.2016.366.
  • [3] Bratanic B., Pernus F., Likar B., Tomazevic D.: Real-time pose estimation of rigid objects in heavily cluttered environments, Computer Vision and Image Understanding, vol. 141, pp. 38–51, 2015. https://doi.org/10.1016/j.cviu.2015.09.002.
  • [4] Cai H., Werner T., Matas J.: Fast Detection of Multiple Textureless 3-D Objects. In: Computer Vision Systems. ICVS 2013. Lecture Notes in Computer Science, vol. 7963, Springer, Berlin, Heidelberg, pp. 103–112, 2013. https://doi.org/10.1007/978-3-642-39402-7_11.
  • [5] Damen D., Bunnun P., Calway A., Mayol-Cuevas W.: Real-time learning and detection of 3D texture-less objects: A scalable approach. In: BMVC 2012 – Electronic Proceedings of the British Machine Vision Conference 2012, pp. 1–12, 2012. https://doi.org/10.5244/C.26.23.
  • [6] Do T.T., Cai M., Pham T., Reid I.: Deep-6DPose: Recovering 6D Object Pose from a Single RGB Image, 2018. http://arxiv.org/abs/1802.10367.
  • [7] Fan B., Wu F., Hu Z.: Aggregating gradient distributions into intensity orders: A novel local image descriptor. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2377–2384, 2011. https://doi.org/10.1109/CVPR.2011.5995385.
  • [8] Fan B., Wu F., Hu Z.: Rotationally invariant descriptors using intensity order pooling, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34(10), pp. 2031–2045, 2012. https://doi.org/10.1109/TPAMI.2011.277.
  • [9] Galvez-Lopez D., Salas M., Tardos J.D., Montiel J.M.: Real-time monocular object SLAM. In: Robotics and Autonomous Systems, vol. 75, pp. 435–449, 2016. https://doi.org/10.1016/j.robot.2015.08.009.
  • [10] Grompone von Gioi R., Jakubowicz J., Morel J.M., Randall G.: LSD: a Line Segment Detector, Image Processing On Line, vol. 2, pp. 35–55, 2012. https://doi.org/10.5201/ipol.2012.gjmr-lsd.
  • [11] Hinterstoisser S., Cagniart C., Ilic S., Sturm P., Navab N., Fua P., Lepetit V.: Gradient response maps for real-time detection of textureless objects, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34(5), pp. 876–888, 2012. https://doi.org/10.1109/TPAMI.2011.206.
  • [12] Hinterstoisser S., Holzer S., Cagniart C., Ilic S., Konolige K., Navab N., Lepetit V.: Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 858–865, 2011. https://doi.org/10.1109/ICCV.2011.6126326.
  • [13] Hinterstoisser S., Lepetit V., Ilic S., Holzer S., Bradski G., Konolige K., Navab N.: Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes. In: Lee K.M., Matsushita Y., Rehg J.M., Hu Z. (eds.), Computer Vision – ACCV 2012, pp. 548–562, 2013. https://doi.org/10.1007/978-3-642-37331-2_42.
  • [14] Imperoli M., Pretto A.: D2CO: Fast and Robust Registration of 3D Textureless Objects Using the Directional Chamfer Distance. In: Nalpantidis L., Krüger V., Eklundh J.O., Gasteratos A. (eds.), Computer Vision Systems. ICVS 2015. Lecture Notes in Computer Science, vol. 9163, pp. 316–328, Springer, Cham, 2015. https://doi.org/10.1007/978-3-319-20904-3_29.
  • [15] Imperoli M., Pretto A.: Active Detection and Localization of Textureless Objects in Cluttered Environments, 2016. http://arxiv.org/abs/1603.07022.
  • [16] James S., Collomosse J.: Interactive video asset retrieval using sketched queries. In: CVMP ’14: Proceedings of the 11th European Conference on Visual Media Production, pp. 1–8, 2014. https://doi.org/10.1145/2668904.2668940.
  • [17] Jaynes C., Hou J.: Temporal Registration using a Kalman Filter for Augmented Reality Applications. In: Proceedings of Vision Interface Conference Journal, 2000.
  • [18] Kehl W., Manhardt F., Tombari F., Ilic S., Navab N.: SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again. In: Proceedings of the IEEE International Conference on Computer Vision, 2017 October, pp. 1530–1538. IEEE, 2017. https://doi.org/10.1109/ICCV.2017.169.
  • [19] Konishi Y., Hanzawa Y., Kawade M., Hashimoto M.: Fast 6D Pose Estimation from a Monocular Image Using Hierarchical Pose Trees. In: Leibe B., Matas J., Sebe N., Welling M. (eds.), Computer Vision – ECCV 2016. ECCV 2016. Lecture Notes in Computer Science, vol. 9905, pp. 398–413, Springer, Cham, 2016. https://doi.org/10.1007/978-3-319-46448-0_24.
  • [20] Konishi Y., Ijiri Y., Suwa M., Kawade M.: Textureless object detection using cumulative orientation feature. In: 2015 IEEE International Conference on Image Processing (ICIP), pp. 1310–1313, 2015. https://doi.org/10.1109/ICIP.2015.7351012.
  • [21] Liu M.Y., Tuzel O., Veeraraghavan A., Chellappa R.: Fast directional chamfer matching. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1696–1703. IEEE, 2010. https://doi.org/10.1109/CVPR.2010.5539837.
  • [22] Liu M.Y., Tuzel O., Veeraraghavan A., Taguchi Y., Marks T.K., Chellappa R.: Fast object localization and pose estimation in heavy clutter for robotic bin picking, International Journal of Robotics Research, vol. 31(8), pp. 951–973, 2012. https://doi.org/10.1177/0278364911436018.
  • [23] Morwald T., Zillich M., Vincze M.: Edge tracking of textured objects with a recursive particle filter. In: 19th International Conference on Computer Graphics and Vision, GraphiCon’2009 – Conference Proceedings, pp. 96–103, 2009.
  • [24] Munoz E., Konishi Y., Beltran C., Murino V., Del Bue A.: Fast 6D pose from a single RGB image using Cascaded Forests Templates. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4062–4069, 2016. https://doi.org/10.1109/IROS.2016.7759598.
  • [25] Nagarajan B., Rajathilagam B.: Object Shapes from Regular Curves through Sparse Representations, Procedia Computer Science, vol. 133, pp. 635–642, 2018. https://doi.org/10.1016/j.procs.2018.07.098.
  • [26] Pavlakos G., Zhou X., Chan A., Derpanis K.G., Daniilidis K.: 6-DoF Object Pose from Semantic Keypoints. In: IEEE International Conference on Robotics and Automation (ICRA), 2017. https://doi.org/10.1109/ICRA.2017.7989233.
  • [27] Peng X.: Combine color and shape in real-time detection of texture-less objects, Computer Vision and Image Understanding, vol. 135, pp. 31–48, 2015. https://doi.org/10.1016/j.cviu.2015.02.010.
  • [28] Phillips C.J., Lecce M., Daniilidis K.: Seeing glassware: From edge detection to pose estimation and shape recovery, Robotics: Science and Systems, vol. 12, 2016. https://doi.org/10.15607/rss.2016.xii.021.
  • [29] Rios-Cabrera R., Tuytelaars T.: Discriminatively trained templates for 3D object detection: A real time scalable approach. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2048–2055, 2013. https://doi.org/10.1109/ICCV.2013.256.
  • [30] Rios-Cabrera R., Tuytelaars T.: Boosting masked dominant orientation templates for efficient object detection, Computer Vision and Image Understanding, vol. 120, pp. 103–116, 2014. https://doi.org/10.1016/j.cviu.2013.12.008.
  • [31] Steger C.: Similarity Measures for Occlusion, Clutter, and Illumination Invariant Object Recognition. In: Radig B., Florczyk S. (eds.), Pattern Recognition. DAGM 2001. Lecture Notes in Computer Science, vol. 2191, pp. 148–154, Springer, Berlin, Heidelberg, 2001. https://doi.org/10.1007/3-540-45404-7_20.
  • [32] Tejani A., Tang D., Kouskouridas R., Kim T.K.: Latent-Class Hough Forests for 3D Object Detection and Pose Estimation. In: Fleet D., Pajdla T., Schiele B., Tuytelaars T. (eds.), Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol. 8694, pp. 462–477, Springer, Cham, 2014. https://doi.org/10.1007/978-3-319-10599-4_30.
  • [33] Tombari F., Franchi A., Di Stefano L.: BOLD features to detect texture-less objects. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1265–1272. IEEE, 2013. https://doi.org/10.1109/ICCV.2013.160.
  • [34] Tsarouchi P., Matthaiakis S.A., Michalos G., Makris S., Chryssolouris G.: A method for detection of randomly placed objects for robotic handling, CIRP Journal of Manufacturing Science and Technology, vol. 14, pp. 20–27, 2016. https://doi.org/10.1016/j.cirpj.2016.04.005.
  • [35] Tsarouchi P., Michalos G., Makris S., Chryssolouris G.: Vision system for robotic handling of randomly placed objects, Procedia CIRP, vol. 9, pp. 61–66, 2013. https://doi.org/10.1016/j.procir.2013.06.169.
  • [36] Ulrich M., Wiedemann C., Steger C.: CAD-based recognition of 3D objects in monocular images. In: 2009 IEEE International Conference on Robotics and Automation, pp. 1191–1198, 2009. https://doi.org/10.1109/robot.2009.5152511.
  • [37] Ulrich M., Wiedemann C., Steger C.: Combining scale-space and similarity-based aspect graphs for fast 3D object recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34(10), pp. 1902–1914, 2012. https://doi.org/10.1109/TPAMI.2011.266.
  • [38] Wei H., Yang C., Yu Q.: Contour segment grouping for object detection, Journal of Visual Communication and Image Representation, vol. 48, pp. 292–309, 2017. https://doi.org/10.1016/j.jvcir.2017.07.003.
  • [39] Wei H., Yang C., Yu Q.: Efficient graph-based search for object detection, Information Sciences, vol. 385–386, pp. 395–414, 2017. https://doi.org/10.1016/j.ins.2016.12.039.
  • [40] Wu J., Xue T., Lim J.J., Tian Y., Tenenbaum J.B., Torralba A., Freeman W.T.: Single Image 3D Interpreter Network. In: Leibe B., Matas J., Sebe N., Welling M. (eds.), Computer Vision – ECCV 2016. ECCV 2016. Lecture Notes in Computer Science, vol. 9910, pp. 365–382, Springer, Cham, 2016. https://doi.org/10.1007/978-3-319-46466-4_22.
  • [41] Xu X., Tian L., Feng J., Zhou J.: OSRI: A rotationally invariant binary descriptor, IEEE Transactions on Image Processing, vol. 23(7), pp. 2983–2995, 2014. https://doi.org/10.1109/TIP.2014.2324824.
  • [42] Zhao Z., Peng G., Wang H., Fang H.S., Li C., Lu C.: Estimating 6D Pose From Localizing Designated Surface Keypoints, 2018. http://arxiv.org/abs/1812.01387.
  • [43] Zhu M., Derpanis K.G., Yang Y., Brahmbhatt S., Zhang M., Phillips C., Lecce M., Daniilidis K.: Single image 3D object detection and pose estimation for grasping. In: Proceedings – IEEE International Conference on Robotics and Automation, pp. 3936–3943, IEEE, 2014. https://doi.org/10.1109/ICRA.2014.6907430.
Document type
YADDA identifier
bwmeta1.element.baztech-02b24250-3d7e-48bf-b48c-144c54fb793a