Article title
Authors
Identifiers
Title variants
Publication languages
Abstracts
This paper describes the results of experiments on the detection and recognition of 3D objects in RGB-D images provided by the Microsoft Kinect sensor. While the study focuses on single-image use, sequences of frames are also considered and evaluated. Observed objects are categorized based on both geometrical and visual cues, but the emphasis is placed on the performance of the point cloud matching method. To this end, a rarely used approach is applied: independent matching of VFH and CRH descriptors, followed by the ICP and HV algorithms from the Point Cloud Library. Successfully recognized objects are then subjected to a classical 2D analysis based on color histogram comparison, performed exclusively against objects in the same geometrical category. The proposed two-stage approach makes it possible to distinguish objects of similar geometry but different visual appearance, such as soda cans of various brands. Because the geometry and color identification phases are separated, the system is still able to categorize objects based on their geometry even if there is no color match. The recognized objects are then localized in three-dimensional space and autonomously grasped by a manipulator. To evaluate this approach, a dedicated validation set was created, and a selected scene from the Washington RGB-D Object Dataset was additionally used.
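For readers who want to experiment with the geometric stage described above, the sketch below shows, in heavily simplified form, how such a pipeline can be assembled from standard Point Cloud Library components: a global VFH descriptor is computed for a pre-segmented scene cluster, the closest model view is chosen by brute-force histogram comparison (a FLANN index would normally be used for a larger model database), the candidate pose is refined with ICP, and the aligned hypothesis is accepted or rejected by PCL's global hypothesis verification. This is a minimal sketch, not the authors' implementation: the segmentation step, the CRH-based roll alignment, and the colour-histogram stage are omitted, and the PCD file names, search radii, and thresholds are assumed placeholders.

```cpp
// Hypothetical sketch of a VFH -> ICP -> HV chain built from standard PCL
// components (pcl::VFHEstimation, pcl::IterativeClosestPoint,
// pcl::GlobalHypothesesVerification). File names and thresholds are assumed.
#include <pcl/io/pcd_io.h>
#include <pcl/point_types.h>
#include <pcl/features/normal_3d.h>
#include <pcl/features/vfh.h>
#include <pcl/registration/icp.h>
#include <pcl/recognition/hv/hv_go.h>
#include <pcl/search/kdtree.h>

#include <cstddef>
#include <limits>
#include <vector>

using CloudT = pcl::PointCloud<pcl::PointXYZ>;

// Compute one global VFH descriptor for a segmented cluster.
pcl::VFHSignature308 computeVFH(const CloudT::Ptr& cluster)
{
  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
  pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);

  pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
  ne.setInputCloud(cluster);
  ne.setSearchMethod(tree);
  ne.setRadiusSearch(0.02);  // assumed radius for Kinect-scale tabletop objects
  ne.compute(*normals);

  pcl::PointCloud<pcl::VFHSignature308>::Ptr vfhs(new pcl::PointCloud<pcl::VFHSignature308>);
  pcl::VFHEstimation<pcl::PointXYZ, pcl::Normal, pcl::VFHSignature308> vfh;
  vfh.setInputCloud(cluster);
  vfh.setInputNormals(normals);
  vfh.setSearchMethod(tree);
  vfh.compute(*vfhs);        // VFH yields a single 308-bin histogram per cloud
  return vfhs->points[0];
}

// Squared L2 distance between two VFH histograms (brute force; a FLANN index
// is the usual choice for a large model database).
float vfhDistance(const pcl::VFHSignature308& a, const pcl::VFHSignature308& b)
{
  float d = 0.0f;
  for (int i = 0; i < 308; ++i) {
    const float diff = a.histogram[i] - b.histogram[i];
    d += diff * diff;
  }
  return d;
}

int main()
{
  // Hypothetical inputs: one pre-segmented scene cluster and two model views.
  CloudT::Ptr scene_cluster(new CloudT), model_a(new CloudT), model_b(new CloudT);
  pcl::io::loadPCDFile("scene_cluster.pcd", *scene_cluster);
  pcl::io::loadPCDFile("model_view_a.pcd", *model_a);
  pcl::io::loadPCDFile("model_view_b.pcd", *model_b);
  std::vector<CloudT::Ptr> models{model_a, model_b};

  // 1) Descriptor matching: pick the model view whose VFH is closest.
  const pcl::VFHSignature308 scene_vfh = computeVFH(scene_cluster);
  std::size_t best = 0;
  float best_d = std::numeric_limits<float>::max();
  for (std::size_t i = 0; i < models.size(); ++i) {
    const float d = vfhDistance(scene_vfh, computeVFH(models[i]));
    if (d < best_d) { best_d = d; best = i; }
  }

  // 2) Pose refinement with ICP (in the full pipeline a CRH step would first
  //    resolve the camera roll and provide the initial alignment).
  pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
  icp.setInputSource(models[best]);
  icp.setInputTarget(scene_cluster);
  CloudT::Ptr aligned(new CloudT);
  icp.align(*aligned);

  // 3) Global hypothesis verification: accept or reject the aligned model.
  pcl::GlobalHypothesesVerification<pcl::PointXYZ, pcl::PointXYZ> hv;
  hv.setSceneCloud(scene_cluster);
  std::vector<CloudT::ConstPtr> hypotheses{aligned};
  hv.addModels(hypotheses, true);    // true: reason about (self-)occlusions
  hv.setInlierThreshold(0.01f);      // assumed 1 cm inlier threshold
  hv.verify();
  std::vector<bool> mask;
  hv.getMask(mask);

  const bool accepted = icp.hasConverged() && !mask.empty() && mask[0];
  return accepted ? 0 : 1;
}
```

In the full pipeline described in the abstract, the CRH descriptor would supply the camera-roll component of the initial pose before ICP, and only hypotheses surviving HV would proceed to the colour-histogram comparison within the same geometrical category.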
Keywords
Year
Volume
Pages
220-237
Physical description
Bibliography: 48 items, figures, tables
Creators
author
- Instytut Sterowania i Elektroniki Przemysłowej, Politechnika Warszawska, ul. Koszykowa 75, 00-662 Warszawa
author
- Instytut Sterowania i Elektroniki Przemysłowej, Politechnika Warszawska, ul. Koszykowa 75, 00-662 Warszawa
Bibliography
- 1. Abadi M., et al., TensorFlow: Large-scale machine learning on heterogeneous systems, software available from tensorflow.org, 2015.
- 2. Aldoma A. et al., CAD-model recognition and 6DOF pose estimation using 3D cues, IEEE International Conference on Computer Vision Workshops, 6-13 November 2011.
- 3. Aldoma A., et al., A global hypotheses verification method for 3D object recognition, Computer Vision-ECCV, 511-524, 2012.
- 4. Aldoma A., et al., Multimodal cue integration through Hypotheses Verification for RGB-D object recognition and 6DoF pose estimation, IEEE International Conference on Robotics and Automation (ICRA), 2104-2111, 2013.
- 5. Aldoma A., et al., OUR-CVFH - Oriented, Unique and Repeatable Clustered Viewpoint Feature Histogram for Object Recognition and 6DOF Pose Estimation, Pattern Recognition. DAGM/OAGM 2012. Lecture Notes in Computer Science, vol. 7476, 113-122, 2012.
- 6. Aldoma A., et al., Three-Dimensional Object Recognition and 6 DoF Pose Estimation, IEEE Robotics & Automation Magazine, 80-91, September 2012.
- 7. Aldoma A., Faulhammer T., Vincze M., Automation of “Ground Truth” Annotation for Multi-View RGB-D Object Instance Recognition Datasets, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 5016-5023, 2014.
- 8. Aldoma A., Rusu R.B., Vincze M., 0-Order Affordances through CAD-Model Recognition and 6DOF Pose Estimation, Active Semantic Perception and Object Search in the Real World Workshop, IROS, 2011.
- 9. Alexandre L.A., 3D object recognition using convolutional neural networks with transfer learning between input channels, 13th International Conference on Intelligent Autonomous Systems, 2014.
- 10. Bayramoglu N., Alatan A., Shape index SIFT: Range image recognition using local features, 20th International Conference on Pattern Recognition, 352-355, 2010.
- 11. Bergstra J., et al., Theano: Deep learning on GPUs with Python, NIPS Big Learning Workshop, Granada, Spain, 2011.
- 12. Besl P., McKay N., A Method for Registration of 3-D Shapes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, 1992.
- 13. Bo L., Ren X., Fox D., Unsupervised feature learning for RGB-D based object recognition, Experimental Robotics, 387-402, Springer, 2013.
- 14. Chang A.X., et al., ShapeNet: An information-rich 3D model repository, arXiv preprint arXiv:1512.03012, 2015.
- 15. Collobert R., Kavukcuoglu K., Farabet C., Torch7: A MATLAB-like environment for machine learning, BigLearn, NIPS Workshop, no. EPFL-CONF-192376, 2011.
- 16. Eigen D., Fergus R., Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture, International Conference on Computer Vision (ICCV), 2650-2658, 2015.
- 17. Gupta S., et al., Aligning 3D models to RGB-D images of cluttered scenes, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 4731-4740, 2015.
- 18. He K., et al., Spatial pyramid pooling in deep convolutional networks for visual recognition, 13th European Conference on Computer Vision, 346-361, 2014.
- 19. Hernandez C., et al., Team Delft's Robot Winner of the Amazon Picking Challenge, arXiv preprint arXiv:1610.05514 [cs.RO], 2016.
- 20. Hernandez-Vela A., et al., BoVDW: Bag-of-Visual-and-Depth-Words for gesture recognition, International Conference on Pattern Recognition (ICPR), 2012.
- 21. Hinterstoisser S., et al., Dominant orientation templates for real-time detection of texture-less objects, International Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
- 22. Jia Y., et al., Caffe: Convolutional architecture for fast feature embedding, arXiv preprint arXiv:1408.5093, 2014.
- 23. Jonschkowski R. et al., Probabilistic multi-class segmentation for the Amazon Picking Challenge, International Conference on Intelligent Robots and Systems (IROS), 2016.
- 24. Kolomyjec K., Czajewski W., Identification and localization of objects in RGB-D images for the purpose of manipulation (in Polish), Prace Naukowe Politechniki Warszawskiej. Elektronika, 195, 2, 377-386, 2016.
- 25. Kurban R., Skuka F., Bozpolat H., Plane Segmentation of Kinect Point Clouds using RANSAC, 7th International Conference on Information Technology, ICIT, Amman, Jordan, 2015, 545-551.
- 26. Lai K., et al., A Large-scale Hierarchical Multi-view RGB-D Object Dataset, IEEE International Conference on Robotics and Automation (ICRA), May 2011.
- 27. Lai K., et al., Sparse distance learning for object recognition combining RGB and depth information, IEEE International Conference on Robotics and Automation (ICRA), May 2011.
- 28. Laptev D., et al., TI-pooling: Transformation-invariant pooling for feature learning in convolutional neural networks, International Conference on Computer Vision and Pattern Recognition, 289-297, 2016.
- 29. Łępicka M., Kornuta T., Stefańczyk M., Utilization of colour in ICP-based point cloud registration, 9th International Conference on Computer Recognition Systems (CORES 2015), 821-830, 2016.
- 30. Martinez L., Loncomilla P., Ruiz-del-Solar J., Object recognition for manipulation tasks in real domestic settings: A comparative study, 18th RoboCup International Symposium, Lecture Notes in Computer Science. Springer, Joao Pessoa, Brazil, 2014.
- 31. Maturana D., Scherer S., VoxNet: A 3D convolutional neural network for real-time object recognition, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.
- 32. Muja M., Lowe D.G., Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration, International Conference on Computer Vision Theory and Applications (VISAPP), 2009.
- 33. Nafouki K., Object recognition and pose estimation from an RGB-D image, Technical report, Technical University of Munich, 2016.
- 34. Narayanan V., Likhachev M., PERCH: Perception via Search for Multi-Object Recognition and Localization, IEEE International Conference on Robotics and Automation (ICRA), 2016.
- 35. Papazov C., Burschka D., An Efficient RANSAC for 3D Object Recognition in Noisy and Occluded Scenes, Asian Conference on Computer Vision, Part I, 135-148, 2010.
- 36. Porzi L., et al., Learning Depth-Aware Deep Representations for Robotic Perception, IEEE Robotics and Automation Letters, vol. 2, no. 2, April 2017.
- 37. Prankl J., et al., RGB-D Object Modeling for Object Recognition and Tracking, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.
- 38. Rusu R.B., Cousins S., 3D is here: Point Cloud Library (PCL), IEEE International Conference on Robotics and Automation (ICRA), Shanghai, 9-13 May 2011.
- 39. Rusu R.B., et al., Fast 3D Recognition and Pose Using the Viewpoint Feature Histogram, 23rd IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan, October 2010, 2155-2162.
- 40. Rusu R.B., Blodow N., Beetz M., Fast Point Feature Histograms (FPFH) for 3D registration, IEEE International Conference on Robotics and Automation (ICRA), Kobe, Japan, May 12-17, 2009.
- 41. Socher R., et al., Convolutional-recursive deep learning for 3D object classification, Advances in Neural Information Processing Systems, 665-673, 2012.
- 42. Song S., Xiao J., Deep sliding shapes for amodal 3D object detection in RGB-D images, 29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- 43. Tang J., et al., A textured object recognition pipeline for color and depth image data, IEEE International Conference on Robotics and Automation, 3467-3474, 2012.
- 44. Tombari F., Salti S., Di Stefano L., Unique signatures of Histograms for local surface description, European Conference on Computer Vision (ECCV), 2010.
- 45. Wang A., et al., MMSS: Multi-modal sharable and specific feature learning for RGB-D object recognition, International Conference on Computer Vision, 1125-1133, 2015.
- 46. Wu Z., et al., 3D ShapeNets: A deep representation for volumetric shapes, IEEE Conference on Computer Vision and Pattern Recognition, 1912-1920, 2015.
- 47. Xie Z., et al., Multimodal blending for high-accuracy instance recognition, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2214-2221, 2013.
- 48. Zhang Z., Iterative point matching for registration of free-form curves and surfaces, International Journal of Computer Vision, vol. 13, no. 2, 119-152, 1994.
Notes
Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for activities disseminating science (2017 tasks).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-480f98ba-e81d-4e89-819e-feb2090c8b38