MOT-DETR : 3D single shot detection and tracking with transformers to build 3D representations for agro-food robots
Given the current demand for automation in the agro-food industry, accurately detecting and localizing relevant objects in 3D is essential for successful robotic operations. However, this is a challenge due to the presence of occlusions. Multi-view perception approaches allow robots to overcome occlusions, but a tracking component is needed to associate the objects detected by the robot over multiple viewpoints. Most multi-object tracking (MOT) algorithms are designed for high frame rate sequences and struggle with the occlusions generated by robots’ motions and 3D environments. In this paper, we introduce MOT-DETR, a novel approach to detect and track objects in 3D over time using a combination of convolutional networks and transformers. Our method processes 2D and 3D data, and employs a transformer architecture to perform data fusion. We show that MOT-DETR outperforms state-of-the-art multi-object tracking methods. Furthermore, we prove that MOT-DETR can leverage 3D data to deal with long-term occlusions and large frame-to-frame distances better than state-of-the-art methods. Finally, we show how our method is resilient to camera pose noise that can affect the accuracy of point clouds. The implementation of MOT-DETR can be found here: https://github.com/drapado/mot-detr.
Main Authors: | Rapado-Rincon, David; Nap, Henk; Smolenova, Katarina; van Henten, Eldert J.; Kootstra, Gert |
---|---|
Format: | Article/Letter to editor |
Language: | English |
Subjects: | Deep learning, Multi-object tracking, Robotics, Transformers |
Online Access: | https://research.wur.nl/en/publications/mot-detr-3d-single-shot-detection-and-tracking-with-transformers- |
Similar Items
- Development and evaluation of automated localisation and reconstruction of all fruits on tomato plants in a greenhouse based on multi-view perception and 3D multi-object tracking
  by: Rapado-Rincón, David, et al.
- Robust node detection and tracking in fruit-vegetable crops using deep learning and multi-view imaging
  by: Boogaard, Frans P., et al.
- MinkSORT : A 3D deep feature extractor using sparse convolutions to improve 3D multi-object tracking in greenhouse tomato plants
  by: Rapado-Rincón, David, et al.
- ChickTrack - A Quantitative Tracking Tool for Measuring Chicken Activity
  by: Neethirajan, S.R.
- Automatic discard registration in cluttered environments using deep learning and object tracking: class imbalance, occlusion, and a comparison to human review
  by: van Essen, Rick, et al.