MOT-DETR : 3D single shot detection and tracking with transformers to build 3D representations for agro-food robots
In the current demand for automation in the agro-food industry, accurately detecting and localizing relevant objects in 3D is essential for successful robotic operations. However, this is a challenge due to the presence of occlusions. Multi-view perception approaches allow robots to overcome occlusions, but a tracking component is needed to associate the objects detected by the robot over multiple viewpoints. Most multi-object tracking (MOT) algorithms are designed for high frame rate sequences and struggle with the occlusions generated by robots’ motions and 3D environments. In this paper, we introduce MOT-DETR, a novel approach to detect and track objects in 3D over time using a combination of convolutional networks and transformers. Our method processes 2D and 3D data, and employs a transformer architecture to perform data fusion. We show that MOT-DETR outperforms state-of-the-art multi-object tracking methods. Furthermore, we prove that MOT-DETR can leverage 3D data to deal with long-term occlusions and large frame-to-frame distances better than state-of-the-art methods. Finally, we show how our method is resilient to camera pose noise that can affect the accuracy of point clouds. The implementation of MOT-DETR can be found here: https://github.com/drapado/mot-detr.
Main Authors: | Rapado-Rincon, David; Nap, Henk; Smolenova, Katarina; van Henten, Eldert J.; Kootstra, Gert |
---|---|
Format: | Article/Letter to editor |
Language: | English |
Subjects: | Deep learning, Multi-object tracking, Robotics, Transformers |
Online Access: | https://research.wur.nl/en/publications/mot-detr-3d-single-shot-detection-and-tracking-with-transformers- |
id |
dig-wur-nl-wurpubs-633318 |
---|---|
record_format |
koha |
spelling |
dig-wur-nl-wurpubs-633318; 2024-10-30; Rapado-Rincon, David; Nap, Henk; Smolenova, Katarina; van Henten, Eldert J.; Kootstra, Gert; Article/Letter to editor; Computers and Electronics in Agriculture 225 (2024); ISSN: 0168-1699; MOT-DETR : 3D single shot detection and tracking with transformers to build 3D representations for agro-food robots; 2024; en; application/pdf; https://research.wur.nl/en/publications/mot-detr-3d-single-shot-detection-and-tracking-with-transformers-; DOI: 10.1016/j.compag.2024.109275; https://edepot.wur.nl/671447; Deep learning; Multi-object tracking; Robotics; Transformers; https://creativecommons.org/licenses/by/4.0/; Wageningen University & Research |
institution |
WUR NL |
collection |
DSpace |
country |
Netherlands |
countrycode |
NL |
component |
Bibliographic |
access |
Online |
databasecode |
dig-wur-nl |
tag |
biblioteca |
region |
Western Europe |
libraryname |
WUR Library Netherlands |
language |
English |
topic |
Deep learning, Multi-object tracking, Robotics, Transformers |
format |
Article/Letter to editor |
author |
Rapado-Rincon, David; Nap, Henk; Smolenova, Katarina; van Henten, Eldert J.; Kootstra, Gert |
title |
MOT-DETR : 3D single shot detection and tracking with transformers to build 3D representations for agro-food robots |
url |
https://research.wur.nl/en/publications/mot-detr-3d-single-shot-detection-and-tracking-with-transformers- |