SG3D contains 3D scenes curated from diverse existing datasets of real-world environments. Harnessing the power of 3D scene graphs and GPT-4, we introduce an automated pipeline for task generation. After generation, we manually verify the test set to ensure data quality.
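As a rough sketch of the generation idea, the snippet below serializes a toy scene graph into a GPT-4 prompt; the dictionary format, prompt wording, and helper names are illustrative assumptions, not the exact pipeline.

# Minimal sketch of scene-graph-to-prompt task generation (illustrative only;
# the scene-graph format and prompt wording are assumptions).

def serialize_scene_graph(scene_graph: dict) -> str:
    """Flatten a scene graph into a textual description for prompting."""
    lines = []
    for obj_id, obj in scene_graph["objects"].items():
        relations = ", ".join(f"{rel} {tgt}" for rel, tgt in obj.get("relations", []))
        suffix = f" ({relations})" if relations else ""
        lines.append(f"{obj_id}: {obj['label']}{suffix}")
    return "\n".join(lines)

def build_task_generation_prompt(scene_graph: dict) -> str:
    """Compose a prompt asking GPT-4 for a multi-step, object-grounded task."""
    return (
        "You are given a 3D scene described by its objects and relations:\n"
        f"{serialize_scene_graph(scene_graph)}\n\n"
        "Propose one daily activity as a sequence of steps. Each step must "
        "refer to exactly one object by its id."
    )

toy_graph = {
    "objects": {
        "obj_3": {"label": "coffee machine", "relations": [("on", "obj_9")]},
        "obj_9": {"label": "kitchen counter"},
    }
}
print(build_task_generation_prompt(toy_graph))
# The prompt is then sent to GPT-4; test-set outputs are manually verified.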
Here we present a few examples from the SG3D dataset via a data explorer. Each task example consists of a sequence of steps, where each step requires grounding a target object in the scene.
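To make the structure concrete, one task could be represented as the record sketched below; the field names and values are illustrative, not the released SG3D schema.

# Illustrative structure of one sequential-grounding task (hypothetical fields).
example_task = {
    "scene_id": "scene_0001",           # the 3D scene the task takes place in
    "task": "Make a cup of coffee.",    # natural-language task description
    "steps": [
        # each step pairs an instruction with the id of its target object
        {"instruction": "Walk to the coffee machine on the counter.", "target_id": 12},
        {"instruction": "Grab a mug from the shelf above it.", "target_id": 7},
        {"instruction": "Place the mug under the dispenser.", "target_id": 12},
    ],
}
# Grounding a step means predicting its target object in the scene, given the
# task, the current step, and the preceding steps.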
To use the data explorer, first select a scene from the selection bar. The tasks and their corresponding steps will be displayed in the right column. Click on a step to visualize its target object with a red bounding box in the scene. All available objects can be identified via the segmentation visualization. Best viewed on a monitor.
Controls: Click + Drag = Rotate | Ctrl + Drag = Translate | Scroll Up/Down = Zoom In/Out
For task-oriented sequential navigation, each task defines a navigation episode in which the agent must navigate to the target objects in sequence. The following videos show navigation episodes from the SG3D-Nav dataset.
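As a rough sketch, such an episode could be run as follows, assuming a hypothetical agent/environment interface; the success radius and action budget are placeholder values, not the benchmark's exact settings.

import math

SUCCESS_RADIUS = 1.0   # metres; placeholder threshold
MAX_ACTIONS = 500      # placeholder per-step action budget

def run_episode(agent, env, target_positions):
    """Drive the agent to each target position in order; return per-step successes."""
    successes = []
    for target in target_positions:
        reached = False
        for _ in range(MAX_ACTIONS):
            action = agent.act(env.observation(), target_hint=target)
            env.step(action)
            if math.dist(env.agent_position(), target) <= SUCCESS_RADIUS:
                reached = True
                break
        successes.append(reached)
        if not reached:
            break  # later steps depend on reaching earlier targets
    return successes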
We propose SG-LLM for the sequential grounding task. Benefiting from the stepwise grounding paradigm and a sequential adapter mechanism, SG-LLM outperforms the baselines by a large margin.
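The stepwise grounding paradigm can be sketched as follows: steps are grounded one at a time, and each prediction is fed back as context for the next step. The model interface here is a placeholder, not the actual SG-LLM implementation.

# Schematic sketch of stepwise grounding; `model.ground_step` is a placeholder.
def ground_task_stepwise(model, scene_features, task_text, step_texts):
    """Ground each step in order, carrying earlier steps and their predicted
    targets as sequential context for the current step."""
    history = []       # (step_text, predicted_target_id) pairs
    predictions = []
    for step_text in step_texts:
        target_id = model.ground_step(
            scene=scene_features,
            task=task_text,
            history=history,   # sequential context from earlier steps
            step=step_text,
        )
        predictions.append(target_id)
        history.append((step_text, target_id))
    return predictions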
We evaluate the following grounding approaches on the SG3D benchmark: 3D-VG baselines, LLM-based methods, a 3D LLM baseline, a large vision-language model baseline, and our proposed sequential grounding model, SG-LLM.
We evaluate two approaches on the SG3D-Nav benchmark: a modular agent and an end-to-end policy.
The significant performance degradation of both grounding and navigation methods when sequential context is removed indicates that this context is crucial for both tasks.
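For intuition, the sketch below builds a step query with or without the preceding steps as context; the prompt format is an illustrative assumption, not the exact one used in our experiments.

# Illustrative ablation: query a step with or without sequential context.
def build_step_query(task, steps, idx, use_sequential_context=True):
    parts = [f"Task: {task}"]
    if use_sequential_context:
        for j in range(idx):
            parts.append(f"Step {j + 1} (done): {steps[j]}")
    parts.append(f"Step {idx + 1} (current): {steps[idx]}")
    parts.append("Which object in the scene does the current step refer to?")
    return "\n".join(parts)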
@article{sg3d,
title={Task-oriented Sequential Grounding and Navigation in 3D Scenes},
author={Zhang, Zhuofan and Zhu, Ziyu and Li, Junhao and Li, Pengxiang and Wang, Tianxu and Liu, Tengyu and Ma, Xiaojian and Chen, Yixin and Jia, Baoxiong and Huang, Siyuan and Li, Qing},
journal={arXiv preprint arXiv:2408.04034},
year={2024}
}