TartanDrive
CMU AirLab · 2022 · datasets.bot
One-liner. Large-scale real-world off-road driving dataset (~200k interactions, ~5 hours) collected on a modified Yamaha Viking ATV with seven sensing modalities for learning off-road vehicle dynamics models.
Setup
- Datasets / benchmarks: TartanDrive is a real-world off-road driving dataset from CMU's AirLab, built for learning off-road vehicle dynamics models. It contains roughly 200,000 driving interactions (about five hours of data) collected on a modified Yamaha Viking ATV driven across diverse terrain. The authors describe it as the largest real-world multimodal off-road driving dataset in both number of interactions and number of sensing modalities, and report seven unique sensing modalities. Recorded streams: RGB camera imagery from a MultiSense S21 stereo camera (1024x512 @ 20Hz); bird's-eye-view RGB maps (501x501 @ 20Hz) and heightmaps (501x501 @ 20Hz), both derived from the stereo RGB-D mapping pipeline rather than LiDAR; IMU (6D @ 200Hz); GPS-derived odometry/state (7D @ 50Hz); and vehicle proprioception, namely wheel RPM (4D @ 50Hz), shock/suspension position (4D @ 50Hz), and pedal position (2D @ 50Hz). Control actions are logged as 2D commands @ 100Hz. Data are distributed as compressed rosbags (~100GB per compressed folder) that can be converted to PyTorch tensors or NumPy arrays with the scripts provided at github.com/castacks/tartan_drive. The dataset was introduced at ICRA 2022, where the authors benchmark state-of-the-art model-based RL methods on off-road dynamics prediction and show that additional modalities improve prediction, especially on challenging terrain. A follow-up, TartanDrive 2.0, adds LiDAR (two Velodyne VLP-32 units and a Livox Mid-70) and seven hours of data. License: CC-BY-4.0. Download: https://github.com/castacks/tartan_drive.
- Hardware / simulator: Embodiment: modified Yamaha Viking ATV. Environment: outdoor. Realness: physical.
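As a rough sanity check on the scale figures quoted above, the implied interaction rate can be worked out from the two approximate numbers on this card (~200k interactions, ~5 hours). This is a back-of-envelope sketch only; both inputs are rounded, so the result is indicative and is not a statement of the dataset's actual sampling step.

```python
# Back-of-envelope check using only the approximate figures quoted on this card.
interactions = 200_000   # "~200k interactions" (rounded)
hours = 5                # "~5 hours" (rounded)

seconds = hours * 3600
rate_hz = interactions / seconds   # implied interactions per second
step_s = 1 / rate_hz               # implied spacing between interactions

print(f"{rate_hz:.1f} interactions/s, ~{step_s:.2f} s per interaction")
```

With these rounded inputs the implied rate is on the order of ten interactions per second, well below the native rate of most of the recorded streams.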
Schema
Per-timestep multimodal record: state/odometry (7D, 50Hz), action/command (2D, 100Hz), RGB image (1024x512, 20Hz), RGB map (501x501, 20Hz), heightmap (501x501, 20Hz), IMU (6D, 200Hz), shock position (4D, 50Hz), wheel RPM (4D, 50Hz), pedals (2D, 50Hz). The RGB maps and heightmaps are derived from MultiSense S21 stereo RGB-D. Stored as rosbags, convertible to torch/numpy.
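Because the streams above are logged at different rates, building one record per timestep needs some form of time alignment. The sketch below is one possible approach, not the method used by the official conversion scripts: it writes the schema down as a Python dictionary and aligns a faster stream to camera timestamps by nearest-neighbor lookup. The dictionary keys, the helper `align_nearest`, and the synthetic timestamps are all assumptions made for illustration; the sizes and rates are copied from the schema line.

```python
import numpy as np

# (size per sample as quoted on the card, rate in Hz).
# Keys are hypothetical names, not the dataset's rosbag topic names.
# Image/map sizes are listed as quoted; channel counts are not given here.
SCHEMA = {
    "state": ((7,), 50),
    "action": ((2,), 100),
    "rgb_image": ((1024, 512), 20),
    "rgb_map": ((501, 501), 20),
    "heightmap": ((501, 501), 20),
    "imu": ((6,), 200),
    "shock_pos": ((4,), 50),
    "wheel_rpm": ((4,), 50),
    "pedals": ((2,), 50),
}

# How many samples of each stream arrive per 20 Hz camera frame.
REF_HZ = 20
per_frame = {name: rate / REF_HZ for name, (_, rate) in SCHEMA.items()}


def align_nearest(src_t, src_x, ref_t):
    """For each reference time, return the source sample closest in time.

    Assumes src_t is sorted ascending and has at least two entries.
    """
    src_t = np.asarray(src_t)
    ref_t = np.asarray(ref_t)
    idx = np.searchsorted(src_t, ref_t)        # first index with src_t >= ref_t
    idx = np.clip(idx, 1, len(src_t) - 1)      # keep idx-1 and idx both valid
    left, right = src_t[idx - 1], src_t[idx]
    idx = np.where(ref_t - left <= right - ref_t, idx - 1, idx)
    return np.asarray(src_x)[idx]


# Synthetic one-second example: 200 Hz IMU aligned to 20 Hz camera frames.
imu_t = np.arange(200) / 200.0
imu_x = np.random.default_rng(0).normal(size=(200, 6))
cam_t = np.arange(20) / 20.0
imu_at_cam = align_nearest(imu_t, imu_x, cam_t)
print(imu_at_cam.shape)  # (20, 6)
```

Nearest-neighbor lookup is the simplest choice; interpolation or averaging over each window may be preferable for high-rate signals such as the IMU, depending on the model being trained.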
Links