TartanAir V2

CMU AirLab · 2024 · datasets.bot

One-liner. Next-generation photorealistic synthetic SLAM and navigation dataset from CMU AirLab, spanning 65 Unreal Engine environments with multimodal sensor data (RGB, depth, segmentation, optical flow, LiDAR, IMU, event cameras) and customizable pinhole, fisheye, and equirectangular camera models.

Setup

Schema

Per-environment trajectories recorded with 12 synchronized cameras (two 6-camera stereo rings covering 360 degrees) at 640x640 resolution, 10 Hz, and a 0.25 m stereo baseline.

Per timestep: stereo RGB (8-bit PNG), float32 depth (stored as a 4-channel 8-bit PNG), semantic segmentation (1447 classes, mapped via seg_label_map.json), and camera poses.

Derived modalities: optical flow (npz with covisibility/FoV masks), LiDAR point clouds (VLP-16/VLP-32C), IMU with noise, event-camera streams generated via ESIM from 1000 Hz MP4 video, and occupancy maps. Camera sampling is customizable across pinhole, fisheye, and equirectangular models.
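The depth encoding mentioned in the schema (float32 values stored as a 4-channel 8-bit PNG) can be illustrated with a minimal sketch. It assumes that the four channel bytes of each pixel, in the order the array is laid out in memory, are the little-endian bytes of one float32. The function names decode_depth_rgba and encode_depth_rgba are hypothetical helpers for illustration and not part of any official TartanAir tooling.

```python
import numpy as np


def decode_depth_rgba(depth_rgba: np.ndarray) -> np.ndarray:
    """Reinterpret an (H, W, 4) uint8 image as an (H, W) float32 depth map.

    Assumption: the 4 bytes of each pixel are the little-endian bytes
    of a single float32 value.
    """
    if depth_rgba.dtype != np.uint8 or depth_rgba.ndim != 3 or depth_rgba.shape[-1] != 4:
        raise ValueError("expected an (H, W, 4) uint8 array")
    # view() needs contiguous memory to regroup 4 bytes into one float.
    packed = np.ascontiguousarray(depth_rgba)
    # (H, W, 4) uint8 -> (H, W, 1) float32 -> (H, W)
    return packed.view("<f4")[..., 0]


def encode_depth_rgba(depth: np.ndarray) -> np.ndarray:
    """Inverse of decode_depth_rgba: pack (H, W) float32 into (H, W, 4) uint8."""
    depth32 = np.ascontiguousarray(depth, dtype="<f4")
    height, width = depth32.shape
    # (H, W) float32 -> (H, W * 4) uint8 -> (H, W, 4)
    return depth32.view(np.uint8).reshape(height, width, 4)


if __name__ == "__main__":
    # Round trip on a tiny synthetic depth map (values in meters are illustrative).
    depth = np.array([[0.5, 12.25], [100.0, 0.0]], dtype=np.float32)
    rgba = encode_depth_rgba(depth)
    restored = decode_depth_rgba(rgba)
    print(rgba.shape, rgba.dtype, bool(np.array_equal(restored, depth)))
```

When reading real files, the in-memory channel order depends on the loader (OpenCV returns 4-channel PNGs as BGRA, Pillow as RGBA), so the byte order assumed here should be checked against the dataset's own tooling before relying on the decoded values.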

Links