TartanAviation
CMU AirLab · 2024 · datasets.bot
One-liner. Multimodal terminal-airspace dataset from CMU AirLab: ground/sky RGB imagery of aircraft, ATC speech audio, and ADS-B aircraft trajectories collected at general-aviation airports near Pittsburgh.
Setup
- Datasets / benchmarks: TartanAviation is an open-source multimodal dataset of terminal-area airspace operations at general-aviation airports. Sensor setups installed within the airport boundaries record image, speech, and ADS-B trajectory data concurrently, giving a holistic view of the airport environment. The dataset contains 3.1 million images across 550 sequences (4-camera array of Sony IMX264 RGB sensors, 2048x2448 resolution, 24 FPS, stored as MP4/AVI); 3,374 hours of Air Traffic Control (ATC) speech audio (WAV, 44.1 kHz), of which 477.6 hours exceed a -20 dB activity threshold; and 661 days of ADS-B aircraft trajectory data (CSV/TXT; ~63 million raw position reports with fields such as ID, timestamp, altitude MSL, speed, heading, latitude/longitude, wind components, range, and bearing). Data were collected at two airfields in the Greater Pittsburgh area, Allegheny County Airport (KAGC, towered) and Pittsburgh-Butler Regional Airport (KBTP, non-towered), over multiple months and seasons to capture diversity in aircraft operations, aircraft types, and weather: vision data from Dec 2021 to Feb 2023 at KAGC, and trajectory and speech data from Sept 2020 to Feb 2023 across both airports. Post-processed (synchronized, filtered, interpolated) versions are also provided. The cameras are static, ground-based installations imaging crewed general-aviation aircraft against the sky; no drones/UAVs are involved, and no depth, stereo, or fisheye sensing is used. Recording, post-processing, and download scripts are released on GitHub, and the dataset is openly hosted on Hugging Face, CMU AirLab servers, and Zenodo. License: CC-BY-4.0. Download: https://github.com/castacks/TartanAviation.
- Hardware / simulator: Embodiment: not listed. Environment: outdoor. Realness: physical.
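The headline figures above imply a few derived quantities that help with sizing storage and compute. The sketch below works them out in Python. It uses only the numbers quoted in this card; the variable names are illustrative, and because "3.1 million" and "~63 million" are rounded, the results are approximate. The per-sequence duration additionally assumes the image count is a count of individual frames from single camera streams, which the card does not state.

```python
# Headline figures as quoted in the dataset description (rounded in the source).
TOTAL_AUDIO_HOURS = 3374.0
ACTIVE_AUDIO_HOURS = 477.6      # hours above the -20 dB activity threshold
TOTAL_IMAGES = 3.1e6
NUM_SEQUENCES = 550
FPS = 24
ADSB_REPORTS = 63e6             # raw position reports
ADSB_DAYS = 661

# Share of the recorded audio that exceeds the activity threshold.
active_fraction = ACTIVE_AUDIO_HOURS / TOTAL_AUDIO_HOURS

# Average sequence length in frames, and in seconds at 24 FPS.
# ASSUMPTION: frames are counted per single camera stream.
images_per_sequence = TOTAL_IMAGES / NUM_SEQUENCES
seconds_per_sequence = images_per_sequence / FPS

# Average number of raw ADS-B position reports per collection day.
reports_per_day = ADSB_REPORTS / ADSB_DAYS

print(f"active audio fraction: {active_fraction:.1%}")
print(f"frames per sequence:   {images_per_sequence:.0f}")
print(f"seconds per sequence:  {seconds_per_sequence:.0f}")
print(f"ADS-B reports per day: {reports_per_day:.0f}")
```

Under these assumptions, roughly 14% of the audio is above the activity threshold, an average sequence lasts a little under four minutes, and the receivers logged on the order of 95,000 position reports per day.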
Schema
- Images: MP4/AVI video (2048x2448, 24 FPS, 4-camera Sony IMX264 RGB array), 3.1M frames / 550 sequences.
- Audio: WAV, 44.1 kHz ATC speech, 3,374 hours (477.6 hours above -20 dB).
- Trajectories: ADS-B CSV/TXT with ID, timestamp, altitude (ft MSL), speed (kts), heading (deg), lat/long, wind components, range (km), bearing; ~63M position reports over 661 days.
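As an illustration of working with the trajectory fields listed above, the following Python sketch parses ADS-B-style CSV rows, converts altitude and speed to SI units, and recomputes range and bearing from a reference point. The column names (`id`, `timestamp`, `altitude_ft`, `speed_kts`, `heading_deg`, `lat`, `lon`), the sample rows, and the reference coordinates are all hypothetical and chosen for this example; the actual file headers and the reference point used by the dataset should be checked against the released files. The range uses the haversine formula on a spherical Earth, which may differ slightly from however the dataset's own range column was computed.

```python
import csv
import io
import math
from dataclasses import dataclass

FT_TO_M = 0.3048               # exact by definition
KTS_TO_MPS = 1852.0 / 3600.0   # one knot is 1852 m per hour
EARTH_RADIUS_KM = 6371.0       # mean radius, spherical approximation


@dataclass
class Report:
    aircraft_id: str
    timestamp: float
    altitude_m: float
    speed_mps: float
    heading_deg: float
    lat: float
    lon: float


def parse_reports(text):
    """Parse CSV text that uses this sketch's hypothetical header."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Report(
            aircraft_id=row["id"],
            timestamp=float(row["timestamp"]),
            altitude_m=float(row["altitude_ft"]) * FT_TO_M,
            speed_mps=float(row["speed_kts"]) * KTS_TO_MPS,
            heading_deg=float(row["heading_deg"]),
            lat=float(row["lat"]),
            lon=float(row["lon"]),
        )
        for row in reader
    ]


def range_bearing(ref_lat, ref_lon, lat, lon):
    """Great-circle range (km) and initial bearing (deg clockwise from north)."""
    p1, p2 = math.radians(ref_lat), math.radians(lat)
    dlat = p2 - p1
    dlon = math.radians(lon - ref_lon)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    rng_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    bearing_deg = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return rng_km, bearing_deg


# Synthetic rows for illustration only; not taken from the dataset.
SAMPLE = """id,timestamp,altitude_ft,speed_kts,heading_deg,lat,lon
N123AB,0.0,1000,100,90,40.36,-79.95
N123AB,1.0,1100,100,90,40.36,-79.94
"""

REF_LAT, REF_LON = 40.35, -79.93   # hypothetical reference point

reports = parse_reports(SAMPLE)
for r in reports:
    rng_km, brg = range_bearing(REF_LAT, REF_LON, r.lat, r.lon)
    print(f"{r.aircraft_id} t={r.timestamp:.0f}s alt={r.altitude_m:.1f} m "
          f"range={rng_km:.2f} km bearing={brg:.0f} deg")
```

Keeping the unit conversions in one parsing step means downstream code can work in metres and metres per second throughout, while the original feet and knots remain available in the source files.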
Links