DreamDojo GR-1 Post-Training

NVIDIA · 2026 · datasets.bot

One-liner. The released GR-1 humanoid post-training data for DreamDojo: teleoperated trajectories plus evaluation sets. (The 44.7k-hour DreamDojo-HV human-video pretraining corpus is in-house and NOT released.)

Setup

Datasets / benchmarks: This is the publicly released dataset accompanying DreamDojo, NVIDIA's generalist robot world model (arXiv 2602.06949, ICML 2026). DreamDojo is pretrained on DreamDojo-HV, a 44.7k-hour in-house egocentric human-video corpus that is **not** publicly released. What was released (Feb 2026) is the GR-1 post-training data: teleoperated trajectories on the Fourier GR-1 humanoid plus evaluation sets, used to post-train and evaluate the world model. The data is hosted on Hugging Face as nvidia/PhysicalAI-Robotics-GR00T-Teleop-GR1 (LeRobot format, parquet + video). The 2B and 14B model checkpoints and the pre-training and post-training code are also released. License: CC-BY-NC-4.0. Download: https://huggingface.co/datasets/nvidia/PhysicalAI-Robotics-GR00T-Teleop-GR1.
Hardware / simulator: Embodiment: humanoid_other. Environment: home, lab, kitchen, tabletop. Realness: physical.
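
The Hugging Face repository named in the Setup entry can in principle be fetched with the `huggingface_hub` client. The following is a minimal sketch, not an official loading script: `dataset_url` and `download` are hypothetical helper names introduced here, and any file patterns passed to `allow_patterns` are the caller's assumption about the repository layout, which this page does not describe.

```python
from typing import List, Optional

# Repository id as given in the Setup entry above.
REPO_ID = "nvidia/PhysicalAI-Robotics-GR00T-Teleop-GR1"


def dataset_url(repo_id: str = REPO_ID) -> str:
    """Return the Hugging Face dataset page URL for a repo id."""
    return f"https://huggingface.co/datasets/{repo_id}"


def download(local_dir: str, allow_patterns: Optional[List[str]] = None) -> str:
    """Download a snapshot of the dataset repo and return the local path.

    allow_patterns is optional; the patterns are an assumption about the
    repo layout and should be checked against the actual file listing.
    """
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repo is a dataset, not a model
        local_dir=local_dir,
        allow_patterns=allow_patterns,
    )
```

A call such as `download("./gr1_teleop")` would pull the full snapshot, which may be large because it includes video. Note that the CC-BY-NC-4.0 license restricts the data to non-commercial use.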

Schema

DreamDojo-HV (not released): human egocentric RGB videos (640x480) with GPT-derived language task annotations.

Released robot subset (GR-1 teleop, LeRobot-style parquet): episodes -> steps -> {observation (video, state/proprioception), action, task index, fine/coarse human annotations, reward, done}.
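
The episodes -> steps nesting of the released robot subset can be sketched as plain Python types. This is illustrative only: the field names (`episode_index`, `annotation_fine`, `annotation_coarse`, and so on) are hypothetical stand-ins for the schema items listed above, not the actual parquet column names, and `group_into_episodes` is a helper invented for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Step:
    observation: Dict[str, Any]  # e.g. video frame reference and state/proprioception
    action: List[float]
    task_index: int
    annotation_fine: str    # fine-grained human annotation (hypothetical name)
    annotation_coarse: str  # coarse human annotation (hypothetical name)
    reward: float
    done: bool


@dataclass
class Episode:
    episode_index: int
    steps: List[Step] = field(default_factory=list)


def group_into_episodes(rows: List[Dict[str, Any]]) -> List[Episode]:
    """Group flat per-step rows into episodes, preserving row order within each."""
    episodes: Dict[int, Episode] = {}
    for row in rows:
        idx = row["episode_index"]
        episode = episodes.setdefault(idx, Episode(episode_index=idx))
        episode.steps.append(
            Step(
                observation=row["observation"],
                action=row["action"],
                task_index=row["task_index"],
                annotation_fine=row["annotation_fine"],
                annotation_coarse=row["annotation_coarse"],
                reward=row["reward"],
                done=row["done"],
            )
        )
    # Return episodes sorted by their index.
    return [episodes[k] for k in sorted(episodes)]
```

Grouping by an episode key mirrors how flat per-step tables are typically turned back into trajectories; the real column names should be read from the dataset's own metadata before adapting this.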
