Dongyu Supplemental Search Notes

Scope: additional dataset / benchmark papers for language-conditioned or multimodal robot manipulation not already emphasized in the professor package.

TLA: Tactile-Language-Action Model for Contact-Rich Manipulation - A model for contact-rich peg-in-hole manipulation, paired with a dataset of 24k tactile-action instruction samples; both data and code are publicly released.
Hoi!: A Multimodal Dataset for Force-Grounded, Cross-View Articulated Manipulation - A force-grounded articulated manipulation dataset spanning human and tool embodiments, designed to connect video, action, force, and tactile interaction signals.
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation - A large multi-embodiment teleoperation dataset with language task descriptions, proprioception, multi-view observations, and failure demonstrations with causes.
RoboMIND 2.0: A Multimodal, Bimanual Mobile Manipulation Dataset for Generalizable Embodied Intelligence - A larger successor to RoboMIND with 310k real-world dual-arm trajectories, tactile-enhanced episodes, and mobile manipulation data, framed around language planners and VLA models.
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset - An open dataset of 76k trajectories (350 hours) collected in the wild across 564 scenes and 84 tasks.
BridgeData V2: A Dataset for Robot Learning at Scale - An open robot manipulation dataset with 60,096 trajectories across 24 environments, compatible with natural-language and goal-image conditioning.
Open X-Embodiment: Robotic Learning Datasets and RT-X Models - A cross-institution aggregation of robot learning datasets, pooling data from 22 robots contributed by 21 institutions, released together with the RT-X models.
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning - A language-conditioned lifelong robot learning benchmark with four task suites, 130 tasks, and human-teleoperated demonstrations.
RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot - A public contact-rich manipulation dataset with more than 110k real-world sequences, visual/force/audio/action streams, human videos, and language descriptions.
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots - A large-scale simulated kitchen manipulation framework with scene and object assets, 100 tasks (including LLM-guided composite tasks), demonstrations, and open-source code.
RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots - A 2026 RoboCasa extension with 365 household mobile-manipulation tasks, 2,500 kitchen environments, and large human/synthetic demonstration corpora.
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins - A dual-arm manipulation benchmark that uses generative digital twins and LLM-assisted code generation to create diverse expert data and real-world-aligned evaluation.