TLA: Tactile-Language-Action Model for Contact-Rich Manipulation

Hao, Zhang, Li, Cao, Hao, Cui, Wang · 2025 · arXiv preprint · arXiv:2503.08548 · PDF

Dongyu supplement. Added as a candidate missing/adjacent dataset paper after searching primary arXiv sources. This is a draft summary for triage, not a full paper read.

One-liner. A tactile-language-action model for contact-rich peg-in-hole manipulation, paired with a 24k-pair tactile action instruction dataset; the arXiv abstract states that all data and code are publicly released.

Setup

Datasets / benchmarks: Introduces a tactile action instruction dataset with 24k pairs customized for fingertip peg-in-hole assembly. The arXiv abstract says the authors publicly release all data and code. This is a direct fit for language-conditioned tactile manipulation.
Hardware / simulator: Contact-rich fingertip peg-in-hole setup with sequential tactile feedback. The arXiv abstract does not name the tactile sensor or the robot embodiment; check the PDF or project page for both.
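
Triage aid (hypothetical). To make "tactile action instruction pair" concrete before the full read, the sketch below shows one way such a sample could be represented and sanity-checked. Every field name, shape, and value is an assumption for illustration; the released dataset's actual schema has not been inspected and may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TactileInstructionSample:
    # All field names and shapes are assumptions for illustration,
    # not the released dataset's schema.
    tactile_frames: List[List[float]]  # sequential tactile readings, oldest first
    instruction: str                   # natural-language task instruction
    action: List[float]                # target action label for this step


def is_well_formed(sample: TactileInstructionSample) -> bool:
    """Check for a non-empty rectangular tactile sequence,
    a non-blank instruction, and a non-empty action."""
    if not sample.tactile_frames or not sample.action:
        return False
    if not sample.instruction.strip():
        return False
    frame_len = len(sample.tactile_frames[0])
    return frame_len > 0 and all(len(f) == frame_len for f in sample.tactile_frames)


# Placeholder values only; they are not taken from the paper or its data.
example = TactileInstructionSample(
    tactile_frames=[[0.0, 0.1, 0.2], [0.0, 0.3, 0.1]],
    instruction="Insert the peg into the hole.",
    action=[0.5, -0.2, 0.0],
)
print(is_well_formed(example))
```

A check like this is only useful once the real field layout is known; the names should be replaced with whatever the released data actually uses.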

Method

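Per the abstract, the model uses cross-modal language grounding over sequential tactile feedback to generate robust actions for contact-rich tasks. Architecture and training details are not covered in this draft.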

Why it matters for the map

This is one of the clearest gaps in the current map: a single paper that combines tactile sensing, language, action generation, contact-rich assembly, and openly released data.

Limitations / open questions

Narrow task family: peg-in-hole assembly only. It is a strong fit for contact-rich language grounding, but it is not a broad benchmark for dynamic state changes.

Source note

arXiv lines 31-41 report title, dataset size, release claim, and project website.