Gap Analysis Draft

Generated from the expanded 91-paper catalog.

Corpus Snapshot

Paper summaries scanned: 91.
Papers with dataset-related evidence: 82.
Papers that appear to introduce a new dataset, benchmark, or simulator: 34.
Papers with open or partial data availability evidence: 31.

Classification Used

introduces dataset: the paper appears to contribute a named or newly collected dataset.
introduces benchmark: the paper contributes a benchmark or dataset+benchmark artifact.
introduces simulator: the paper contributes a simulator or simulation platform.
self-collected eval data: the paper collects task-specific data for evaluation/training but does not clearly present it as a reusable dataset.
uses existing datasets: the paper trains/evaluates on existing datasets or benchmarks.
sensor / foundation paper: foundational sensor/model/theory work where dataset contribution is not the main artifact.
survey / review: survey papers.

Early Trends

The survey is data-rich, but much of the data is task-specific or benchmark-use rather than reusable, open, language-annotated manipulation data.

Vision and tactile are heavily represented. Audio appears often enough to be a serious thread. Force/proprioception are control-critical but less often packaged as reusable language-grounded datasets. Thermal remains sparse.

Annotation coverage is uneven. Class/property labels, self-supervised pairs, demonstrations, and simulation labels are common. Explicit predicates, captions/descriptions, instructions, temporal event labels, and state-transition annotations are much less common.

Likely Contribution Gaps

Open dynamic state-change datasets. The field has many static property or representation datasets, but fewer open datasets with temporal labels such as onset, transition, completion, failure, or correction.
Language + non-visual monitoring. Touch-language exists, but language grounded in audio, force, proprioception, and thermal signals is much thinner.
Thermal-language manipulation data. Thermal appears in cooking/material-state papers, but the catalog still suggests little reusable language-grounded thermal manipulation data.
Predicate-level annotations. The BLADE-style question needs predicates like is_grasped, is_inserted, is_full, is_screwed_tight, or boiling_now; these are less common than category/property labels.
Reusable closed-loop correction datasets. Several methods do online monitoring or recovery, but the reusable dataset artifact for closed-loop correction remains unclear or closed in many rows.

Next Manual Review Targets

Start with rows classified as:

introduces dataset
introduces benchmark
introduces simulator
unknown openness

High-value papers to verify manually: