arxiv:2602.18742

RoboCurate: Harnessing Diversity with Action-Verified Neural Trajectory for Robot Learning

Published on Feb 21

· Submitted by

Suhyeok Jang on Feb 24

KAIST AI

Abstract

AI-generated summary: RoboCurate enhances synthetic robot learning data by evaluating action quality through simulator replay consistency and augmenting observation diversity via image editing and video transfer techniques.

Synthetic data generated by video generative models has shown promise for robot learning as a scalable pipeline, but it often suffers from inconsistent action quality due to imperfectly generated videos. Recently, vision-language models (VLMs) have been leveraged to validate video quality, but they are limited in their ability to distinguish physically accurate videos and, even then, cannot directly evaluate the generated actions themselves. To tackle this issue, we introduce RoboCurate, a novel synthetic robot data generation framework that evaluates and filters the quality of annotated actions by comparing them with simulation replay. Specifically, RoboCurate replays the predicted actions in a simulator and assesses action quality by measuring the consistency of motion between the simulator rollout and the generated video. In addition, we unlock observation diversity beyond the available dataset via image-to-image editing and apply action-preserving video-to-video transfer to further augment appearance. We observe that RoboCurate's generated data yields substantial relative improvements in success rates compared to using real data only, achieving +70.1% on GR-1 Tabletop (300 demos), +16.1% on DexMimicGen in the pre-training setup, and +179.9% in the challenging real-world ALLEX humanoid dexterous manipulation setting.
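The gains quoted in the abstract are relative improvements in success rate, not absolute percentage points. A minimal sketch of that arithmetic follows; the function name `relative_improvement` and the example rates in the comments are illustrative assumptions, not values from the paper.

```python
from typing import Optional


def relative_improvement(baseline: float, improved: float) -> Optional[float]:
    """Return the relative change from baseline to improved, in percent.

    Both arguments are success rates in [0, 1]. Returns None when the
    baseline is zero, because the ratio is undefined in that case.
    """
    if baseline == 0:
        return None
    return (improved - baseline) / baseline * 100.0


# Hypothetical rates, chosen only to illustrate the formula:
# a policy going from 20% to 30% success gains 10 absolute points,
# which is a relative improvement of about +50%.
gain = relative_improvement(0.20, 0.30)

# A zero baseline has no defined relative improvement, so such cases
# are better reported as absolute before/after rates.
undefined_gain = relative_improvement(0.0, 0.25)
```

Under this definition, a relative gain above +100% means the success rate more than doubled over the real-data-only baseline.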

Community

glory-hyeok

Paper submitter about 9 hours ago

🚀 RoboCurate — Harnessing Diversity with Action-Verified Neural Trajectory for Robot Learning
📄 Paper: https://arxiv.org/abs/2602.18742

🔎 What we do:

Propose RoboCurate, a synthetic robot data generation framework that improves neural trajectory quality through action-level verification and controlled diversity augmentation.

Introduce simulator-replay consistency filtering, which replays IDM-predicted actions in a simulator and evaluates motion alignment between generated videos and simulator rollouts using an attentive probe built on a frozen video encoder.

Expand observation diversity via a structured I2I (image-to-image) + V2V (video-to-video) pipeline, increasing scene and appearance variation while preserving action dynamics.

Apply a Best-of-N sampling strategy, where the learned action-consistency score serves as a critic to select the most reliable synthetic trajectory during generation.

Validate across large-scale simulation and real-world benchmarks (GR-1 Tabletop, DexMimicGen, and real ALLEX humanoid), achieving:
• +70.1% relative improvement on GR-1 Tabletop (300 demos)
• +16.1% on DexMimicGen (pre-training setup)
• +179.9% on real-world ALLEX humanoid
• Strong OOD gains (+162.3% on novel object tasks; emergent 0% → 25% on novel behaviors)
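The Best-of-N strategy listed above can be sketched generically: draw N candidates, score each with a critic, and keep the top one. The names `best_of_n`, `generate`, and `critic` are hypothetical; in RoboCurate the critic would be the learned action-consistency score, which is not reproduced here.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def best_of_n(
    generate: Callable[[], T],
    critic: Callable[[T], float],
    n: int,
) -> Tuple[T, float]:
    """Sample n candidates and return the highest-scoring one with its score.

    generate: produces one candidate trajectory per call (assumed stochastic).
    critic:   maps a candidate to a scalar score; higher means more reliable.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate() for _ in range(n)]
    scores = [critic(c) for c in candidates]
    # Index of the best score; ties resolve to the earliest candidate.
    best = max(range(n), key=scores.__getitem__)
    return candidates[best], scores[best]
```

Larger N raises the chance that at least one candidate passes the critic, at the cost of N generation and scoring calls per kept trajectory.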

💡 Why it matters:

RoboCurate establishes an action-verified neural trajectory paradigm for synthetic robot data.

Rather than relying solely on video-level plausibility judgments from VLMs, RoboCurate directly evaluates whether predicted actions are physically consistent with observed motion through simulator grounding.
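To make the idea of checking motion consistency concrete, here is a deliberately simplified sketch. It assumes per-frame feature vectors are already available for both the generated video and the simulator rollout, and it swaps in a plain cosine similarity of frame-to-frame changes as the score. This is a stand-in for illustration only, not the attentive probe on a frozen video encoder that the authors describe; all function names and the threshold are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Frames = Sequence[Sequence[float]]  # T frames, each a D-dimensional feature vector


def _deltas(frames: Frames) -> List[List[float]]:
    """Frame-to-frame feature differences, a crude proxy for motion."""
    return [[b - a for a, b in zip(f0, f1)] for f0, f1 in zip(frames, frames[1:])]


def motion_consistency(video_feats: Frames, sim_feats: Frames, eps: float = 1e-8) -> float:
    """Mean cosine similarity between the motion of two feature sequences.

    Values near 1 mean both sequences move the same way; values near -1
    mean they move in opposite directions. Stand-in score, not the paper's.
    """
    dv, ds = _deltas(video_feats), _deltas(sim_feats)
    if not dv or len(dv) != len(ds):
        raise ValueError("need two equal-length sequences with at least 2 frames")
    sims = []
    for a, b in zip(dv, ds):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        sims.append(dot / (norm + eps))
    return sum(sims) / len(sims)


def filter_trajectories(pairs: Sequence[Tuple[Frames, Frames]], threshold: float = 0.5) -> List[int]:
    """Return indices of (video, rollout) pairs whose score meets the threshold."""
    return [i for i, (v, s) in enumerate(pairs) if motion_consistency(v, s) >= threshold]
```

A trajectory whose simulator replay moves opposite to the generated video would score near -1 and be dropped, while a faithful replay would score near 1 and be kept.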

By jointly increasing visual diversity and enforcing action-level correctness, RoboCurate significantly enhances robustness, generalization, and real-world transfer of Vision-Language-Action models — enabling more reliable policy learning from curated synthetic experience.
