
Real-world human capture more than doubles humanoid manipulation success on a retail benchmark, outpacing simulation-only fine-tuning of NVIDIA GR00T N1.6 by 2.19×
“The data robots actually need is the data only the real world can produce — and we’ve now demonstrated, on NVIDIA’s most capable humanoid foundation model, exactly what that data does.” — Rajat Aggarwal, CEO and Co-Founder of DreamVu
PHILADELPHIA, PA, UNITED STATES, May 12, 2026 /EINPresswire.com/ -- DreamVu today released SABER, a high-fidelity humanoid robotics dataset built from 100+ hours of natural in-store human activity, along with research demonstrating that post-training NVIDIA’s GR00T N1.6 vision-language-action model on SABER more than doubles its success rate on retail manipulation tasks — without a single robot in the loop.
Out of the box, GR00T N1.6 — NVIDIA’s flagship humanoid foundation model, pretrained on large-scale teleoperated demonstrations — scores near zero on RoboBenchMart, a standardized retail manipulation benchmark covering shelf interaction, refrigerator articulation, basket loading, and floor retrieval. With SABER applied as a domain-specific post-training layer, the same model reaches 29.3% mean success across ten tasks, a 2.19× improvement over simulation-only fine-tuning (13.4%). Fridge interaction tasks reach 82–100% success. The paper’s conclusion is direct: this is a data failure, not a model failure — and it is closed by real-world data, not more synthetic data.
“Synthetic data has been the industry’s default path to scale in humanoid training, and SABER shows the ceiling of that approach. A humanoid robot doesn’t deploy into a render — it deploys into a real store, with real packaging, real lighting, and the messy repetition of real human behavior. The data robots actually need is the data only the real world can produce — and we’ve now demonstrated, on NVIDIA’s most capable humanoid foundation model, exactly what that data does that synthetic approaches cannot.”
— Rajat Aggarwal, CEO and Co-Founder of DreamVu
The result lands on a structural debate inside humanoid robotics: how to acquire training data at the scale and fidelity that physical deployment demands. Teleoperation delivers real dynamics but is prohibitively slow and operationally disruptive to scale inside live environments. Simulation is fast and cheap but carries a well-documented sim-to-real gap — synthetic pixels render cleanly but produce policies that fail under real-world contact, occlusion, deformable packaging, and the long tail of in-store variation. SABER charts a third path: high-fidelity human capture, retargeted to robot embodiments through rigorous annotation, producing real-world action signal at scale — without a robot, or a simulator, in the loop.
SABER is built from DreamVu’s proprietary dual-stream capture system. Head-mounted GoPro cameras worn by primary actors record fine-grained hand activity at the point of interaction, while DreamVu’s ALIA 360° camera simultaneously observes the full scene from a single fixed unit. The resulting corpus contains approximately 44.8K training samples across three complementary streams: 25K latent action sequences from egocentric video, 18.6K dexterous hand-pose trajectories retargeted to robot joint space, and 1.2K whole-body motion sequences retargeted to the Unitree G1 humanoid embodiment. The dataset spans multiple real grocery store environments across diverse layouts, lighting conditions, and product assortments. No staging, scripting, or teleoperation was required.
A 10K-sample subset of SABER is being released publicly today under a CC BY-NC 4.0 license on Hugging Face for research use. Commercial licensing of the full corpus is available through DreamVu. The company is actively engaged with humanoid robotics labs, foundation model providers, and enterprise robotics deployments on data licensing for production training.
ABOUT DREAMVU
DreamVu builds the data infrastructure for Physical AI and humanoids. Built on computational imaging research from IIIT Hyderabad and refined across eight years of production deployment, DreamVu’s omnidirectional 3D capture platform, end-to-end annotation pipeline, and simulation-ready data outputs power the training of next-generation humanoid robots and embodied AI systems.
SABER follows DreamVu’s PRISM dataset, published in March 2026, which achieved a 66.6% error reduction across embodied reasoning, spatial perception, and intuitive physics benchmarks when applied to NVIDIA Cosmos-Reason2. The company is headquartered in Philadelphia, PA with R&D in Hyderabad, India.
MEDIA & RESEARCH
Media Contact: Sanju Pillai · sanju@dreamvu.ai · +1 (267) 914-5213
SABER paper and dataset: dreamvu.ai/saber
Public 10K dataset: huggingface.co/datasets/DreamVu/SABER-10K
Company: dreamvu.ai