Nvidia just dropped what it’s calling the first open omni-model built specifically for physical AI. Cosmos 3, unveiled on May 31, integrates reasoning, world generation, and action capabilities into a single system designed to help robots and autonomous vehicles actually understand the messy, unpredictable real world.
Cosmos 3 can generate predictive video sequences of up to 30 seconds based on text, image, or video inputs, essentially letting a robot “imagine” what happens next in its environment before it moves a single actuator.
What Cosmos 3 actually does
Cosmos 3 uses what Nvidia calls a Mixture of Transformers architecture to process multiple types of input simultaneously. Alongside text, image, and video, the model supports sound and action modalities, meaning a robot equipped with Cosmos 3 can process what it sees, hears, and does in a unified framework.
The practical application centers on robot policy learning, the process of training the mapping from what a robot observes to the actions it takes. Cosmos 3 serves as a backbone for what Nvidia terms World Action Models, or WAMs, which allow embodied agents to operate across environments they’ve never encountered before.
Building on a foundation laid in 2025
Nvidia released several earlier iterations of Cosmos throughout 2025, including variants focused on prediction, transfer learning, and reasoning. Those earlier models already attracted serious customers.
Figure AI, the humanoid robotics company, adopted Cosmos technology for its bipedal robots. Agility Robotics, another humanoid player, did the same. On the autonomous vehicle side, Uber, Waabi, and Wayve all leveraged previous Cosmos versions for their self-driving efforts.
What this means for investors and the broader market
For the robotics industry specifically, the open nature of Cosmos 3 could accelerate adoption among smaller players who lack the resources to build their own world models from scratch. Synthetic data generation, one of the model’s core capabilities, addresses what has historically been the biggest bottleneck in robotics development: getting enough real-world training data without destroying expensive hardware in the process.
Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.
