DeepMind thinks its new Genie 3 world model presents a stepping stone towards AGI

Google DeepMind has revealed Genie 3, its latest foundation world model that can be used to train general-purpose AI agents, a capability that the AI lab says makes for a crucial stepping stone on the path to “artificial general intelligence,” or human-like intelligence.

“Genie 3 is the first real-time interactive general purpose world model,” Shlomi Fruchter, a research director at DeepMind, said during a press briefing. “It goes beyond narrow world models that existed before. It’s not specific to any particular environment. It can generate both photo-realistic and imaginary worlds, and everything in between.”

Still in research preview and not publicly available, Genie 3 builds on both its predecessor Genie 2 (which can generate new environments for agents) and DeepMind’s latest video generation model Veo 3 (which is said to have a deep understanding of physics).

With a simple text prompt, Genie 3 can generate multiple minutes of interactive 3D environments at 720p resolution and 24 frames per second, a significant jump from the 10 to 20 seconds Genie 2 could produce. The model also features “promptable world events,” or the ability to use a prompt to change the generated world.

Perhaps most importantly, Genie 3’s simulations stay physically consistent over time because the model can remember what it previously generated, a capability that DeepMind says its researchers didn’t explicitly program into the model.

Fruchter said that while Genie 3 has implications for educational experiences, gaming or prototyping creative concepts, its real unlock will manifest in training agents for general purpose tasks, which he said is essential to reaching AGI.

“We think world models are key on the path to AGI, specifically for embodied agents, where simulating real world scenarios is particularly challenging,” Jack Parker-Holder, a research scientist on DeepMind’s open-endedness team, said during the briefing.

Genie 3 is supposedly designed to solve that bottleneck. Like Veo, it doesn’t rely on a hard-coded physics engine; instead, DeepMind says, the model teaches itself how the world works (how objects move, fall, and interact) by remembering what it has generated and reasoning over long time horizons.

“The model is auto-regressive, meaning it generates one frame at a time,” Fruchter told TechCrunch in an interview. “It has to look back at what was generated before to decide what’s going to happen next. That’s a key part of the architecture.”
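To make the idea of auto-regressive generation concrete, here is a minimal toy sketch in Python. It is not DeepMind's architecture: the "frame" is just a number (the position of one object), and `predict_next` and `rollout` are hypothetical names invented for illustration. The only point it shows is the loop Fruchter describes, where each new frame is produced by looking back at the frames generated so far.

```python
from typing import List

# Assumption for illustration: a "frame" is a single float, the 1-D position
# of one object. A real world model would work on images, not numbers.
Frame = float


def predict_next(history: List[Frame]) -> Frame:
    """Hypothetical stand-in for a learned model.

    It looks back at the last two frames, infers a velocity, and
    extrapolates the next position.
    """
    if len(history) < 2:
        # Not enough history to infer motion: assume the object stays put.
        return history[-1]
    velocity = history[-1] - history[-2]
    return history[-1] + velocity


def rollout(initial: List[Frame], steps: int) -> List[Frame]:
    """Generate `steps` new frames, one at a time, each conditioned on
    everything generated before it."""
    frames = list(initial)
    for _ in range(steps):
        frames.append(predict_next(frames))
    return frames


if __name__ == "__main__":
    print(rollout([0.0, 1.0], steps=3))  # [0.0, 1.0, 2.0, 3.0, 4.0]
```

Because every step reads the accumulated history, the object keeps moving at the speed it had earlier. That is a very loose analogy for how looking back can keep a generated world consistent.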

That memory, the company says, lends consistency to Genie 3’s simulated worlds, which in turn allows it to develop a grasp of physics, similar to how humans understand that a glass teetering on the edge of a table is about to fall, or that they should duck to avoid a falling object.

Notably, DeepMind says the model also has the potential to push AI agents to their limits, forcing them to learn from their own experience, similar to how humans learn in the real world.

As an example, DeepMind shared its test of Genie 3 with a recent version of its generalist Scalable Instructable Multiworld Agent (SIMA), instructing it to pursue a set of goals. In a warehouse setting, they asked the agent to perform tasks like “approach the bright green trash compactor” or “walk to the packed red forklift.”

“In all three cases, the SIMA agent is able to achieve the goal,” Parker-Holder said. “It just receives the actions from the agent. So the agent takes the goal, sees the world simulated around it, and then takes the actions in the world. Genie 3 simulates forward, and the fact that it’s able to achieve it is because Genie 3 stays consistent.”
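The interaction Parker-Holder describes follows a familiar agent-environment loop, sketched below in Python under heavy simplifying assumptions. `ToyWorld`, `toy_policy`, and `run_episode` are hypothetical names, the "warehouse" is a one-dimensional line of cells, and the goal is a target cell. None of this reflects how Genie 3 or SIMA are actually implemented. It only illustrates the division of labor: the world receives actions and simulates forward, while the agent sees the world and chooses actions toward its goal.

```python
from dataclasses import dataclass


@dataclass
class ToyWorld:
    """Hypothetical stand-in for a world model: it only receives actions."""

    agent_pos: int = 0

    def observe(self) -> int:
        # What the agent "sees": here, just its own position.
        return self.agent_pos

    def step(self, action: int) -> int:
        # Simulate forward one step given the action (-1, 0, or +1).
        self.agent_pos += action
        return self.observe()


def toy_policy(observation: int, goal: int) -> int:
    """Hypothetical agent: move one cell toward the goal position."""
    if observation < goal:
        return 1
    if observation > goal:
        return -1
    return 0


def run_episode(world: ToyWorld, goal: int, max_steps: int = 20) -> bool:
    """Run the observe/act/simulate loop; return True if the goal is reached."""
    obs = world.observe()
    for _ in range(max_steps):
        if obs == goal:
            return True
        obs = world.step(toy_policy(obs, goal))
    return obs == goal


if __name__ == "__main__":
    print(run_episode(ToyWorld(), goal=5))  # True
```

The agent can only succeed here because the toy world behaves the same way at every step. In the quote, Parker-Holder attributes the agent's success to the same property, Genie 3 staying consistent.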

That said, Genie 3 has its limitations. For example, while the researchers claim it can understand physics, the demo showing a skier barreling down a mountain didn’t reflect how snow would move in relation to the skier.

Additionally, the range of actions an agent can take is limited. For example, the promptable world events allow for a wide range of environmental interventions, but they’re not necessarily performed by the agent itself. And it’s still difficult to accurately model complex interactions between multiple independent agents in a shared environment.

Genie 3 can also only support a few minutes of continuous interaction, when hours would be necessary for proper training.

Still, the model presents a compelling step forward in teaching agents to go beyond reacting to inputs, letting them potentially plan, explore, seek out uncertainty, and improve through trial and error: the kind of self-driven, embodied learning that many say is key to moving towards general intelligence.

“We haven’t really had a Move 37 moment for embodied agents yet, where they can actually take novel actions in the real world,” Parker-Holder said, referring to the legendary moment in the 2016 game of Go between DeepMind’s AI agent AlphaGo and world champion Lee Sedol, in which AlphaGo played an unconventional and brilliant move that became symbolic of AI’s ability to discover new strategies beyond human understanding.

“But now, we can potentially usher in a new era,” he said.