Google DeepMind’s Genie 2 can generate interactive 3D worlds

World models, AI algorithms capable of generating a simulated environment in real time, represent one of the more impressive applications of machine learning. The field has seen a lot of movement in the last year, and on Wednesday Google DeepMind announced Genie 2. Where its predecessor was limited to generating 2D worlds, the new model can create 3D ones and sustain them for significantly longer.

Genie 2 isn’t a game engine; instead, it’s a diffusion model that generates images as the player (either a human being or another AI agent) moves through the world the software is simulating. As it generates frames, Genie 2 can infer ideas about the environment, giving it the capability to model water, smoke and physics effects, though some of those interactions can be very gamey. The model is also not limited to rendering scenes from a third-person perspective; it can handle first-person and isometric viewpoints as well. All it needs to start is a single image prompt, either one generated by Google’s own Imagen 3 model or a picture of something from the real world.

Introducing Genie 2: our AI model that can create an endless variety of playable 3D worlds – all from a single image. 🖼️
These types of large-scale foundation world models could enable future agents to be trained and evaluated in an endless number of virtual environments. →… pic.twitter.com/qHCT6jqb1W

— Google DeepMind (@GoogleDeepMind) December 4, 2024

Notably, Genie 2 can remember parts of a simulated scene even after they leave the player’s field of view and can accurately reconstruct those elements once they become visible again. That’s in contrast to other world models like Oasis, which, at least in the version Decart showed to the public in October, had trouble remembering the layout of the Minecraft levels it was generating in real time.

However, even Genie 2 has limitations in this regard. DeepMind says the model can generate “consistent” worlds for up to 60 seconds, but the majority of the examples the company shared on Wednesday run for significantly less time; most of the videos are about 10 to 20 seconds long. Moreover, artifacts creep in and image quality softens the longer Genie 2 needs to maintain the illusion of a consistent world.

DeepMind didn’t detail how it trained Genie 2 other than to state it relied “on a large-scale video dataset.” Don’t expect DeepMind to release Genie 2 to the public anytime soon, either. For the moment, the company primarily sees the model as a tool for training and evaluating other AI agents, including its own SIMA algorithm, and something artists and designers could use to prototype and try out ideas rapidly. In the future, DeepMind suggests world models like Genie 2 are likely to play an important part on the road to artificial general intelligence.

“Training more general embodied agents has been traditionally bottlenecked by the availability of sufficiently rich and diverse training environments,” DeepMind said. “As we show, Genie 2 could enable future agents to be trained and evaluated in a limitless curriculum of novel worlds.”

This article originally appeared on Engadget at https://www.engadget.com/ai/google-deepminds-genie-2-can-generate-interactive-3d-worlds-200708207.html?src=rss
