Google DeepMind is poised to debut Genie 3, a new version of its AI world model, which can generate real-time 3D environments for humans and AI agents to interact with. The model promises longer interaction sessions than earlier versions and, notably, will preserve the location of objects even when users look away from them.
World models are a class of artificial intelligence systems designed to simulate environments for a range of uses, including education, entertainment, and the training of robots or AI agents. With world models, users provide a prompt and the program generates a navigable space, somewhat like a video game. Unlike conventional video games, which depend on painstakingly constructed 3D assets, these environments are generated entirely by artificial intelligence. Google is investing heavily in this area: it demonstrated Genie 2, which could generate interactive worlds from images, in December, and is building a dedicated team under a former co-lead of OpenAI’s Sora video generation tool.
Though the technology has advanced, current models still have notable drawbacks. A recent interactive video from a company backed by Pixar’s co-founder felt like walking through a warped Google Street View, where objects transformed and morphed unpredictably. Google’s new Genie 3 appears to be a major step forward for the field. According to a blog post, users will now be able to create worlds from prompts that support a few minutes of continuous interaction, a major advance over the mere 10-20 seconds of engagement possible with Genie 2.
According to Google, Genie 3 can retain a visual memory of its surroundings for around one minute; if a user looks away from an object in the world and then turns back, details such as wall paint or chalkboard writing will remain in their original locations. The worlds will run at 24 frames per second at 720p resolution.