AI World Simulators Are The Next Big Thing
How World Labs' 'image-to-world' announcement & Google DeepMind's Genie 2 are moving AI video forward
Today on AI For Humans The Newsletter!
World Labs & Genie 2 shock & surprise us
The agents about to take over your browser
And is the end in sight for podcasters like us??
Plus, our can't-miss AI feature of the week!
Welcome back to the AI For Humans Newsletter!
As we prepare for OpenAI's 12 Days of Shipmas (this morning they announced 12 straight days of live streams showing off new products), we've been delighted by two big announcements in the "prompt-to-world" space.
First, Dr. Fei-Fei Li (the pioneering AI scientist behind ImageNet and more) and her new company World Labs just came out of stealth to show off their new "image-to-world" model, which will allow you to take any image and turn it into an explorable environment.
These environments started with a single image...
We've seen a few companies try this before, but the depth of what's possible in World Labs has us giddy with excitement. Not only can it create new environments, but it can re-light those environments, add physics and all sorts of other stuff. This is a HUGE step towards AI-infused video game worlds and better simulating OUR world so AIs can learn better and faster. Def take a spin through the demo in the blog post.
And then Google DeepMind announced Genie 2, their "prompt-to-video-game" engine, which takes everything we found fascinating about Genie 1 and dials it up to Genie 11 (we are sorry for this).
Genie 2 moves into 3D worlds
Genie 2 moves from simple 2D video game worlds to 3D worlds which, as you can imagine, leads to a much richer array of worlds to be simulated and experienced.
Why does all this matter?
Well, much like OpenAI initially said Sora wasn't just a video tool but a "world simulator," world simulations like Genie 2 or World Labs offer a sandbox where AI can learn, experiment, and grow in ways that mimic real-world interactions. These environments help AI develop a deeper understanding of context, behavior, and decision-making without the risks or limitations of physical testing. For AI development, it's like upgrading from watching videos to stepping into a fully immersive, interactive classroom, unlocking new possibilities for innovation in creativity, training, and problem-solving.
And, ultimately, might just make "prompt-to-video-game" a reality.
Apologies for the newsletter coming in a day late but boy is there a lot on the horizon.
-Kevin & Gavin
3 Things To Know
ElevenLabs Launches a "GenFM" Podcast Feature
Similar to NotebookLM's audio overview, ElevenLabs' new GenFM promises to turn your PDFs, ebooks or otherwise into personal podcasts. It's another signal that audio overviews may be here to stay. This one doesn't come with customizability yet (recently introduced by NotebookLM), but once you layer in more voices and structure, this starts to look like a whole new media format.
OpenAI May Be Coming For Your Browser?
Chrome co-creator Darin Fisher recently left Arc to join OpenAI, adding to the rumor mill that OpenAI is building a browser of its own. Why? Agents. The self-driving browser. For an example of what this could look like, look no further than Do Browser, shown here acing the "play Creed from YouTube" eval.
Get Ready to Yell At Your Computer
Conversational AI platform Hume shared their own spin on a voice-controlled... computer-controlled... computer. The demo is incredibly compelling. Hume's party trick is greater emotional understanding of the user in a super slick, low-latency conversation. It's impressive, and points to the emotional intelligence required to navigate both high-stakes spreadsheeting and getting no-scoped by your nephew in Roblox.
We Love This - ElevenLabs Speech-To-Speech
It's easier than ever before to create multi-character, single-actor visual performances with tools like Runway's Act-One and Hedra. But can a single actor drive multiple voice performances?
We have the technology, and it's called Speech-To-Speech.
Now a year from its original release, the tech has gotten really good. Plus, ElevenLabs' growing library of voices makes it easier than ever before for you to deliver a performative take in your own voice, only for it to then be generated in an entirely different voice.
This support article has some great examples that'll get your wheels turning.
Are you a creative or brand looking to go deeper with AI?
Join our community of collaborative creators on the AI4H Discord
Get exclusive access to all things AI4H on our Patreon
If you're an org, consider booking Kevin & Gavin for your next event!