OpenAI Might've Figured Out AI Hallucinations & That's A Big Deal
Why the sticky problem of LLMs 'making stuff up' has to do with wanting to please humans and the inability to say "I don't know"

Today on AI For Humans: The Newsletter!
OpenAI might’ve figured out hallucinations
Apple’s Partnering With Google on AI Search?
Plus, an easy-to-make Nano Banana TikTok Trend
Welcome back to the AI For Humans newsletter!
To start, a personal confession: As a student, I was pretty lazy.
I was able to get through school without having to do a ton of work. I didn’t get straight As but I was able to keep myself in the A-/B+ range fairly easily. I’m not proud of this fact now. Had I realized what I might’ve been able to accomplish by putting my mind to it, who knows what I’d be doing. Maybe still writing this newsletter.
The secret to my coasting success was my ability to convincingly make arguments, both in papers and in person, without having a lot of knowledge about the subject matter. I made sure I knew just enough to sound convincing. I was an amazing bullsh*tter.
And, it turns out, LLMs are really good at this too.
Hallucination (n):
A hallucination or artificial hallucination (also called confabulation, or delusion) is a response generated by AI that contains false or misleading information presented as fact.
In May of this year, the New York Times ran a story that got a ton of pick-up. It discussed how, as AI advances further and further and, ostensibly, gets smarter, these large language model systems are hallucinating more, not less.
This problem has plagued LLMs for some time, and prominent haters have pointed to it as the reason that LLMs alone will not lead us to AGI or Super Intelligence.
After all, if you can’t trust an AI system to give you the right answer on a simple question, how are you going to trust it to give you advice on your business? And this only compounds in the world where AI agents go off and do a bunch of work for us without oversight.
Turns out, LLMs Do Better When They Say “I Don’t Know”

OpenAI Found That (Surprise!) LLMs Shouldn’t Be Rewarded For Guessing
Here’s the big takeaway: hallucinations aren’t some weird glitch. They’re a product of how these models are trained. Right now, LLMs get rewarded for sounding smooth and confident, not for being correct. Which means they’re literally incentivized to guess, even when they have no idea.
Sound familiar? They’re incentivized to act like I did as a student. To bullsh*t.
The problem is built into the benchmarks. Most of them only score whether an answer is right, so a confident guess can earn points while “I don’t know” earns nothing. The more the models guess, the more likely they are to be rewarded, even if they’re just making things up.
OpenAI’s proposed fix is surprisingly obvious: stop punishing models for saying “I don’t know.” Instead, reward them for it. When you make honesty part of the training loop, the hallucinations drop and the answers get a lot more reliable.
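If you want to see the incentive in numbers, here’s a rough sketch. To be clear, this is not OpenAI’s actual scoring scheme: the function names, the +1 / 0 / penalty scoring, and the confidence values are all assumptions made up for illustration. It just compares what a model “earns” on average for guessing versus saying “I don’t know.”

```python
def expected_score(p_correct: float, wrong_penalty: float, abstain: bool) -> float:
    """Average score on one question under a made-up grading rule:
    +1 for a right answer, -wrong_penalty for a wrong one, 0 for "I don't know".
    """
    if abstain:
        return 0.0
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def best_policy(p_correct: float, wrong_penalty: float) -> str:
    """Pick whichever option scores higher on average (ties go to abstaining)."""
    guess = expected_score(p_correct, wrong_penalty, abstain=False)
    return "guess" if guess > 0.0 else "say I don't know"


# Accuracy-only grading: wrong answers cost nothing, so even a 10% shot is worth taking.
print(best_policy(0.1, wrong_penalty=0.0))  # guess

# Penalize wrong answers: a 10% shot now averages 0.1 - 0.9 = -0.8, so abstaining wins.
print(best_policy(0.1, wrong_penalty=1.0))  # say I don't know

# A model that is 90% sure should still answer.
print(best_policy(0.9, wrong_penalty=1.0))  # guess
```

The point of the toy: when a wrong answer costs nothing, bluffing is always the smart play. Once wrong answers cost something, “I don’t know” becomes the better move whenever the model is unsure.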
The Potential Creative Downsides of ‘I Don’t Know’
One thing that’s important (to me, at least): a true upside of hallucinations is that they let these models be a bit more creative. I realize I’m unusual but I’m always a little impressed at the story the LLM tells me when it’s obviously making something up.
I have a weird hunch that hallucinations might be key to AI becoming more creative and less rational but I understand the market has very little interest in that.
In our work on AndThen (my start-up in the AI audio space), we’re very interested in how these models can push the story forward & surprise our users. Normal businesses (and, to be clear, the large AI companies that depend on businesses for revenue) don’t want surprises.
As I mentioned last week, what they’re looking for is clear, actionable results.
My hope is that we get both.
After all, according to Anthropic’s Dario Amodei we’re in for quite the ride…
Anthropic CEO Dario Amodei says if exponential progress continues for 1–3 years, AI will cross the frontier of human knowledge.
A very small number of humans with swarms of AI agents will drive new scientific discoveries and economic activity.
“Then things really go crazy.”
— vitrupo (@vitrupo)
9:48 AM • Sep 7, 2025
That’s it for today. See you on Friday for the podcast!
- Gavin (and Kevin)
THIS WEEK: Salesforce announced it cut 4k jobs thanks to AI. Save yours! 👇
3 Things To Know About AI Today
Apple’s Working On AI Search
Bloomberg’s Mark Gurman is reporting (paywalled link) that Apple’s Siri might finally be entering into the AI search space in the next few months. We’ve been pretty frustrated with how badly Apple has fumbled the AI transition (have you EVER enjoyed the AI summaries feature?) but maybe this is how they right the ship.
One thing of note: They may be partnering with Google to do it. Which reminds me, have you seen Google’s stock price lately?
Forever 21 Brings AI Models To The Mainstream
We’ve talked on the show a fair amount about how the commercial world is being fully usurped by AI tools (see PJ Ace’s hugely successful commercial agency for proof) and now it looks like it’s coming pretty significantly to e-commerce.
Forever 21 is using Ai models that look Ai
We’re watching the first aggressive wave of Ai adoption hit fashion!
— Salma (@Salmaaboukarr)
2:56 PM • Sep 6, 2025
I did a little digging and while this hasn’t been confirmed by Forever 21, it’s clear that this has been coming to the fashion world for some time. Just last month, Guess ran an ad in the print edition of Vogue that featured an AI model. Some people found it incredible, others… weren’t happy.
An OpenAI Employee That Isn’t Sam Altman Worth Following: Roon
You’re probably not on X as much as we are and that’s likely a good thing. It’s a weird place now but one specific sub-genre that continues to be strong is AI Twitter.
For some reason, the entire AI space (researchers, influencers, companies) never really left & if you’re not following some of the best people there, you’re missing out. And Roon (see below) is one of our favorites.
Roon is an anonymous OpenAI employee who’s become one of the most interesting voices on AI Twitter. His posts mix insider knowledge with a strange, funny, almost literary sensibility that makes them stand out in a sea of corporate updates.
He’s also hinted at working on creative writing models inside OpenAI, which makes his perspective especially unique for anyone curious about where AI is headed in that space.
We 💛 This: The Charlie & Lola TikTok Trend
Google Gemini’s Nano Banana AI image model keeps surprising us but, as you might imagine, it’s also driving some significant trends on social media.
This week, we’ve got the ‘Charlie & Lola’ trend:
Another viral AI consumer trend has landed - “Charlie and Lola” characters
…with 10k+ videos in less than a day!
It’s crazy to me that these spread via prompts copy/pasted in TikTok comments - real opportunity to productize this
— Olivia Moore (@omooretweets)
6:47 PM • Sep 6, 2025
The prompt is actually really simple & super fun. While it’s clear I’m WAY too old to be doing this particular prompt (I’m pretty sure Charlie & Lola was something my daughters watched), I have debased myself below for your entertainment.
PS, this original photo is from Perm Week which, yes, we actually did at Late Night with Jimmy Fallon. Let me know if you think I need to go back to this look.

Prompt is below:
“Transform the subject from the uploaded image into a character in the style of Charlie and Lola (children’s cartoon). Match the official cartoon look – thin sketchy outlines, flat colors, childlike proportions, playful hand-drawn charm, and simple textures. Retain the subject’s original clothing, hairstyle, facial features, accessories, skin tone, pose, and expression – but reinterpret them as if they belong in the Charlie and Lola world. Clothing should be simplified into flat shapes and bright colors, while keeping the overall outfit recognizable. Background: transparent to keep the focus on the character.”
Negative Prompt: “No realistic shading, no detailed rendering, no anime or manga style, no 3D modeling, no photographic textures.”
Are you a creative or brand looking to go deeper with AI?
Join our community of collaborative creators on the AI4H Discord
Get exclusive access to all things AI4H on our Patreon
If you’re an org, consider booking Kevin & Gavin for your next event!