AI hallucination is unfixable, says OpenAI
New research confirms AI will always make things up. Here's why.

It turns out that AI hallucination is a feature, not a bug. And oh, it's also somewhat unfixable, according to OpenAI in a new research paper.
For today's UnfilteredFriday, let's take a look at AI and its propensity to make things up out of thin air - and why you'd better get used to it.
A research paper published by four OpenAI employees earlier this month lays things out as they are.
It's a feature, silly
So why do AI models hallucinate despite the best efforts of top AI engineers? The paper uses a rigorous mathematical approach to explain why it keeps happening.
In case you wondered, OpenAI's own reasoning models hallucinate a fair bit: o1 at 16%, o3 at 33%, and o4-mini at 48%. But why? In layperson's terms, it comes down to how we currently train AI, gaps in the underlying data, and the way small errors compound as the model predicts one token after the next.
What intrigued me was how the scoring systems of current benchmarks penalise "I don't know" responses: most grade answers as simply right or wrong, so a lucky guess can score points while an honest admission of uncertainty scores nothing. That effectively means we end up training LLMs to be confidently wrong.
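To make that concrete, here is a minimal sketch in Python, with made-up numbers, of how binary right-or-wrong grading shapes behaviour: a model that guesses always has at least as good an expected score as one that admits it doesn't know.

```python
# Minimal sketch (hypothetical numbers): why binary benchmark scoring
# rewards guessing over admitting uncertainty.

def expected_score(p_correct: float, abstain: bool, idk_credit: float = 0.0) -> float:
    """Expected benchmark score for one question.

    p_correct  -- the model's chance of guessing the right answer
    abstain    -- True if the model answers "I don't know"
    idk_credit -- score awarded for abstaining (0.0 on most benchmarks)
    """
    if abstain:
        return idk_credit
    return p_correct * 1.0 + (1 - p_correct) * 0.0  # right = 1 point, wrong = 0

# A model that is only 30% sure of the answer:
p = 0.30
print(expected_score(p, abstain=False))  # 0.30 -> guessing pays
print(expected_score(p, abstain=True))   # 0.00 -> honesty scores nothing

# Under this scoring, the best policy is to always guess, which is
# exactly the "confidently wrong" behaviour that gets trained in.
```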
Reducing hallucination
AI hallucinations, in other words, are a mathematical reality that cannot simply be engineered away. This means we'll need new governance frameworks and risk management strategies built around that fact.
The researchers did suggest a way to reduce hallucination: by rewarding "appropriate expressions of uncertainty rather than penalising them."
Technically, this could be done by leveraging established methods of quantifying uncertainty. However, this would require significantly more computational resources. And yes, AI could end up telling you "I don't know" all the time - which would be quite annoying in non-critical scenarios.
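As a rough illustration of the idea (my own toy example, not code from the paper), here is a scoring rule that penalises wrong answers more heavily than abstentions. Under it, a model maximises its expected score by answering only when its confidence clears a threshold, and saying "I don't know" otherwise.

```python
# Toy illustration (my own example, not the paper's implementation):
# a scoring rule that penalises wrong answers more than "I don't know",
# so a well-calibrated model only answers when it is confident enough.

def threshold_score(p_correct: float, threshold: float = 0.75) -> float:
    """Expected score when a wrong answer costs threshold/(1-threshold) points.

    Answering only beats abstaining (which scores 0) when the model's
    confidence exceeds the threshold.
    """
    penalty = threshold / (1 - threshold)            # e.g. 3 points at t = 0.75
    answer = p_correct * 1.0 - (1 - p_correct) * penalty
    abstain = 0.0
    return max(answer, abstain)  # a rational model picks the better option

for p in (0.30, 0.60, 0.90):
    print(f"confidence {p:.0%}: expected score {threshold_score(p):.2f}")
# confidence 30%: expected score 0.00  -> model says "I don't know"
# confidence 60%: expected score 0.00  -> still abstains
# confidence 90%: expected score 0.60  -> confident enough to answer
```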
Getting better with AI
For now, the average user who just wants to work faster and be more productive will need to work around hallucination.
In my view, quality outputs only happen when we understand how LLMs work and properly harness the power of context. The quality of an LLM's output is the sum of the quality of the AI model, the quality of the prompt, and the quality of the context.
Given this framework, start with the most suitable AI model for the task at hand, then iterate on your prompts and experiment with new approaches until you get the outcomes you need.
I'll write more about working around these limitations another day. For now, how do you use AI? And more importantly, how do you verify what it tells you?
You can find OpenAI's blog and a link to the research paper here.