It's not hallucination, it's confabulation. It's very similar in its nuances to what you see in stroke patients.
Likewise, the pretrained model trying to nuke people in wargames wasn't malicious. It was behaving the way anyone sitting in front of a big red button labeled 'Nuke' might behave without a functioning prefrontal cortex to inhibit that exploratory thought.
Human brains are a delicate balance between fairly specialized subsystems.
Right now, 'AI' companies are mostly trying to do it all at once in a single model. Yes, the current models are typically a "mixture of experts," but it's still all in one functional layer.
Hallucinations/confabulations are currently fairly solvable for LLMs. You just run the same query a bunch of times and see how consistent the answers are. If the model is making it up because it doesn't know, the answers will be stochastic. If it knows the correct answer, they will be consistent. If it only partly knows, they will be somewhere in between (but in a way that a classifier can be fine-tuned to detect).
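A rough sketch of that repeated-query check, in Python. Everything here is an assumption for illustration: `ask` stands in for whatever function calls your model with sampling turned on, and `consistency_score`, `knows`, and `make_guesser` are hypothetical names, not any real library's API. The two stand-in "models" are deterministic so the example runs without an actual LLM.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Tuple


def consistency_score(ask: Callable[[str], str], query: str,
                      n_samples: int = 10) -> Tuple[str, float]:
    """Ask the same query n_samples times.

    Returns the most common (normalized) answer and the fraction of
    samples that agreed with it. Near 1.0 suggests the model "knows";
    a low value suggests it is making things up.
    """
    answers = [ask(query).strip().lower() for _ in range(n_samples)]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / n_samples


# Hypothetical stand-in for a model that knows the answer.
def knows(query: str) -> str:
    return "Paris"


# Hypothetical stand-in for a model that is guessing: it rotates
# through five different answers instead of settling on one.
def make_guesser() -> Callable[[str], str]:
    guesses = cycle(["1987", "1991", "1979", "2003", "1965"])
    return lambda query: next(guesses)


if __name__ == "__main__":
    print(consistency_score(knows, "Capital of France?"))
    print(consistency_score(make_guesser(), "Year of some obscure event?"))
```

Exact string matching is the crudest possible way to compare answers. For free-form text you'd need something that judges whether two answers mean the same thing, which is where the classifier mentioned above would come in.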
This adds a second layer on top of each of those variations. If you also want to check whether something is safe, you'd need to verify that the safety answer isn't itself a confabulation, so that's more passes.
It gets to be a lot quite quickly.
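To put a rough number on "a lot", here is a back-of-the-envelope sketch. It assumes (my assumption, not a measurement) that every layer of checking re-samples each output of the layer before it the same number of times. `total_passes` is a hypothetical helper for the arithmetic only.

```python
def total_passes(samples_per_check: int, layers: int) -> int:
    """Count model calls when each layer re-checks every output of the
    previous layer with samples_per_check samples (assumed cost model)."""
    total = 0
    calls_this_layer = 1
    for _ in range(layers):
        calls_this_layer *= samples_per_check
        total += calls_this_layer
    return total


if __name__ == "__main__":
    for layers in (1, 2, 3):
        print(layers, total_passes(10, layers))
```

With 10 samples per check, one layer costs 10 passes, two layers cost 110, and three cost 1,110 under this assumed model. A real pipeline could share samples between checks and come in lower.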
As the tech scales (what's being done with servers today will happen around 80% as well on smartphones in about two years), those extra passes aren't going to need to be as massive.
This is a problem that will eventually go away, just not for a single pass at a single layer, which is 99% of the instances where people are complaining this is an issue.