I sometimes liken pre-modern ideas of knowledge to alchemy. Alchemists were trying to find a key--which they called the Philosopher's Stone--that would allow them to transmute base matter into gold, and transient life into immortality.
They were hung up on the idea of "perfection" and "purity", and while they discovered many things, they never learned how to do the impossible. So far as I know, no one today looks at the amazing results of modern chemistry--which is what alchemy became when it grew up--and says, "Yeah, but no immortality yet, and why can't you transmute this lump of rock into gold?" We have, for the most part, accepted that searching for immortality and transmutation aren't really all that interesting because they aren't really all that possible. We don't go around pining and complaining and critiquing chemistry for not doing the impossible, although it wouldn't surprise me if in the private correspondence of the last alchemists we could find such sentiments.
The Bayesian revolution is still young--it happened in the '80s and '90s, although the seeds were planted much earlier in the 20th century--and we still have a lot of "Philosopher's Stone" style ideas about knowledge kicking around, the big ones being "truth" and "certainty", neither of which is possible or interesting. People still complain that we can't get them, though. But really, not being able to achieve the impossible and uninteresting isn't a bad thing, particularly because we get to exchange them for something better.
When we traded in alchemy for chemistry we got the greater part of the technology of the modern world. Fertilizers, vastly improved smelting and metalworking and alloys, plastics, glasses, and more. Alchemy would have given us gold and immortality, but what would those be worth without Pyrex and LEDs? Think about it: alchemists were reaching for huge piles of gold, which would be worth very little because it would no longer be rare, and a vastly extended lifespan in Medieval conditions, in which the greatest lord didn't have access to dental anesthesia.
In the same way, the Bayesian revolution traded truth and certainty for knowledge and robustness. Truth is a fragile illusion: a belief that we can't imagine having evidence against, until we do. Then it crumbles into dust, because the only two states that are allowed by traditional philosophy are true and false. Furthermore, because truths depend on each other, one ugly fact can shatter a whole brittle system of belief.
The advantage of a "truth model" is that if you build one its stiffness means that it can be left alone and will remain standing. It's a statue, not a garden.
Bayesian knowledge is a garden, a living ecosystem where ideas aren't true or false but more-or-less plausible based on the evidence we have. Ideas and their plausibilities form an interconnected network, as they do in traditional philosophy, with some propositions depending on others. But while in traditional philosophy the connections are rigid and inflexible and either present (true) or absent (false), Bayesian connections are flexible, and either stronger (more plausible) or weaker (less plausible) but never perfectly rigid or entirely absent.
Some branches in this network are so sturdy we can treat them as "true" for convenience, and some are so ethereal that they may as well be "false", but we always remember that "true" and "false" are crude approximations to reality. An architect may design a building using "static analysis", which treats the loads as constant and the members as perfectly rigid, but in reality they know that's just a convenience to make the problem tractable, and at the end of the day if they don't get the dynamic analysis right the building will have problems when the ground begins to shake.
The Bayesian equivalent of the ground shaking is the discovery of new things, new facts, new observations. When that happens, we have to update our network, and we have to do so using Bayes' rule or something roughly like it, which can be crudely paraphrased as: "The more surprising a fact is, the bigger difference it should make to our beliefs."
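In symbols, Bayes' rule says the posterior P(H|E) equals P(E|H)P(H)/P(E), and the "surprise" lives in the denominator: the less probable we thought the evidence was, the bigger the jump. Here's a minimal sketch in Python--the function name and all the probabilities are invented purely for illustration:

```python
def posterior(prior_h, likelihood, p_evidence):
    """Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior_h / p_evidence

prior = 0.10        # P(H): our belief in the hypothesis before the evidence
likelihood = 0.90   # P(E|H): how strongly the hypothesis predicts the evidence

# Unsurprising evidence: we half expected it anyway. A modest nudge.
print(posterior(prior, likelihood, p_evidence=0.50))  # 0.18

# Surprising evidence: we gave it only a 10% chance overall. A huge update.
print(posterior(prior, likelihood, p_evidence=0.10))  # 0.90
```

Same hypothesis, same likelihood; the only thing that changed was how surprising the evidence was, and the update went from a nudge to an overhaul.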
Which is not really how we tend to react to surprising new facts, is it?
This is what I call "Bayes' Paradox": the fact that the more surprising new evidence is, the more likely we are to reject it or ignore it rather than make the sweeping changes to our system of beliefs that it would otherwise require. Surprising new information is both the most potentially valuable kind and the most potentially dangerous.
There are a lot of very good reasons for this behaviour. One is that changing our ideas--tending the garden of our mind--takes work, and if we don't do it habitually we'll likely find ourselves in a place that's tangled, overgrown, and inaccessible. In the worst case, it'll be full of statues: brittle, inorganic ideas that didn't get there as a result of updating from the evidence, and which can only be removed with a hammer and chisel, which are not gardening tools at all.
This is not a good reason for ignoring surprising things, and in fact ignoring surprising things is often how people end up with their garden full of statuary in the first place.
The second reason is that people lie and make mistakes, and so when we are incorporating a new idea into our life we have to remember that the propositions, "This new evidence is the result of a mistake (mine or someone else's)" and "This new evidence is the result of a lie" are always pretty plausible, and the more surprising an idea is the more plausible they are.
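To make that concrete, here's a toy Python sketch--all the names and numbers are invented for illustration--comparing three explanations for a reported fact: it's genuine, it's a mistake, or it's a lie. The assumption baked in is that a genuine report is only about as probable as the fact itself, while mistakes and lies can assert anything, surprising or not:

```python
def normalize(weights):
    """Scale a dict of nonnegative weights so they sum to 1."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

def explain_report(p_fact):
    """Posterior over explanations of a report, where p_fact is the
    prior probability that the reported fact is actually so."""
    priors = {"genuine": 0.90, "mistake": 0.07, "lie": 0.03}
    # Genuine reports track reality; mistakes and lies don't care
    # how surprising the claim is.
    likelihoods = {"genuine": p_fact, "mistake": 1.0, "lie": 1.0}
    return normalize({k: priors[k] * likelihoods[k] for k in priors})

print(explain_report(p_fact=0.5))    # mundane claim: ~82% "genuine"
print(explain_report(p_fact=0.001))  # astonishing claim: ~99% "mistake" or "lie"
```

The priors start heavily in favour of "genuine", and for a mundane claim they stay there. For an astonishing claim, nearly all the posterior mass flees to "mistake" and "lie"--not because people got any less honest, but because the claim got more surprising.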
This is because most of us aren't entirely stupid: if we're adults who have been wandering around bumping into pieces of the world for a while it's likely we've incorporated a lot of evidence into our network of knowledge, and if all goes well then the odds of us having missed something really surprising should go down with time. The purpose of knowledge is to allow us to form expectations and make predictions, and if something is really surprising it means we either must not have encountered it previously, or somehow missed it if we did.
On the other hand, of all the cognitive biases we are subject to, confirmation bias is by far the most prevalent, so it's possible something is really surprising because we've been slowly drifting off the real axis for a long time due to persistent and consistent misinterpretation of evidence in defence of some particularly grand bit of statuary we inherited from our parents, or put up as a protection from trauma, or similar non-Bayesian reasons. Rather than pruning that tree and training that vine when new evidence showed up in the past, we've done the opposite, and now we have a whole garden bed that is full of wild distortions to make room for that favoured statue.
So we're presented with a conundrum: there are both good and bad reasons to accept and reject any really surprising bit of information. The good reason for acceptance is it could radically improve our knowledge. The bad reason for acceptance is that we've uncritically absorbed a mistake or a lie. The good reason for rejection is that it may well be a mistake or a lie. The bad reason for rejection is we're protecting a statue that doesn't belong there.
As I'm fond of pointing out, science is more of an art than a science, and Bayes' paradox illustrates how taste and good judgement will always be with us. Bayesian thinking isn't automatic or algorithmic. It's nuanced and rich and leaves a lot of room for principled disagreement.
How we resolve Bayes' paradox determines what kind of thinker we are, and there's probably a taxonomy of "thinkotypes" that if I were clever I could commercialize in some kind of Myers-Briggs style horoscope chart, although I'd probably have to come up with a better name than "thinkotype". Suggestions welcome, but be warned: if you make one I like I'll just steal it.
The challenge in assessing really surprising new information is that we have to evaluate it at least somewhat independently of what we already know. Otherwise, we'd just reject it out of hand, because the fact that it's surprising means it has low plausibility based on our existing beliefs. One of the most important things about Bayes' rule is it can be run both ways, allowing new facts to change our knowledge, and our knowledge to evaluate new facts. If all we do is the latter, we have an iron-clad defence against learning, which is also probably a saleable product.
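Here's what "running it both ways" looks like as a sketch, with illustrative numbers: the same quantities that tell us how to update on evidence also tell us, in advance, how plausible we should find the evidence itself.

```python
prior_h = 0.2          # P(H): our existing belief in the hypothesis
p_e_given_h = 0.8      # P(E|H): probability of the evidence if H is true
p_e_given_not_h = 0.1  # P(E|~H): probability of the evidence if H is false

# One way: our knowledge evaluates the new fact. Before seeing anything,
# our existing beliefs already say how plausible the evidence is.
p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)  # P(E) = 0.24

# The other way: the new fact changes our knowledge.
posterior_h = p_e_given_h * prior_h / p_e  # P(H|E) ~= 0.67

print(f"P(E) = {p_e:.2f}, P(H|E) = {posterior_h:.2f}")
```

If P(E) comes out tiny, that's the network telling us to suspect the report before rewriting everything--which is exactly the defence that, taken too far, becomes the iron-clad one above.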
But by the same token, we can't just throw away everything we already know.
Ergo: Bayes' paradox. We want to incorporate really surprising information because it has the power to teach us a lot, but we don't want to incorporate really surprising information because it's more likely to be wrong, but we don't want to reject really surprising new information because we want to avoid confirmation bias but we do want to reject really surprising new information because...
If we aren't careful, we'll end up making one of the classic blunders, like starting a land war in Asia, or going up against a Sicilian when death is on the line.
I'll have more to say about how to resolve Bayes' paradox in future. In the meantime, keep updating your knowledge based on new evidence, but keep in mind that the more surprising the evidence is, the more careful you should be, but the more valuable the potential outcome!
If you think others will like this please share!
And if you want weekly musings on knowledge and reality, please subscribe!