Bayesianism is both a discipline and a world view, because it is a theory of knowledge, and knowledge tells us about the world. Bayesianism tells us what knowledge is and how we can create it.
Traditional views of knowledge are focused on "truth", which turns out to be difficult to define and not very interesting. I often liken it to the philosopher's stone of the alchemists, which was sought after for thousands of years, turned out not to exist, and in retrospect doesn't look especially valuable. Wealth creation depends on markets, democratic governments, and corporate organization (a topic for another time), and longevity depends on sanitation and scientific medicine. A magic rock doesn't come into it.
One popular modern theory of knowledge is the notion that "knowledge" is "justified true belief": you "know" something only if you believe it to be the case, and it is in fact the case, and you are justified in believing it is the case.
There are some artificial problems--called Gettier problems, although similar cases were discussed in Plato--where a justified belief is only accidentally true.
For example--to use a case Bertrand Russell discussed--Alice might believe that it is 2 PM because she sees a clock reading 2 PM, and it is in fact 2 PM, but the clock stopped twelve hours before. So according to the standard account, Alice has a belief that is true and justified, but her justification is unrelated to the truth of the proposition "It is 2 PM".
As with many of Russell's paradoxes, this one turns on a subtle and unnoticed shift in perspective, which clouds the question of who knows what.
Alice believes it is 2 PM because for some reason she thinks the clock is precisely accurate and perfectly reliable.
We know--because Russell has told us--that the clock stopped twelve hours before.
We also know that clocks, especially the mechanical clocks of Russell's time, don't always keep good time. Alice knows this too. Doesn't she?
So what is Alice actually justified in believing, given the evidence she has? Unless she has just awoken from a drunken stupor after a long night of debauchery (in which the clock got damaged, maybe... Alice gets up to some pretty wild stuff offstage in these Russellian fantasies, I think), she has a strong prior belief that it is early afternoon when she looks at the clock. Seeing the clock hands about where she expects them to be increases the plausibility of that prior belief, and that's all.
What Alice knows is not "It is 2 PM". She knows "Given the evidence I have, it is more plausible than not that it is pretty close to 2 PM."
The former is a "floating proposition": an idea detached from evidence and without a knowing subject, unhinged from both reality and the individual who holds it. A floating proposition has no external-world content because it has no operational relationship to reality. It is also disembodied: anyone could be saying "It is 2 PM". That sentence is only a useful shorthand for a much longer statement, one that embeds the belief in a network of evidence that a particular person possesses and which justifies it to that person. Not to us or anyone else. So any argument, like Russell’s, that approaches the problem by switching perspectives between subjects doesn’t address “knowledge”, but something else.
This is the Bayesian view of knowledge: what a particular person knows is not any isolated proposition, but a network of evidence and ideas, where the evidence comes exclusively in the form of probability distributions known to that particular person. Sometimes the distributions are narrow and we can ignore all the Bayesian machinery and do deduction. But most of the time we can't.
What we know is not an isolated set of free-standing conclusions that float like wandering planets through the impersonal void, lit by the light of "truth" from some distant central sun, but a complex, somewhat elastic, embodied state of a particular individual with a particular history of encounters with data, of which I will have more to say later.
Reality is still real, in this view. The world is some particular way and we can know stuff about it. But our knowledge of it is not a passive reflection of that reality, which is what people have traditionally meant by "truth": a set of ideas that are in exact correspondence with the world as it is.
This is an idea people have a hard time getting their heads around. But consider: the concept of the philosopher's stone was used to organize and direct our ideas about the transformations of matter for thousands of years, but when it proved to be a fantasy no one said, "I guess that means we can do chemistry however we like!"
Reality still constrained us. Even more so than previously, in fact, because the idea of the philosopher's stone was a powerfully distorting influence on chemical knowledge, and because there was no such thing, skilled rhetoricians could use it to fool their gullible patrons into funding all kinds of crazy research. From a falsehood like “truth” anything follows. Not so from knowledge.
Unfortunately when you point out that the idea of "truth" bears the same relation to "knowledge" that the philosopher's stone bears to chemistry a whole lot of people say, "That means I can believe whatever I want!!!"
You wanna believe whatever you want? Well…
Matter and the rules that describe its transformations still existed when the philosopher's stone turned out not to exist. Likewise, in the absence of "truth", knowledge still remains, and we find it by the disciplined application of Bayesian updating, which is a specific process for altering how plausible we think an idea is when we get new evidence.
Most people have an intuitive grasp of this process, but they miss one crucial step, and that often turns Bayesian updating into an exercise in confirmation bias.
It works like this:
Bayes' rule says that the more likely a rare event would be given a particular idea, the more plausible the idea becomes when that event actually happens. An "event" doesn't have to be a single instance: more often it is the outcome of a whole series of observations, summarized as a probability distribution.
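Written out as arithmetic, the update is just a ratio. Here is a minimal sketch in Python; the function name and the numbers are my own illustration, not anything from the examples below:

```python
# A minimal sketch of Bayes' rule, with made-up numbers for illustration.

def updated_plausibility(prior, p_event_if_true, p_event_if_false):
    """How plausible is the idea after the event actually happens?"""
    # Total probability of seeing the event at all: it can happen because
    # the idea is right, or anyway even though the idea is wrong.
    p_event = p_event_if_true * prior + p_event_if_false * (1 - prior)
    return p_event_if_true * prior / p_event

# The same observation can be strong evidence or almost no evidence,
# depending entirely on how rare it would be if the idea were wrong.
print(updated_plausibility(0.5, 0.9, 0.05))  # rare otherwise: ~0.95
print(updated_plausibility(0.5, 0.9, 0.90))  # common anyway:   0.50 (no change)
```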
The problem is that people often make "Bayesian-feeling" arguments like this:
If ulcers are caused by bacterial infection, it is likely that treating ulcers with antibiotics will work.
Treating ulcers with antibiotics works. That is: after antibiotic treatment ulcers tend to go away.
Therefore the idea that ulcers are caused by bacterial infection becomes more plausible.
Or not.
Because the example I've given tells us literally nothing about how much of a difference the evidence should make to our beliefs. Maybe a lot. Maybe practically none.
The problem is that I've made no mention of how often ulcers go away just randomly, and this "base rate" is the only thing that separates Bayes' rule from pure confirmation bias.
Is recovery from ulcers rare? Without that crucial piece of information, this argument is mostly noise.
Now in fact such spontaneous recoveries are rare enough that, after a long systematic investigation, the idea that ulcers are caused by bacteria and can be cured with antibiotics turned out to be more plausible than the alternatives. But this conclusion depends on knowing the base rate. Without it, we know very little.
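To see how much hangs on that one number, here is the same sketch applied to the ulcer story, run with two hypothetical base rates for spontaneous recovery. The figures are invented for illustration, not clinical data:

```python
# The ulcer argument, run twice with different invented base rates.

def updated_plausibility(prior, p_recover_if_bacterial, p_recover_anyway):
    """Plausibility of 'ulcers are bacterial' after seeing recovery on antibiotics."""
    p_recover = (p_recover_if_bacterial * prior
                 + p_recover_anyway * (1 - prior))
    return p_recover_if_bacterial * prior / p_recover

prior = 0.1        # hypothetical starting plausibility of the bacterial idea
likelihood = 0.9   # recovery is likely if the bacterial idea is right

# If ulcers rarely clear up on their own, recovery is strong evidence...
print(updated_plausibility(prior, likelihood, 0.05))  # -> ~0.67
# ...but if they usually clear up anyway, the same observation tells us almost nothing.
print(updated_plausibility(prior, likelihood, 0.80))  # -> ~0.11
```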
We see this “base rate neglect” in political reasoning all the time, with each side pointing out the failings of the other with no mention of how often hypocrisy, dishonesty, or corruption happen regardless of the party in power. There are obvious cases where a political party goes beyond the pale, normalizing armed insurrection aimed at overthrowing the results of a democratic election, say, but until recently those were rare events.
By far the majority of political malfeasance is such that you can't identify the party responsible by simply listing the act in question, which is a good test of when neglect of the base rate is coming into play.
Base rate neglect is also common in arguments that involve evolution by variation and natural selection.
For example, there is a claim that the existence of the "furin cleavage site" in covid-19 is evidence that the virus was engineered by humans because no such thing could occur in nature. This has multiple base-rate neglect problems.
For one, what is the base rate of similar mutations?
For two, has anyone even bothered to measure it?
The absence of historical evidence for something that has only recently become interesting is itself neglecting the base rate. Saying "No one has seen this kind of furin cleavage site before" is only interesting if people have for some reason gone looking for that kind of furin cleavage site before, and why would they?
It turns out that when someone did go looking, they found that very similar—but not identical—beneficial adaptations have in fact happened multiple times in the broader coronavirus lineage. The most that a perfectly reasonable alternative view of these and other data can say is:
While a natural origin is still possible and the search for a potential host in nature should continue, the amount of peculiar genetic features identified in SARS-CoV-2's genome does not rule out a possible gain-of-function origin, which should be therefore discussed in an open scientific debate.
However, isolated insertions cannot be considered separately from the evolution of the whole virus, which is a problem involving a practically infinite number of minor variants as the 29,000 bases of the virus genome sequence get imperfectly copied from generation to generation.
More broadly, the argument: "I can't understand how this could naturally evolve therefore it was engineered" is an argument we've seen for over a century in multiple cases where eventually we did understand how the feature in question evolved naturally. The only difference is in this case the agent is supposed to be human rather than divine, which is impossible to rule out entirely because humans actually exist.
Viral evolution is capable of exploring billions of alternative sequences per day. We know from the study of many other viruses that it can find solutions to the problem of more efficient replication in ways that are difficult for us to understand. The machinery of the cell--and therefore the machinery of viruses that prey on it--is far beyond the wackiest Rube Goldberg machine the human imagination can come up with.
So to say, "I don't understand how a complex structure like the eye evolved, therefore Creationism" is not a Bayesian argument. Replacing "the eye" with "the furin cleavage site" and "Creationism" with "engineered virus" does not change the argument, because we know the base rate of really improbable stuff in evolved systems is astonishingly high. One might even say that the principle of natural selection is the conversion of the improbable into the probable via the mechanism of differential reproductive success.
The very idea of looking for a “signature of an engineered virus” in the genome of covid-19 depends on base rate neglect. Because the base rate of “evolved things we don’t understand” is extremely high the influence of the viral genome on the plausibility of “the virus was engineered” can never be more than marginal.
If we want to be good Bayesians, we have to account for the base rate, and always ask “Would this event be rare if my favourite idea was bollocks?” or all we are doing is bias confirmation.
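That question has a compact arithmetic form: compare how likely the event would be if your idea is right against how likely it would be if your idea is bollocks. A tiny sketch, again with invented numbers:

```python
# The "would this be rare if my idea was bollocks?" test, as a likelihood ratio.

def evidence_strength(p_event_if_idea_true, p_event_if_idea_bollocks):
    """How many times over should this event multiply the odds on my idea?"""
    return p_event_if_idea_true / p_event_if_idea_bollocks

# A "confirmation" that would probably happen anyway barely moves the odds...
print(evidence_strength(0.9, 0.8))   # ~1.1x: bias confirmation, not evidence
# ...while one that would be rare without the idea moves them a lot.
print(evidence_strength(0.9, 0.05))  # 18x: genuinely informative
```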