One of the most enjoyable books I’ve read recently and all the more so for having found it by chance in a bookshop in Sweden. The cover art is a belter.
There’s a contagious enthusiasm and wonder at the nature of Bayesian statistics in the writing, and it has just the right balance of history, humour, and science to keep things varied without dwelling too long on any one of them.
Rather than the usual slide into repetition that books of this nature tend towards, it got more and more interesting as it went on. The later parts, which contemplate the brain’s Bayesian nature and introduce the revelatory concept of our entire existence being a form of prediction error minimisation, were exceptional.
It lived up to its early promise of revealing Bayes’ theorem in action wherever you look, and provided a lot more food for thought than I expected going into it.
Some memorable things:
- The shortcomings of frequentist statistics made apparent in the replication crisis of the early 2010s: without taking prior probabilities into account, you can make study outcomes appear significant (p < 0.05) by running them lots of times and/or chopping up the data in lots of ways (see the simulation sketch after this list).
- The Bayesian brain theory, in which our perception is the result of predictions made by a probabilistic model of the world in higher regions of the brain, checked against signals from the sensory organs, which report back on any discrepancies.
- Schizophrenics maybe having inaccurate predictions about the world, which leads them to fail to subtract their own actions from those occurring around them; e.g. they can’t accurately predict the movement outcome of their own motor neurons firing, and so experience bodily movement as something happening “to” them.
- That all behaviour and desire can be modelled as an effort to minimise prediction error, either by updating our priors or by changing the world to make us less wrong (moving around, eating, etc.).
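To make the p-hacking point concrete, here’s a minimal simulation in Python (mine, not the book’s); the 1,000 studies and 20 subgroup analyses are illustrative assumptions. Every test compares two samples drawn from the same distribution, so any “significant” result is a false positive.

```python
# Minimal sketch of p-hacking: all data is pure noise, yet most studies
# can report a "significant" finding if they take enough looks at it.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_studies = 1_000   # hypothetical number of null studies
n_subgroups = 20    # ways each study "chops up" its data
false_positive_studies = 0

for _ in range(n_studies):
    # Each subgroup analysis compares two null samples of 30 points.
    p_values = [
        stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
        for _ in range(n_subgroups)
    ]
    if min(p_values) < 0.05:  # report only the "best" subgroup
        false_positive_studies += 1

# With 20 independent looks at noise, 1 - 0.95**20 ≈ 64% of studies
# can claim at least one p < 0.05 result.
print(f"Studies with a 'significant' result: {false_positive_studies / n_studies:.0%}")
```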
Highlights
When we make decisions about things that are uncertain - which we do all the time - the extent to which we are doing that well is described by Bayes' theorem. Any decision-making process, anything that, however imperfectly, tries to manipulate the world in order to achieve some goal, whether that's a bacterium seeking higher glucose concentrations, genes trying to pass copies of themselves through generations, or governments trying to achieve economic growth; if it's doing a good job, it's being Bayesian.
You know that if a woman has cancer the mammogram will correctly identify it 80 per cent of the time (it's 80 per cent sensitive) and miss it the other 20 per cent. If she doesn't have cancer, it will correctly give the all-clear 90 per cent of the time (it's 90 per cent specific), but give a false positive 10 per cent of the time.
You get the test. It comes back positive. Does that mean there's a 90 per cent chance you've got breast cancer? No. With the information I've given you, you simply don't know enough to say what your chances are.
What you need to know is how likely you thought it was that you had breast cancer before you took the test.
What Bayes' theorem tells you is how much you should change your belief. But in order to do that, you have to have a belief in the first place.
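Here’s the arithmetic the passage gestures at, as a short Python sketch. The 80 per cent sensitivity and 90 per cent specificity are the book’s figures; the 1 per cent prior is my illustrative assumption, not a number from the text.

```python
# Bayes' theorem for the mammogram example: P(cancer | positive test).
def posterior(prior, sensitivity, specificity):
    p_pos_given_cancer = sensitivity        # true positive rate
    p_pos_given_healthy = 1 - specificity   # false positive rate
    # Total probability of a positive test, with or without cancer:
    p_pos = prior * p_pos_given_cancer + (1 - prior) * p_pos_given_healthy
    return prior * p_pos_given_cancer / p_pos

# Assuming a 1% prior, a positive test means only about a 7.5% chance:
print(posterior(prior=0.01, sensitivity=0.80, specificity=0.90))  # ≈ 0.0748
# Double the prior and the posterior roughly doubles; the test alone
# cannot tell you your chances.
```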
During the trial of O.J. Simpson, the former American football star, for the murder of his wife Nicole Brown Simpson, the prosecution showed that Simpson had been physically abusive. The defence argued that 'an infinitesimal percentage - certainly fewer than 1 in 2,500 - of men who slap or beat their wives go on to murder them' in a given year.
But that was making the opposite mistake to the prosecutor's fallacy. The annual probability that a man who beats his wife will murder her might be 'only' one in 2,500. But that's not what we're asking. We're asking: given that a man beat his wife, and given that the wife has been murdered, what's the probability it was the husband who did it?
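The flipped question yields to the same Bayesian arithmetic. Everything below is an illustrative assumption of mine (not the book’s or the court’s), chosen so that the murder rate among 100,000 battered women matches the defence’s 1-in-2,500 figure:

```python
# Given that a battered wife has been murdered, how likely is it that
# the husband did it? Illustrative figures for 100,000 battered women:
murdered_by_husband = 40  # 100,000 / 2,500: the defence's own figure
murdered_by_other = 5     # assumed background rate of other murders

# Condition on the murder having actually happened:
p = murdered_by_husband / (murdered_by_husband + murdered_by_other)
print(f"P(husband | battered and murdered) ≈ {p:.0%}")  # ≈ 89%
```

The '1 in 2,500' figure, even if true, answers the wrong question entirely.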
All decision-making under uncertainty is Bayesian - or to put it more accurately, Bayes' theorem represents ideal decision-making, and the extent to which an agent is obeying Bayes is the extent to which it's making good decisions.
One of Bayes' great contributions to probability theory was not mathematical, but philosophical. So far we've been talking about probability as if it's a real thing, out there in the world. [...]
That is, for Bayes, probability is subjective. It's a statement about our ignorance and our best guesses of the truth. It's not a property of the world around us, but of our understanding of the world.
Frequentist statistics do the opposite of what we've been talking about. Where Bayes' theorem takes you from data to hypothesis - How likely is the hypothesis to be true, given the data I've seen? - frequentist statistics take you from hypothesis to data: How likely am I to see this data, assuming a given hypothesis is true?
A more technical objection was that of the mathematician and logician George Boole, who pointed out that there are different kinds of ignorance. A simplified example taken from Clayton: say that you have an urn with two balls in it. You know the balls are either black or white. Do you assume that two black balls, one black ball, and zero black balls are all equally likely outcomes? Or do you assume that each ball is equally likely to be black or white?
This really matters. In the first example, your prior probabilities are 1/3 for each outcome. In the second, you have a binomial distribution: there's only one way to get two black balls or zero black balls, but two ways to get one of each. So your prior probabilities are 1/4 for two blacks, 1/2 for one of each, 1/4 for two whites.
Your two different kinds of ignorance are completely at odds with each other. If you imagine your urn contains not two but 10,000 balls, under the first kind of ignorance, your urn is equally likely to contain one black and 9,999 whites as it is 5,000 of each. But under the second kind of ignorance, that would be like saying you're just as likely to see 9,999 heads out of 10,000 coin-flips as you are 5,000, which is of course not the case. Under that second kind of ignorance, you know you're far more likely to see a roughly 50-50 split than a 90-10 or 100-0 split in a large urn with hundreds or thousands of balls, even though you're supposed to be ignorant.
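A small sketch (mine, not Clayton’s) makes the clash between the two priors vivid at the 10,000-ball scale. Probabilities are reported as powers of ten because the binomial prior’s tails are far too small for ordinary floats:

```python
# Boole's two kinds of ignorance for an urn of n balls, k of them black.
from math import lgamma, log, log10

n = 10_000

def log10_uniform_prior(k, n):
    """Prior 1: every black-ball count 0..n equally likely."""
    return -log10(n + 1)

def log10_binomial_prior(k, n):
    """Prior 2: each ball independently black or white with prob 1/2."""
    log_pmf = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) - n * log(2)
    return log_pmf / log(10)

for k in (1, 5_000):
    print(f"k={k:>5}: uniform 10^{log10_uniform_prior(k, n):.1f}, "
          f"binomial 10^{log10_binomial_prior(k, n):.1f}")
# k=    1: uniform 10^-4.0, binomial 10^-3006.3
# k= 5000: uniform 10^-4.0, binomial 10^-2.1
```

Under the first prior a 1-vs-9,999 split is exactly as likely as a 5,000–5,000 split; under the second it is thousands of orders of magnitude less likely.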
But the underlying problem of Bayesian priors is a philosophical one: they're subjective. As we said earlier, they're a statement not about the world, but about our own knowledge and ignorance.