“Language sets everyone the same traps; it is an immense network of easily accessible wrong turnings.” - Ludwig Wittgenstein, Culture & Value (1980)
It is often said that priors in Bayesian statistics are ‘beliefs’. I maintain this analogy is misguided and unhelpful.
The nature of ‘belief’ remains controversial but can be said “to refer to the attitude we have, roughly, whenever we take something to be the case or regard it as true.”1 Belief is often called a ‘propositional attitude’: we have some opinion about the truth value of a proposition2. Hence, a belief is canonically a mental state of the form ‘that P’, where P is some proposition. If this all feels a bit esoteric it’s because, in a sense, it is. Beliefs just are pretty weird things. I therefore want to argue that drawing an analogy between priors and beliefs is unnecessary and confusing.
Gelman3 argues that we should see priors as representing the relevant ‘information’ that bears on a problem. Further, our choice of prior will be constrained by computational tractability and mathematical convenience4. So what do we mean by information? Information could be seen to reflect what is known about a problem. This might seem like a semantic sleight of hand; I’m slipping into talk of knowledge to somehow dissolve the issues around belief. Indeed, there are important differences between knowledge and belief. Rather than being just an attitude, knowing is meant to satisfy some more stringent criteria. Important to a claim of knowledge is the idea of justification5. You don’t know something just because you believe it; you have to have some sort of justifying relationship with it. I might believe it’s raining outside but I don’t know it unless I look out of the window, whether or not it happens to be raining.
It is far preferable to think of priors as reflecting information, because doing so highlights this idea of justification. If our priors reflect beliefs at all then it’s only those beliefs that we can justify: those beliefs that satisfy criteria that have nothing to do with our personal mental states. Researchers are generally pretty familiar with having to justify claims to their peers. We, therefore, already have some reasonable grasp of what this ‘justification’ might practically look like.
Suppose you’re an experimenter, Jones, author of J-Theory. Your theory and experiments say that people will respond more quickly to red shapes than blue shapes. Your rival, Smith, has proposed S-Theory. They argue that people respond more quickly to blue shapes than red shapes. As well as you and Smith, there’s another group of researchers who find no difference between red and blue shapes.
Now suppose you want to do a new experiment with your red and blue shapes. You might very strongly believe that there is an effect in your favoured direction. However, if we take priors to be about ‘information’ then you would have to specify a prior that is different from your personal belief. The belief that you can justify, the thing you can claim to know, has to reflect all the information that bears on the problem. Given the inconsistencies in your blue-red-shape field, what you can claim to know is that the effect is either in your direction, Smith’s direction, or non-existent. Your prior on the difference between blue and red shapes might, for example, be a normal distribution centred on 0 with a standard deviation that reflects the variability between studies in the area. The point of this example is to show how a justifiable prior can and would differ from a personal belief.
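To make the Jones example a little more concrete, here is a minimal sketch in Python of the contrast between a ‘justifiable’ prior and a ‘personal belief’ prior. Everything numerical in it is an assumption made purely for illustration: the list of study effects is invented, the sign convention (red minus blue response time, so negative values favour J-Theory) is mine, and using the root-mean-square of the effects around 0 is just one simple way a standard deviation could ‘reflect the variability between studies’.

```python
from math import sqrt
from statistics import NormalDist

# HYPOTHETICAL effect estimates from earlier studies, in milliseconds.
# Convention (an assumption): effect = red RT minus blue RT, so a negative
# value means faster responses to red (J-Theory) and a positive value means
# faster responses to blue (S-Theory). These numbers are invented.
study_effects_ms = [-25.0, -10.0, 0.0, 5.0, 12.0, 30.0]


def prior_from_studies(effects):
    """Return a normal prior centred on 0 whose SD summarises study spread.

    The spread is measured around 0 rather than around the sample mean,
    because the prior itself is centred on 0.
    """
    sd = sqrt(sum(e ** 2 for e in effects) / len(effects))
    return NormalDist(mu=0.0, sigma=sd)


# The prior Jones could justify to peers, given the mixed literature.
justifiable_prior = prior_from_studies(study_effects_ms)

# The prior matching Jones's personal conviction (also hypothetical):
# a confident bet on a red advantage of about 20 ms.
personal_prior = NormalDist(mu=-20.0, sigma=5.0)

# Probability each prior assigns to an effect in Jones's favoured direction.
p_red_justifiable = justifiable_prior.cdf(0.0)
p_red_personal = personal_prior.cdf(0.0)

print(f"Justifiable prior: P(red faster) = {p_red_justifiable:.3f}")
print(f"Personal prior:    P(red faster) = {p_red_personal:.5f}")
```

The justifiable prior is symmetric about 0, so it favours neither Jones nor Smith, while the personal prior puts almost all of its mass on Jones’s favoured direction. Nothing hangs on the particular numbers; the sketch only illustrates that the two distributions can come apart.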
The importance of justification is often lost when we talk about priors as being like beliefs. Yet, justification is the very thing that allows our beliefs to climb to the heady heights of knowledge. However, I would not even advocate calling priors knowledge. Knowledge is still a state of individuals, and remains far removed from what priors actually look like. Priors are typically specified as probability distributions and represent a compromise among scientific information, computational tractability and mathematical convenience. It’s not clear to me that this is a remotely plausible way to cash out mental states like knowledge. It is, on the other hand, a tractable and sensible way to summarise the scientific information relevant to some problem.
In conclusion, I do not think that the mental states of individual researchers play a larger role in Bayesian statistics than in any other method. Bayesian statistics reflect beliefs to the extent that researchers are, after all, people with beliefs. All our methodological choices in some sense reflect our beliefs. What really matters is our justification, and this has nothing to do with us as subjects.
“To put it even more succinctly, ‘the model’, for a Bayesian, is the combination of the prior distribution and likelihood, each of which represents some compromise among scientific knowledge, mathematical convenience and computational tractability.”
– Gelman & Shalizi (2013)
Appendix: is this all semantic?
It’s sometimes tempting to see these sorts of debates as semantic and therefore, in some sense, trivial. I would wholeheartedly accept that these claims are semantic; they concern what we mean by ‘prior’, ‘belief’, etc. However, I have never understood in what sense this makes them trivial. After all, what it means to be good could be called equally semantic. I would not want to concede that this is a trivial matter. What our concepts mean, how they are used, and what follows from our definitions shape the way we conceive of what we do. I remain unconvinced that this is a trivial matter.
Where a proposition is whatever the thing is that sentences mean.
Gelman, A., & Shalizi, C. R. (2013). Philosophy and the practice of Bayesian statistics. British Journal of Mathematical and Statistical Psychology, 66(1), 8–38. http://doi.org/10.1111/j.2044-8317.2011.02037.x