# Derivation of Bayes’ Theorem

I’m hoping, when I get round to it, to give a full explanation of Bayes’ Theorem, its use, and its different forms. For now, since I’ve just been formalising the derivations in preparation for a paper I’m writing, I thought I might as well type them up and share them, in case people want to have a look.

So we begin with a basic identity of probability theory (in effect, the definition of conditional probability, rearranged):

$P(A \cap B) = P(B) \cdot P(A|B)$

This is to say that the probability of both A and B being the case is the probability of B being the case multiplied by the probability of A given B. One can also put this the other way round:

$P(A \cap B) = P(A) \cdot P(B|A)$

Since both these right-hand sides are equal to $P(A \cap B)$, it follows that:

$P(B) \cdot P(A|B) = P(A) \cdot P(B|A)$

Now, if we divide both sides by $P(B)$, we get:

$P(A|B) = \frac{P(A) \cdot P(B|A)}{P(B)}$

This, in short, is Bayes’ Theorem, which says that the probability of A given B is equal to the probability of A, multiplied by the probability of B given A, divided by the probability of B.
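As a quick sanity check, the theorem can be verified numerically. The joint probabilities below are made-up illustrative values, not anything from the derivation itself:

```python
# Verify Bayes' Theorem on an arbitrary (illustrative) joint distribution.
p_a_and_b = 0.12        # P(A and B)
p_a_and_not_b = 0.18    # P(A and not-B)
p_not_a_and_b = 0.28    # P(not-A and B)

p_a = p_a_and_b + p_a_and_not_b   # P(A) = 0.30
p_b = p_a_and_b + p_not_a_and_b   # P(B) = 0.40
p_b_given_a = p_a_and_b / p_a     # P(B|A), from the identity above
p_a_given_b_direct = p_a_and_b / p_b   # P(A|B), computed directly

# Bayes' Theorem reconstructs P(A|B) from P(A), P(B|A) and P(B):
p_a_given_b_bayes = p_a * p_b_given_a / p_b
assert abs(p_a_given_b_direct - p_a_given_b_bayes) < 1e-12
```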

Now, to get to the odds form, we need a few more steps. Firstly, we note (by the law of total probability) that:

$P(B) = P(A) \cdot P(B|A) + P(\sim A) \cdot P(B|\sim A)$

And so we can deduce that:

$P(A|B) = \frac{P(A) \cdot P(B|A)}{P(A) \cdot P(B|A) + P(\sim A) \cdot P(B|\sim A)}$
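Again as a sketch with illustrative numbers, this expanded form can be checked against the simpler statement of the theorem:

```python
# Check that the expanded denominator really equals P(B)
# (the law of total probability), using illustrative values.
p_a = 0.25
p_not_a = 1 - p_a
p_b_given_a = 0.8
p_b_given_not_a = 0.1

p_b = p_a * p_b_given_a + p_not_a * p_b_given_not_a  # total probability

# Expanded form of Bayes' Theorem:
p_a_given_b = (p_a * p_b_given_a) / (p_a * p_b_given_a + p_not_a * p_b_given_not_a)

# It agrees with the P(B)-denominator form:
assert abs(p_a_given_b - p_a * p_b_given_a / p_b) < 1e-12
```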

The odds form allows us to compare $P(A|B)$ and $P(\sim A|B)$ directly. To get there, we can go through the whole process again, this time using $\sim A$ in place of $A$. This will eventually give us:

$P(\sim A|B) = \frac{P(\sim A) \cdot P(B|\sim A)}{P(\sim A) \cdot P(B|\sim A) + P(A) \cdot P(B|A)}$

Dividing $P(A|B)$ by $P(\sim A|B)$ then gives us the following:

$\frac{P(A|B)}{P(\sim A|B)} = \frac{(\frac{P(A) \cdot P(B|A)}{P(A) \cdot P(B|A) + P(\sim A) \cdot P(B|\sim A)})}{(\frac{P(\sim A) \cdot P(B|\sim A)}{P(\sim A) \cdot P(B|\sim A) + P(A) \cdot P(B|A)})}$

This may look confusing, but note that the denominators of the top and bottom fractions are the same; if we multiply top and bottom by that common denominator, we get the much simpler equation:

$\frac{P(A|B)}{P(\sim A|B)} = \frac{P(A) \cdot P(B|A)}{P(\sim A) \cdot P(B|\sim A)}$

Separating the terms on the right-hand side, we get:

$\frac{P(A|B)}{P(\sim A|B)} = \frac{P(A)}{P(\sim A)} \times \frac{P(B|A)}{P(B|\sim A)}$
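As one more numerical sanity check (again with made-up values), the posterior odds really do factor into prior odds times likelihood ratio:

```python
# Check the odds form: posterior odds = prior odds * likelihood ratio.
# All numbers here are illustrative.
p_a = 0.25
p_not_a = 1 - p_a
p_b_given_a = 0.8
p_b_given_not_a = 0.1

p_b = p_a * p_b_given_a + p_not_a * p_b_given_not_a

posterior_odds = (p_a * p_b_given_a / p_b) / (p_not_a * p_b_given_not_a / p_b)
prior_odds = p_a / p_not_a
likelihood_ratio = p_b_given_a / p_b_given_not_a

assert abs(posterior_odds - prior_odds * likelihood_ratio) < 1e-9
```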

And you now have the odds form of Bayes’ Theorem! Perfect. As I said, I won’t go into its use or anything here: this is purely to provide the formal derivation for future reference. I hope you won’t be too disappointed, therefore, if you find that there is nothing at all interesting to you in this post.

Calum Miller

## 3 thoughts on “Derivation of Bayes’ Theorem”

1. By the way I love Bayes’ theorem and have used it to argue for the resurrection and to show one of Richard Dawkins’s fallacies.

2. thinkingchristian

Calum, can you show how you got to your first equation in the derivation of the odds form, the first one with ~A in it? Thanks.

Thanks for the comments. Thinkingchristian, the probability P(B) can be looked at in terms of a probability tree. This one I’ve found illustrates it quite nicely: http://www.gcsemathstutor.com/images/information/info-tree1.jpg

Suppose P(B) is the probability of getting a blue on the 2nd choice, and P(A) is the probability of getting a blue on the 1st choice. There are two ways we can get a blue on the 2nd choice: by getting a green and then a blue, or by getting a blue and then another blue. P(B) is the sum of the probabilities of these different routes; in this example, there are only these two.

The probability of getting a blue on the 2nd choice via a blue on the 1st choice is P(A) multiplied by P(B|A).

In contrast, the probability of getting a blue on the 2nd choice via a green on the 1st choice is P(~A) (i.e. the probability of getting a green) multiplied by the probability of getting a blue after a green, which is P(B|~A). Thus, it is equal to P(~A) x P(B|~A).

The sum of these is the total probability of getting a blue on the 2nd choice, and so is equal to [P(A) x P(B|A)] + [P(~A) x P(B|~A)]. And there you have the equation! Hope this helps.
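For anyone who wants to play with the tree, here is a small sketch in Python with hypothetical bead counts (the linked image uses its own numbers):

```python
# Illustrative tree: beads drawn without replacement.
# A = "blue on 1st choice", B = "blue on 2nd choice".
# The counts below are hypothetical, chosen just to make the sums easy.
p_a = 3 / 7                 # P(blue first): say 3 blue out of 7 beads
p_not_a = 4 / 7             # P(green first)
p_b_given_a = 2 / 6         # blue second, after a blue was removed
p_b_given_not_a = 3 / 6     # blue second, after a green was removed

# Total probability: sum over the two branches ending in "blue second".
p_b = p_a * p_b_given_a + p_not_a * p_b_given_not_a

# With these counts, P(B) works out to 3/7, the same as P(A).
assert abs(p_b - 3 / 7) < 1e-12
```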