Student’s Guide to Bayesian Statistics

Chapter 4: Likelihoods

In Bayesian inference, we wish to determine a posterior belief in each set of parameter values. This means that, rather than holding the parameters constant and varying the data, we hold the data constant and vary the parameter values. Confusingly, when viewed this way, as a function of the parameters, our probability model no longer behaves as a valid probability distribution. In particular, it no longer sums (for discrete parameters) or integrates (for continuous parameters) to 1 across all parameter values. To acknowledge this distinction in Bayesian inference, we avoid using the term probability distribution in favour of likelihood.
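A minimal sketch in Python makes this concrete, assuming a binomial model; the particular numbers (10 trials, 7 observed successes, and the grid over θ) are illustrative choices of ours, not fixed by the text:

```python
import numpy as np
from scipy.stats import binom

n, k = 10, 7  # 10 trials with 7 successes observed (illustrative data)

# Holding theta fixed, the binomial pmf is a valid probability
# distribution: it sums to 1 over all possible data values.
theta = 0.5
print(sum(binom.pmf(x, n, theta) for x in range(n + 1)))  # approx 1.0

# Holding the data fixed instead, the same expression viewed as a
# function of theta is the likelihood. It does not integrate to 1.
thetas = np.linspace(0.0, 1.0, 1001)
likelihood = binom.pmf(k, n, thetas)
print(np.sum(likelihood) * (thetas[1] - thetas[0]))  # approx 1/(n+1), about 0.09
```

The second printed value is roughly 1/(n + 1), about 0.09 here, far from 1: the likelihood is not a probability distribution over θ.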

Frequentist inference also proceeds from a likelihood function. However, rather than using Bayes’ rule to convert this function into a valid probability distribution, Frequentists determine the parameter values that maximise the likelihood. Accordingly, these parameter estimates are known as maximum likelihood estimators and, because they maximise the likelihood, they are the values of the model parameters under which the observed data sample was most probable. Bayesian posterior distributions, by contrast, can be viewed as a weighted average of the likelihood and the prior.
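Continuing the hypothetical binomial example above, a short sketch of maximum likelihood estimation by grid search follows; for this model the analytic answer is known to be k/n, so the numerical result can be checked against it:

```python
import numpy as np
from scipy.stats import binom

n, k = 10, 7  # same illustrative data as before

# Evaluate the likelihood over a fine grid of candidate theta values
# and select the value that maximises it.
thetas = np.linspace(0.001, 0.999, 999)
likelihood = binom.pmf(k, n, thetas)
theta_mle = thetas[np.argmax(likelihood)]

print(theta_mle)  # approx 0.7, matching the analytic MLE k/n
```

In practice one would typically maximise the log-likelihood with an optimiser rather than a grid, but the grid version shows the idea most directly.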

The equivalence relation

The likelihood of θ for a particular data sample equals the probability of that data sample given that value of θ:

L(θ | data) = Pr(data | θ).

We call this the equivalence relation, since a likelihood of θ for a particular data sample is equivalent to the probability of that data sample for that value of θ.
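Numerically, the equivalence relation is simply the observation that the same expression yields both quantities. Again using the hypothetical binomial numbers from above:

```python
from scipy.stats import binom

n, k, theta = 10, 7, 0.3

# Read one way: Pr(data = 7 | theta = 0.3), a probability of the data.
# Read the other way: L(theta = 0.3 | data = 7), a likelihood of theta.
# Both are the same number, computed by the same expression.
print(binom.pmf(k, n, theta))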