We frequently compare the way things are with the way things ought to be. Good ice cream should be inexpensive, if not a free public good, but it is often quite expensive. Exercise should be something we all strive for, since it makes us feel good and brings health benefits; however, it is often very difficult to find the motivation to make it to the gym. The conflict between an ideal and an observed reality is present in everyday life, and statistics is no exception. Such a conflict exists in the interpretation of probability, in the comparison between the Bayesian approach and the Frequentist approach.
These two approaches, or philosophies, are the two arms of inferential statistics, the branch of statistics that allows generalizations to be made about an entire population based on observations of a sample drawn from it. This differs from descriptive statistics, which summarize a data set that has actually been collected. Inferential methods allow conclusions to be drawn about much larger sets of information without observing them in full, by taking information from a number of samples and inferring an outcome. An example would be estimating the average height of all adult women in South America by taking representative samples from a number of regions of the continent, without having to measure every woman over the age of 18. As you can imagine, this makes life considerably easier.
Frequentist vs. Bayesian
Frequentists rely on what is observed, the “what is”, when determining the probability of an event. To a frequentist, probability is the long-run frequency with which an event occurs over repeated random trials. Rooted in these long-run frequencies across repeated samples, frequentist methods produce p-values: the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. A p-value is not the probability that a positive result is false; the false positive rate is instead controlled by the significance threshold chosen before the experiment.
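Both ideas in the paragraph above can be illustrated with a coin. What follows is a minimal Python sketch rather than a definitive implementation: the function names `long_run_frequency` and `binomial_p_value`, the fair-coin null hypothesis, and the scenario of 60 heads in 100 flips are all assumptions chosen for illustration.

```python
import random
from math import comb


def long_run_frequency(n_flips: int, p_heads: float = 0.5, seed: int = 0) -> float:
    """Fraction of heads observed in n_flips simulated coin flips."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips


def binomial_p_value(heads: int, flips: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= heads) for X ~ Binomial(flips, p_null)."""
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )


# The observed frequency settles near 0.5 as the number of flips grows.
for n in (10, 1_000, 100_000):
    print(n, long_run_frequency(n))

# Probability of seeing 60 or more heads in 100 flips if the coin is fair.
p_value = binomial_p_value(heads=60, flips=100)
print(f"p-value = {p_value:.4f}")
```

The p-value here answers a narrow question: how surprising would this many heads be if the coin were fair? It says nothing directly about the probability that the coin is biased.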
For Bayesians, probability is a more general concept, one that allows non-repeatable events to have probabilities assigned to them. Probability expresses the certainty or uncertainty of an event; Bayesians root it in degrees of belief and logical support for a given outcome, the “what should be”. A Bayesian starts with a prior probability, an estimate informed by past observations or existing knowledge, and then applies Bayes' theorem to update it with the observed data, narrowing the probability distribution around the parameter. The use of prior probabilities is the frequentists' main objection to the Bayesian approach: a poorly chosen prior can significantly distort the model's conclusions, especially when data are scarce.
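To make the role of the prior concrete, here is a small Python sketch of a Bayesian update for a coin's probability of heads. It assumes a Beta prior, for which Bayes' theorem reduces to adding the observed counts to the prior's parameters; that choice of prior, the function names `update_beta_prior` and `beta_mean`, and the data (7 heads in 10 flips) are illustrative assumptions, not something the discussion above prescribes.

```python
def update_beta_prior(alpha: float, beta: float, heads: int, tails: int) -> tuple[float, float]:
    """Posterior Beta parameters after observing the given heads and tails.

    With a Beta(alpha, beta) prior and binomial data, Bayes' theorem gives
    a Beta(alpha + heads, beta + tails) posterior.
    """
    return alpha + heads, beta + tails


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


heads, tails = 7, 3  # hypothetical observed data

# A flat prior, Beta(1, 1), lets the data dominate.
flat_posterior = update_beta_prior(1, 1, heads, tails)

# A strong prior that insists heads is rare, Beta(2, 50), drags the estimate down.
skewed_posterior = update_beta_prior(2, 50, heads, tails)

print(f"flat prior   -> posterior mean {beta_mean(*flat_posterior):.3f}")
print(f"skewed prior -> posterior mean {beta_mean(*skewed_posterior):.3f}")
```

With the same ten flips, the two priors lead to very different estimates, which is the frequentist objection in miniature. With much more data, the two posteriors would move closer together.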
As Jake VanderPlas explained in his 2014 talk at the SciPy conference, frequentists treat a model and its parameters as fixed while the data vary around the model. Bayesians treat the observed data as fixed while the model parameters are uncertain and shift according to the data. He illustrated the dichotomy further with these two statements of how each school describes the uncertainty in its findings:
Frequentist: “If this experiment is repeated many times, in 95% of these cases the computed confidence interval will contain the true θ”
Bayesian: “Given our observed data, there is a 95% probability that the value of θ lies within the credible region”
The difference lies in what varies and what remains fixed. For the frequentist, the interval varies from one repetition of the experiment to the next, while the true value of θ is fixed. For the Bayesian, the credible region is fixed once the data have been observed, and θ is the quantity treated as uncertain.
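The frequentist statement can be checked by simulation. The sketch below, in Python, repeats a hypothetical experiment many times and counts how often the computed interval contains the true θ. The normal model with known standard deviation, the parameter values, and the function name `confidence_interval_coverage` are assumptions made for illustration only.

```python
import random
from math import sqrt


def confidence_interval_coverage(
    true_theta: float = 10.0,
    sigma: float = 2.0,
    sample_size: int = 30,
    n_experiments: int = 5000,
    z: float = 1.96,  # normal quantile for a 95% interval
    seed: int = 0,
) -> float:
    """Fraction of repeated experiments whose 95% interval contains true_theta."""
    rng = random.Random(seed)
    half_width = z * sigma / sqrt(sample_size)  # same width in every experiment
    hits = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(true_theta, sigma) for _ in range(sample_size)]
        centre = sum(sample) / sample_size  # the interval's centre moves with the data
        if centre - half_width <= true_theta <= centre + half_width:
            hits += 1
    return hits / n_experiments


coverage = confidence_interval_coverage()
print(f"intervals containing the true theta: {coverage:.1%}")
```

Note what moves in the loop: `true_theta` never changes, while each experiment produces a new interval. The 95% describes the procedure across repetitions, not any single interval.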
While the debate between the two philosophies rages on, it is important to separate the school of thought from the actual method of practice. With big data, we can begin to understand what the prior might be for a data set: prior quantities can be estimated from the data using frequentist methods and then used to carry out Bayesian calculations on the data.
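Estimating a prior from data in this way is often called empirical Bayes. The Python sketch below shows one simple version under stated assumptions: it fits a Beta prior to a set of historical success rates by matching their mean and variance, then updates that prior with a new group's results. The historical rates, the new group's counts, and the function name `fit_beta_by_moments` are made up for illustration.

```python
from statistics import mean, variance


def fit_beta_by_moments(rates: list[float]) -> tuple[float, float]:
    """Beta(alpha, beta) parameters whose mean and variance match the rates."""
    m = mean(rates)
    v = variance(rates)  # sample variance
    if v <= 0 or v >= m * (1 - m):
        raise ValueError("rates are not compatible with a Beta distribution")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


# Hypothetical success rates observed in earlier, comparable groups.
historical_rates = [0.20, 0.25, 0.30, 0.35, 0.40]
alpha, beta = fit_beta_by_moments(historical_rates)
prior_mean = alpha / (alpha + beta)

# A new group with little data: 9 successes in 10 trials.
successes, trials = 9, 10
raw_rate = successes / trials
posterior_mean = (alpha + successes) / (alpha + beta + trials)

print(f"prior mean     {prior_mean:.3f}")
print(f"raw rate       {raw_rate:.3f}")
print(f"posterior mean {posterior_mean:.3f}")
```

The posterior estimate lands between the data-derived prior mean and the new group's raw rate, so a small sample is pulled toward what the larger body of data suggests.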
To summarize the distinction: the Bayesian computes a fixed region that is considered credible given the observed data, and then states the probability that the parameter lies inside that region. The frequentist treats the model parameter as fixed and asks how reliably the procedure, repeated over many experiments, generates an interval that contains it.