Bias Correction of Maximum Likelihood Estimators

12/09/2019 3-minute read

Introduction

In statistics, the term bias commonly denotes a systematic error that can often be identified and even corrected. In statistical inference, bias is an important property of estimators.

As I discussed in this post, the maximum likelihood method is widely used in practice, mainly because of its intuitive appeal and the properties it enjoys.

However, the statistical properties that make the maximum likelihood estimators (MLEs) attractive are mostly asymptotic, which means they hold for large sample sizes. For instance, the MLEs have a bias of order \(\mathcal{O}(n^{-1})\), where \(n\) is the sample size, i.e., this bias decreases as the sample size increases. Thus, for small or moderate sample sizes the MLEs can be highly biased.
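A standard textbook illustration of a bias of order \(\mathcal{O}(n^{-1})\) (added here as an example, assuming an i.i.d. sample \(X_1, \ldots, X_n\) from a normal distribution with mean \(\mu\) and variance \(\sigma^2\)) is the MLE of the variance: \[\widehat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2, \qquad \mathbb{E}\left(\widehat{\sigma}^2\right) = \frac{n-1}{n}\,\sigma^2, \qquad \mathbb{E}\left(\widehat{\sigma}^2\right) - \sigma^2 = -\frac{\sigma^2}{n}.\] With \(n = 5\) the MLE underestimates \(\sigma^2\) by 20% on average, while with \(n = 100\) the shortfall is only 1%.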

Definition

Let \(\widehat{\theta}\) be the MLE of the parameter \(\theta\). The bias of the estimator is given by \[\mathcal{B}\left(\widehat{\theta}\right) = \mathbb{E}\left({\widehat{\theta}}\right) - \theta\] where \(\mathbb{E}(\cdot)\) denotes the expectation with respect to the sampling distribution of the estimator \(\widehat{\theta}\).
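To make the definition concrete, the following is a minimal Python sketch that approximates the bias by Monte Carlo simulation. The exponential model with rate \(\lambda\), the function names, the seed and the sample size are all assumptions chosen for illustration. For this model the MLE is \(\widehat{\lambda} = 1/\bar{X}\) and its exact bias is \(\lambda/(n-1)\), so the simulation can be checked against a known value.

```python
import random
import statistics


def exp_rate_mle(sample):
    """MLE of the exponential rate: the reciprocal of the sample mean."""
    return 1.0 / statistics.fmean(sample)


def monte_carlo_bias(true_rate, n, replications=20000, seed=1):
    """Approximate B(theta_hat) = E(theta_hat) - theta by simulation."""
    rng = random.Random(seed)
    estimates = [
        exp_rate_mle([rng.expovariate(true_rate) for _ in range(n)])
        for _ in range(replications)
    ]
    return statistics.fmean(estimates) - true_rate


# Illustrative settings (assumptions, not values from the post).
true_rate, n = 2.0, 10
simulated_bias = monte_carlo_bias(true_rate, n)
exact_bias = true_rate / (n - 1)  # closed form available for this toy model
print(f"simulated bias: {simulated_bias:.3f}, exact bias: {exact_bias:.3f}")
```

The expectation in the definition is taken over repeated samples, which is exactly what the loop over replications imitates.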

Usually, the MLE cannot be written as an explicit formula, so we obtain a numerical estimate of \({\widehat{\theta}}\) by maximizing the log-likelihood function. In these cases, it is natural to think that the bias of \({\widehat{\theta}}\) cannot be found either; however, there are at least three approaches in the literature for maximum likelihood bias correction.
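As an illustration of an MLE without an explicit formula, the sketch below fits a Weibull distribution, whose shape parameter has to be found numerically. It is a minimal example under stated assumptions: the Weibull model, the bisection solver, the function names and the simulated data are all chosen for illustration. It solves the profile score equation for the shape and then plugs the result into the closed-form expression for the scale.

```python
import math
import random


def weibull_profile_score(shape, sample):
    """Derivative of the Weibull profile log-likelihood in the shape, divided by n."""
    powered = [x ** shape for x in sample]
    weighted_log = sum(p * math.log(x) for p, x in zip(powered, sample))
    mean_log = sum(math.log(x) for x in sample) / len(sample)
    return 1.0 / shape + mean_log - weighted_log / sum(powered)


def weibull_mle(sample, iterations=100):
    """Find the shape by bisection on the profile score, then the scale in closed form."""
    lo, hi = 1e-3, 1.0
    while weibull_profile_score(hi, sample) > 0:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if weibull_profile_score(mid, sample) > 0:  # score is decreasing in the shape
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = (sum(x ** shape for x in sample) / len(sample)) ** (1.0 / shape)
    return shape, scale


# Simulated data (assumed true values: scale 2.0, shape 1.5).
rng = random.Random(42)
data = [rng.weibullvariate(2.0, 1.5) for _ in range(200)]
shape_hat, scale_hat = weibull_mle(data)
print(f"shape: {shape_hat:.3f}, scale: {scale_hat:.3f}")
```

Because the estimate comes out of a root-finding routine rather than a formula, its expectation cannot be computed directly, which is what motivates the approaches discussed next.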

In the next section, I briefly describe these approaches. Interested readers may find more technical details in my undergraduate research here.

The bias correction approaches

By using a second-order Taylor expansion of the score vector, Cox and Snell (1968) developed a methodology that yields analytical expressions for the bias of MLEs. This method has been extensively explored in the literature for different statistical models. Professor Josmar Mazucheli and I have written several papers deriving analytical expressions for the bias of the parameter estimators of different statistical models. The main highlight of this partnership is the paper mle.tools: An R Package for Maximum Likelihood Bias Correction, published in The R Journal, where we evaluated the efficiency of the mle.tools R package for Cox-Snell bias correction on several probability distributions.
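The idea of the analytical correction can be sketched on a toy case. For an exponential distribution with rate \(\lambda\), the Cox-Snell expression reduces to a bias of \(\lambda/n\), so the corrected estimator is \(\widehat{\lambda} - \widehat{\lambda}/n\). The Python code below is an illustration only: it is not code from mle.tools, and the model, function names and simulation settings are assumptions.

```python
import random
import statistics


def exp_rate_mle(sample):
    """MLE of the exponential rate: the reciprocal of the sample mean."""
    return 1.0 / statistics.fmean(sample)


def cox_snell_corrected(sample):
    """Subtract the analytical bias lambda / n, evaluated at the MLE."""
    n = len(sample)
    rate_hat = exp_rate_mle(sample)
    estimated_bias = rate_hat / n  # bias expression with lambda replaced by its MLE
    return rate_hat - estimated_bias


# Small simulation comparing both estimators (illustrative settings).
rng = random.Random(2019)
true_rate, n, replications = 2.0, 10, 20000
mle_draws, corrected_draws = [], []
for _ in range(replications):
    sample = [rng.expovariate(true_rate) for _ in range(n)]
    mle_draws.append(exp_rate_mle(sample))
    corrected_draws.append(cox_snell_corrected(sample))

mle_bias = statistics.fmean(mle_draws) - true_rate
corrected_bias = statistics.fmean(corrected_draws) - true_rate
print(f"bias of MLE: {mle_bias:.3f}, bias after correction: {corrected_bias:.3f}")
```

In this particular model the corrected estimator happens to be exactly unbiased; in general the correction only removes the \(\mathcal{O}(n^{-1})\) term.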

A second approach was proposed by Firth (1993) and is known as preventive, because the author proposed modifying the score vector before obtaining the maximum likelihood estimates. For technical details on this methodology I recommend the excellent book An Introduction to Bartlett Correction and Bias Reduction by professors Gauss and Cribari of UFPE.
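A minimal sketch of the preventive idea, again on a toy exponential model with rate \(\lambda\) (the model, function names and data are assumptions for illustration): in the one-parameter case the modified score can be written as \(U^{*}(\lambda) = U(\lambda) - I(\lambda)\,\mathcal{B}(\lambda)\), where \(U\) is the score, \(I\) the expected information and \(\mathcal{B}\) the \(\mathcal{O}(n^{-1})\) bias. Here \(U(\lambda) = n/\lambda - \sum x_i\), \(I(\lambda) = n/\lambda^2\) and \(\mathcal{B}(\lambda) = \lambda/n\), so the root of \(U^{*}\) is \((n-1)/\sum x_i\). The code solves the equation numerically and compares the result with that closed form.

```python
import random


def modified_score(rate, sample):
    """Score of the exponential model minus information times first-order bias."""
    n, total = len(sample), sum(sample)
    score = n / rate - total        # U(lambda)
    information = n / rate ** 2     # expected Fisher information I(lambda)
    first_order_bias = rate / n     # O(1/n) bias of the MLE in this model
    return score - information * first_order_bias


def firth_estimate(sample, iterations=100):
    """Root of the modified score, found by bisection (requires n > 1)."""
    lo, hi = 1e-8, 1.0
    while modified_score(hi, sample) > 0:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if modified_score(mid, sample) > 0:  # modified score is decreasing in the rate
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Simulated data (assumed true rate 2.0, n = 10).
rng = random.Random(7)
sample = [rng.expovariate(2.0) for _ in range(10)]
mle = len(sample) / sum(sample)
firth = firth_estimate(sample)
closed_form = (len(sample) - 1) / sum(sample)
print(f"MLE: {mle:.4f}, modified-score estimate: {firth:.4f}, closed form: {closed_form:.4f}")
```

Unlike the corrective approach, no MLE is computed first: the bias adjustment is built into the estimating equation itself.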

Finally, it is worth mentioning that resampling methods such as the bootstrap and the jackknife are computational alternatives that can be used for bias estimation. A practical example can be found in my work here.
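Both resampling estimators of the bias can be sketched in a few lines of Python. This is a minimal illustration under assumptions (exponential data, the MLE of the rate as the estimator, hypothetical function names), not the code from the work mentioned above. The bootstrap estimates the bias as the average of the estimates over resamples minus the original estimate; the jackknife uses \((n-1)\) times the difference between the average leave-one-out estimate and the original estimate.

```python
import random
import statistics


def exp_rate_mle(sample):
    """MLE of the exponential rate: the reciprocal of the sample mean."""
    return 1.0 / statistics.fmean(sample)


def bootstrap_bias(sample, estimator, replications=2000, seed=0):
    """Nonparametric bootstrap: mean of resampled estimates minus the original estimate."""
    rng = random.Random(seed)
    n = len(sample)
    theta_hat = estimator(sample)
    resampled = [estimator(rng.choices(sample, k=n)) for _ in range(replications)]
    return statistics.fmean(resampled) - theta_hat


def jackknife_bias(sample, estimator):
    """Jackknife: (n - 1) * (mean of leave-one-out estimates - original estimate)."""
    n = len(sample)
    theta_hat = estimator(sample)
    leave_one_out = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    return (n - 1) * (statistics.fmean(leave_one_out) - theta_hat)


# Simulated data (assumed true rate 2.0, n = 15).
rng = random.Random(123)
sample = [rng.expovariate(2.0) for _ in range(15)]
rate_hat = exp_rate_mle(sample)
boot_bias = bootstrap_bias(sample, exp_rate_mle)
jack_bias = jackknife_bias(sample, exp_rate_mle)
print(f"MLE: {rate_hat:.3f}")
print(f"bootstrap-corrected: {rate_hat - boot_bias:.3f}")
print(f"jackknife-corrected: {rate_hat - jack_bias:.3f}")
```

Neither function needs an analytical expression for the bias, only the ability to recompute the estimator many times, which is why these methods are attractive when the MLE is obtained numerically.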