Most of you will have used, or at least encountered, various "information criteria" when estimating a regression model, an ARIMA model, or a VAR model. These criteria provide us with a way of comparing alternative model specifications and selecting among them.
They're not test statistics. Rather, each is minus twice the maximized value of the underlying log-likelihood function, adjusted by a "penalty factor" that depends on the number of parameters being estimated. The more parameters, the more complicated the model, and the greater the penalty. So, for a given level of "fit", a more parsimonious model is rewarded over a more complex one. Changing the exact form of the penalty factor gives rise to a different information criterion.
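For example, the two best-known members of this family are Akaike's information criterion (AIC) and Schwarz's Bayesian criterion (BIC, also called SIC). Writing $\ln L(\hat{\theta})$ for the maximized log-likelihood, $k$ for the number of estimated parameters, and $n$ for the sample size:

$$
\mathrm{AIC} = -2\ln L(\hat{\theta}) + 2k, \qquad
\mathrm{BIC} = -2\ln L(\hat{\theta}) + k\ln(n).
$$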
However, did you ever stop to ask, "Why are these called information criteria?" And did you realize that these criteria - which are, after all, statistics - have quite different properties when it comes to the probability that they will select the correct model specification? In this respect they are typically biased, and some of them are not even consistent. (The little simulation below gives a taste of this.)
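To make that last point tangible, here's a minimal Monte Carlo sketch of my own (assuming a Gaussian linear model; the helper names are purely illustrative). Data are generated from a quadratic regression, polynomials of degree 0 through 5 are fitted by least squares, and we record how often AIC and BIC each pick the true degree:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_loglik(resid):
    # Maximized Gaussian log-likelihood, with sigma^2 at its MLE (RSS/n)
    n = resid.size
    sigma2 = resid @ resid / n
    return -0.5 * n * (np.log(2 * np.pi) + np.log(sigma2) + 1.0)

def select_degree(x, y, max_degree=5):
    # Return the polynomial degrees chosen by AIC and by BIC
    n = y.size
    aic, bic = [], []
    for d in range(max_degree + 1):
        X = np.vander(x, d + 1)                 # regressors: x^d, ..., x, 1
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        ll = gaussian_loglik(y - X @ beta)
        k = d + 2                               # d+1 coefficients plus sigma^2
        aic.append(-2.0 * ll + 2.0 * k)
        bic.append(-2.0 * ll + k * np.log(n))
    return int(np.argmin(aic)), int(np.argmin(bic))

# True model is quadratic (degree 2); tally each criterion's success rate
n_obs, reps = 200, 500
hits = {"AIC": 0, "BIC": 0}
for _ in range(reps):
    x = rng.uniform(-2.0, 2.0, n_obs)
    y = 1.0 + 0.5 * x - 0.8 * x**2 + rng.normal(0.0, 1.0, n_obs)
    d_aic, d_bic = select_degree(x, y)
    hits["AIC"] += (d_aic == 2)
    hits["BIC"] += (d_bic == 2)

print({crit: count / reps for crit, count in hits.items()})
```

In designs like this one, BIC's heavier penalty ($k\ln n$ versus $2k$) typically leads it to pick the true degree more often, while AIC over-fits with non-vanishing probability - a preview of the consistency issue raised above.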
This sounds like something that's worth knowing more about!