Sunday, December 27, 2015

Bounds for the Pearson Correlation Coefficient

The correlation measure that students typically first encounter is actually Pearson's product-moment correlation coefficient. This coefficient is simply a standardized version of the covariance between two random variables (say, X and Y):

           ρXY = cov.(X,Y) / [s.d.(X) s.d.(Y)] ,                                                  (1)

where "s.d." denotes "standard deviation".

In the case of sample data, this formula will be:

          ρXY = Σ[(Xi - X*)(Yi - Y*)] / {[Σ(Xi - X*)^2][Σ(Yi - Y*)^2]}^(1/2) ,                 (2)

where the summations run from 1 to n (the sample size); and X* and Y* are the sample averages of the X and Y variables.
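
As a quick check on formula (2), here is a minimal sketch in plain Python that evaluates it term by term. The function name pearson_r and the five data points are made up for illustration; they are not from any particular data set.

```python
import math

def pearson_r(x, y):
    """Sample correlation computed term by term from formula (2)."""
    n = len(x)
    x_bar = sum(x) / n  # sample average of X (X* in the text)
    y_bar = sum(y) / n  # sample average of Y (Y* in the text)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(x, y)
print(r)  # 0.8
```

For these numbers the cross-product sum is 8 and both sums of squares are 10, so the ratio is 8/10.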

Scaling the covariance in this way to create the correlation coefficient ensures that (i) the latter is unitless; and (ii) it takes values in the interval [-1, +1]. The first of these two properties facilitates meaningful comparisons of correlations involving data measured in different units. The second property provides a metric that enables us to think about the "degree" of correlation in a meaningful way. (In contrast, a covariance can take any real value - there are no upper or lower bounds.)
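
Property (i) is easy to see numerically. The sketch below uses made-up height and weight figures (the variable names and values are assumptions for illustration only) and re-expresses height in centimetres rather than metres. The covariance is multiplied by 100, while the correlation is unchanged.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    """Sample covariance, using the divisor n - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def corr(x, y):
    """Covariance scaled by the two standard deviations, as in (1)."""
    return cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))

# Hypothetical data: height in metres, weight in kilograms
height_m = [1.60, 1.72, 1.68, 1.81, 1.75]
weight_kg = [55.0, 70.0, 63.0, 82.0, 74.0]
height_cm = [100.0 * h for h in height_m]  # same heights, different units

cov_m = cov(height_m, weight_kg)
cov_cm = cov(height_cm, weight_kg)
r_m = corr(height_m, weight_kg)
r_cm = corr(height_cm, weight_kg)

print(cov_m, cov_cm)  # the covariance depends on the units
print(r_m, r_cm)      # the correlation does not
```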

Result (i) above is obvious. Result (ii) can be established in a variety of ways.

(a)  If you're familiar with the Cauchy-Schwarz inequality, the result that -1 ≤ ρ ≤ 1 is immediate.

(b)  If you like working with vectors, then it's easy to show that the sample version of ρ is the cosine of the angle between two vectors: the X data and the Y data, each expressed as deviations from its sample average. As cos(θ) is bounded below by -1 and above by +1 for any θ, we have our result for the range of ρ right away. See this post by Pat Ballew for access to the proof.

(c)  However, what about a proof that requires even less background knowledge? Suppose that you're a student who knows how to solve for the roots of a quadratic equation, and who knows a couple of basic results relating to variances. Then, proving that  -1 ≤ ρ ≤ 1 is still straightforward:

Let Z = X + tY, for any scalar, t. Note that var.(Z) = t^2 var.(Y) + 2t cov.(X,Y) + var.(X) ≥ 0.

Or, using obvious notation, at^2 + bt + c ≥ 0.

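Because this inequality holds for every real t, and the coefficient on the squared term is positive as long as Y is not a constant, the quadratic can never cross the horizontal axis. It must have either one (repeated) real root or no real roots, and this in turn implies that b^2 - 4ac ≤ 0.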

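Recalling that a = var.(Y); b = 2cov.(X,Y); and c = var.(X), the last inequality becomes 4[cov.(X,Y)]^2 ≤ 4var.(X)var.(Y). Dividing both sides by 4var.(X)var.(Y), which is positive provided that neither variable is a constant, gives ρ^2 ≤ 1. This is the result that -1 ≤ ρ ≤ 1.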
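
The steps in this argument can be checked numerically. The following is only a sketch under stated assumptions: the data are simulated (the seed, sample size, and the 0.5 slope are arbitrary choices), and sample variances and covariances stand in for their population counterparts. It confirms that var.(X + tY) matches the quadratic in t, that it is never negative, and that the discriminant is non-positive.

```python
import random

random.seed(0)  # arbitrary seed, for reproducibility only
n = 200
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [0.5 * xi + random.gauss(0.0, 1.0) for xi in x]

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Sample covariance, using the divisor n - 1."""
    mu, mv = mean(u), mean(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / (len(u) - 1)

# Coefficients of the quadratic a*t^2 + b*t + c
a = cov(y, y)        # var.(Y)
b = 2.0 * cov(x, y)  # 2cov.(X,Y)
c = cov(x, x)        # var.(X)

def var_z(t):
    """Variance of Z = X + tY, computed directly from the data."""
    z = [xi + t * yi for xi, yi in zip(x, y)]
    return cov(z, z)

ts = [k / 10.0 for k in range(-50, 51)]  # grid of t values from -5 to 5
max_gap = max(abs(var_z(t) - (a * t * t + b * t + c)) for t in ts)
min_var = min(var_z(t) for t in ts)

disc = b * b - 4.0 * a * c          # discriminant of the quadratic
rho = cov(x, y) / (a * c) ** 0.5    # sample correlation

print(max_gap)   # close to zero: the two expressions agree
print(min_var)   # non-negative
print(disc)      # non-positive
print(rho)       # between -1 and +1
```

Note that disc equals 4ac(ρ^2 - 1), so the sign of the discriminant and the bound on ρ are two ways of stating the same fact.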

A complete version of this proof is provided by David Darmon here.
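
Returning to the vector argument in (b), the link with the cosine can also be illustrated in a few lines. This is a minimal sketch with made-up data, in which each series is first expressed as deviations from its own sample average. The helper names centred and dot are hypothetical, chosen only for this example.

```python
import math

def centred(v):
    """Return v expressed as deviations from its sample average."""
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(p * q for p, q in zip(u, v))

# Hypothetical illustrative data
x = [2.0, 4.0, 4.0, 6.0, 9.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]

u, v = centred(x), centred(y)

# Cosine of the angle between the two centred vectors; this is
# algebraically the same expression as formula (2).
cos_theta = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
theta_deg = math.degrees(math.acos(cos_theta))

print(cos_theta, theta_deg)
```

A correlation of +1 corresponds to an angle of zero between the centred vectors, a correlation of -1 to an angle of 180 degrees, and zero correlation to a right angle.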


© 2015, David E. Giles