Monday, December 28, 2015

Correlation Isn't Necessarily Transitive

If X is correlated with Y, and Y is correlated with Z, does it follow that X and Z are correlated?

No, not necessarily. That is, the relationship of correlation isn't necessarily transitive.

In a blog post from last year, the Fields Medallist Terence Tao discusses the question "When is Correlation Transitive?", and provides a thorough mathematical answer.

He also provides a simple example of correlation intransitivity, involving three unit vectors: u = (1, 0), v = (1/√2, 1/√2), and w = (0, 1) (worked through in the comments below).
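The example can be checked numerically. In Tao's geometric picture, the correlation of unit vectors is their dot product (cosine of the angle between them). A minimal sketch with NumPy, assuming the vectors u = (1, 0), v = (1/√2, 1/√2), and w = (0, 1) as reconstructed from the comments below:

```python
import numpy as np

# Three unit vectors (reconstructed from the comments below).
u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0]) / np.sqrt(2.0)
w = np.array([0.0, 1.0])

# Treating correlation as the dot product (cosine) of unit vectors,
# as in Tao's geometric framing:
print(u @ v)  # 1/sqrt(2) ≈ 0.707 > 0: u and v positively correlated
print(v @ w)  # 1/sqrt(2) ≈ 0.707 > 0: v and w positively correlated
print(u @ w)  # 0.0: yet u and w are uncorrelated
```

So positive correlation of (u, v) and of (v, w) says nothing about the sign, or even the existence, of correlation between u and w.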

This is something for students of econometrics to keep in mind!

© 2015, David E. Giles


  1. nice post

a more general example: if A and B are iid and C = A + B, then A ~ C ~ B but A !~ B

I first became aware of this when I learnt about Instrumental Variables in econometrics class

    an instrument Z is valid if it is correlated with X but not with e in the regression

    Y = XB + e

    which is to say that Z ~ X ~ e but Z !~ e, a violation of transitivity
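The A, B iid, C = A + B example above is easy to verify by simulation. A quick sketch (variable names and standard normal draws are my own choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A and B are iid; C = A + B is correlated with each of them,
# but A and B remain uncorrelated.
n = 200_000
A = rng.standard_normal(n)
B = rng.standard_normal(n)
C = A + B

r = np.corrcoef([A, B, C])
print(r[0, 2])  # corr(A, C): close to 1/sqrt(2) ≈ 0.71
print(r[1, 2])  # corr(B, C): also close to 0.71
print(r[0, 1])  # corr(A, B): close to 0 -- transitivity fails
```

With A and B standard normal, corr(A, C) = corr(B, C) = 1/√2 in population, while corr(A, B) = 0.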

  2. I do not quite get the three unit vectors example. v is a constant, so it should be uncorrelated with anything. The pair u and w constitutes a basic example of perfect negative correlation. (R software shows that corr(u,w)=-1, as expected, while corr(u,v) and corr(v,w) are undefined.)

    1. The correlation is the dot product of the vectors - see this post.

    2. I am sorry, but I also can't understand this example:
      Pearson's product-moment correlation coefficient for sample data:
      corr(X,Y) = Σ(Xi − X̄)(Yi − Ȳ) / [Σ(Xi − X̄)² · Σ(Yi − Ȳ)²]^(1/2)

      Then corr(u,v) = [(1 − 0.5)(1/√2 − 1/√2) + (0 − 0.5)(1/√2 − 1/√2)]
      / [{(1 − 0.5)² + (0 − 0.5)²}{(1/√2 − 1/√2)² + (1/√2 − 1/√2)²}]^(1/2) = 0
      corr(v,w) = 0
      corr(u,w) = [(1 − 0.5)(0 − 0.5) + (0 − 0.5)(1 − 0.5)]
      / [{(1 − 0.5)² + (0 − 0.5)²}{(0 − 0.5)² + (1 − 0.5)²}]^(1/2) = −1
      This result seems obvious to me. So, there is no correlation between u and v and between v and w, but corr(u,w) = −1.
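The two-point Pearson computations above can be checked directly in NumPy, assuming the same vectors u = (1, 0), v = (1/√2, 1/√2), w = (0, 1). Note that, as the earlier commenter observed, the pairs involving v have a zero denominator (v is a constant sample), so `np.corrcoef` returns nan for them rather than 0:

```python
import numpy as np

u = [1.0, 0.0]
v = [1.0 / np.sqrt(2.0), 1.0 / np.sqrt(2.0)]
w = [0.0, 1.0]

# Pearson correlation of the two-element samples u and w:
print(np.corrcoef(u, w)[0, 1])  # -1.0, matching the hand calculation

# v has zero sample standard deviation, so the coefficient is 0/0;
# np.corrcoef returns nan (with a divide-by-zero warning).
print(np.corrcoef(u, v)[0, 1])  # nan
```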

  3. Given a correlation between X and Y, and between Y and Z, it is interesting to compute the limits of correlation between X and Z (say, by forming the correlation matrix and requiring it to be positive semi-definite). The implications are quite stunning.

    For example, if cor(X,Y) = 0.5 and cor(Y,Z) = 0.5, the minimum of cor(X,Z) is -0.5 (with a negative sign; the maximum is, of course, 1)! This should come as no surprise, since correlation behaves, by its properties, like a Euclidean angle rather than a distance, so transitivity need not apply.

    BTW, this actually happens quite often in the world of commodity prices, although not quite so drastically.
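The limits mentioned in the comment above follow from requiring the 3×3 correlation matrix to be positive semi-definite, which gives cor(X,Z) ∈ [r₁r₂ − √((1 − r₁²)(1 − r₂²)), r₁r₂ + √((1 − r₁²)(1 − r₂²))], where r₁ = cor(X,Y) and r₂ = cor(Y,Z). A minimal sketch (the function name is my own):

```python
import numpy as np

def corr_xz_bounds(r_xy, r_yz):
    """Feasible range of cor(X, Z) given cor(X, Y) and cor(Y, Z).

    Derived from requiring the 3x3 correlation matrix to be
    positive semi-definite.
    """
    slack = np.sqrt((1.0 - r_xy**2) * (1.0 - r_yz**2))
    return r_xy * r_yz - slack, r_xy * r_yz + slack

lo, hi = corr_xz_bounds(0.5, 0.5)
print(lo, hi)  # -0.5 1.0 -- as in the comment above

# Sanity check: at the lower bound the correlation matrix sits
# exactly on the boundary of positive semi-definiteness.
R = np.array([[1.0, 0.5, lo],
              [0.5, 1.0, 0.5],
              [lo,  0.5, 1.0]])
print(np.linalg.eigvalsh(R).min())  # ~0: smallest eigenvalue vanishes
```

In particular, cor(X,Z) is only forced to be positive once r₁² + r₂² > 1, which is the threshold Tao derives.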

  4. Interesting, especially since causality is transitive -- causality behaves as a special case of logical implication (i.e., a cause is a sufficient, yet not always necessary, condition for its effect).

    I wonder if we could use it to argue about causality somehow.