Importance of Cochran’s Theorem
Cochran’s theorem tells us about the distributions of partitioned sums of squares of normally distributed random variables. Traditional linear regression analysis relies upon making statistical claims about the distribution of sums of squares of normally distributed random variables. In the simple normal regression model:

SSE/σ² = Σ_{i=1}^n (Y_i − Ŷ_i)²/σ² ∼ χ²_{n−2}.
Where does this come from?
- Establish the fact that the multivariate Gaussian sum of squares is χ² distributed.
- Provide intuition for Cochran’s theorem.
- Prove a lemma in support of Cochran’s theorem.
- Prove Cochran’s theorem.
- Connect Cochran’s theorem back to matrix linear regression.
Theorem 1 for χ²_n: Suppose X_1, ..., X_n are i.i.d. N(0, 1); then Σ_{i=1}^n X_i² ∼ χ²_n.
Proof:
The MGF of X_i², where X_i ∼ N(0, 1), is M_{X_i²}(t) = E[e^{tX_i²}] = (1 − 2t)^{−1/2} for t < 1/2. If Z_1, ..., Z_n are i.i.d. random variables, the MGF of their sum is the product of the individual MGFs.
The MGF for Σ_{i=1}^n X_i² is therefore (1 − 2t)^{−n/2}.
The MGF fully characterizes the distribution, and the MGF for χ²_n is (1 − 2t)^{−n/2}, so Σ_{i=1}^n X_i² ∼ χ²_n.
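As a quick numerical check of Theorem 1, here is a minimal simulation sketch (not part of the original notes; n = 5 and the number of replications are arbitrary choices) comparing the empirical distribution of Σ X_i² with χ²_n:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 5, 100_000                   # arbitrary choices for the sketch

X = rng.standard_normal((reps, n))     # each row: n i.i.d. N(0, 1) draws
S = (X ** 2).sum(axis=1)               # sum of squares per row

# Kolmogorov-Smirnov test against chi-square(n); a large p-value is consistent
# with S ~ chi2_n, as Theorem 1 claims.
print(stats.kstest(S, stats.chi2(df=n).cdf))
```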
- Quadratic forms of normal random variables are important in many branches of statistics: Least Squares, ANOVA, Regression Analysis.
- General idea: Split the sum of the squares of observations into a sum of quadratic forms, where each corresponds to some cause of variation.
- The conclusion of Cochran’s theorem is that, under the assumption of normality, the various quadratic forms are independent and χ² distributed. This fact is the foundation upon which many statistical tests rest.
Preliminaries: A Common Quadratic Form
Let X ∼ N(µ, Λ). Consider the quadratic form that appears in the exponent of the normal density: (X − µ)′Λ^{−1}(X − µ). In the special case of µ = 0 and Λ = I, this reduces to X′X, which by what we just proved we know is χ²_n distributed. Let’s prove it holds in the general case.
Lemma 1: Let X ∼ N(µ, Λ) with |Λ| > 0; then (X − µ)′Λ^{−1}(X − µ) ∼ χ²_n.
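A simulation sketch of Lemma 1 (not from the notes; the mean vector and covariance matrix below are made-up test inputs):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps = 3, 100_000
mu = np.array([1.0, -2.0, 0.5])        # arbitrary mean vector
A = rng.standard_normal((n, n))
Lam = A @ A.T + n * np.eye(n)          # positive definite, so |Lam| > 0

X = rng.multivariate_normal(mu, Lam, size=reps)
Linv = np.linalg.inv(Lam)
Q = np.einsum('ij,jk,ik->i', X - mu, Linv, X - mu)  # (X-mu)' Lam^{-1} (X-mu)

print(stats.kstest(Q, stats.chi2(df=n).cdf))        # consistent with chi2_3
```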
Cochran’s Theorem: Let X_1, ..., X_n be i.i.d. N(0, σ²) distributed random variables, and suppose that Σ_{i=1}^n X_i² = Q_1 + Q_2 + ··· + Q_k, where Q_1, ..., Q_k are positive semi-definite quadratic forms in X_1, ..., X_n, that is, Q_i = X′A_iX. Set r_i = rank(A_i). If r_1 + r_2 + ··· + r_k = n, then Q_1, ..., Q_k are independent, and Q_i ∼ σ²χ²_{r_i}.
Let X be a normal random vector. The components of X are independent iff they are uncorrelated. Let X ∼ N(µ, Λ); then Y = C′X ∼ N(C′µ, C′ΛC). We can find an orthogonal matrix C such that D = C′ΛC is a diagonal matrix. The components of Y will then be independent with Var(Y_i) = λ_i, where λ_1, ..., λ_n are the eigenvalues of Λ.
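As a quick illustration of this diagonalization step (a sketch, not from the notes; the covariance matrix below is a made-up test input), numpy’s eigh returns exactly such an orthogonal C:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
Lam = A @ A.T + np.eye(n)                    # a made-up covariance matrix

eigvals, C = np.linalg.eigh(Lam)             # columns of C: orthonormal eigenvectors
X = rng.multivariate_normal(np.zeros(n), Lam, size=200_000)
Y = X @ C                                    # each row is Y = C'X

print(np.round(np.cov(Y, rowvar=False), 2))  # ~ diag(eigvals): uncorrelated components
print(np.round(eigvals, 2))                  # Var(Y_i) = lambda_i
```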
Lemma 2: Let X_1, ..., X_n be real numbers and X = (X_1, ..., X_n)′. Suppose that X′X = Σ_{i=1}^n X_i² can be split into a sum of positive semi-definite quadratic forms, that is, X′X = Q_1 + ··· + Q_k, where Q_i = X′A_iX with rank(A_i) = r_i. If r_1 + ··· + r_k = n, then there exists an orthogonal matrix C such that, with X = CY, we have Q_1 = Y_1² + ··· + Y_{r_1}²; Q_2 = Y_{r_1+1}² + ··· + Y_{r_1+r_2}²; ...; Q_k = Y_{n−r_k+1}² + ··· + Y_n².
Note that different quadratic forms contain different Y-variables and that the number of terms in each Q_i equals the rank r_i of A_i. Since the Y_i end up in different sums, we can use this to prove the independence of the different quadratic forms. We prove only the k = 2 case; the general case can be obtained by induction.
Proof: For k = 2, we have X′X = X′A_1X + X′A_2X. There exists an orthogonal matrix C such that C′A_1C = D, where D is a diagonal matrix with the eigenvalues of A_1 on the diagonal.
Since rank(A_1) = r_1, r_1 of the eigenvalues are positive and n − r_1 of the eigenvalues are 0. Suppose, without loss of generality, that the first r_1 eigenvalues λ_1, ..., λ_{r_1} are positive. Set X = CY; then we have X′X = Y′C′CY = Y′Y.
Therefore, Σ_{i=1}^n Y_i² = Σ_{i=1}^{r_1} λ_iY_i² + X′A_2X. Then, rearranging the terms, X′A_2X = Σ_{i=1}^{r_1} (1 − λ_i)Y_i² + Σ_{i=r_1+1}^n Y_i². Since rank(A_2) = r_2 = n − r_1, we conclude that λ_i = 1 for i = 1, ..., r_1; hence Q_1 = Σ_{i=1}^{r_1} Y_i² and Q_2 = Σ_{i=r_1+1}^n Y_i².
This lemma is about real numbers, not random variables. It says that if X′X can be split into a sum of positive semi-definite quadratic forms, then there is an orthogonal transformation X = CY such that each of the quadratic forms has a nice property: each Y_i appears in only one resulting sum of squares, which leads to the independence of the sums of squares.
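Here is a concrete numeric instance of Lemma 2 (a sketch; the choice A_1 = I − (1/n)J, A_2 = (1/n)J anticipates the application below, and n = 5 is arbitrary). Since the two matrices commute, one orthogonal C diagonalizes both, and each Y-variable lands in exactly one quadratic form:

```python
import numpy as np

n = 5
J = np.ones((n, n))
A1 = np.eye(n) - J / n        # rank n - 1
A2 = J / n                    # rank 1; A1 + A2 = I

# eigh sorts eigenvalues in ascending order, so the single zero eigenvalue
# of A1 comes first here (the lemma's ordering is simply reversed).
eigvals, C = np.linalg.eigh(A1)
print(np.round(C.T @ A1 @ C, 6))  # diag(0, 1, 1, 1, 1): Q1 = Y_2^2 + ... + Y_5^2
print(np.round(C.T @ A2 @ C, 6))  # diag(1, 0, 0, 0, 0): Q2 = Y_1^2
```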
Proof of Cochran’s Theorem:
Using the Lemma, each Q_i can be written using its own disjoint set of the Y_j’s. Since C is orthogonal and X_1, ..., X_n are i.i.d. N(0, σ²), the Y_j’s are also i.i.d. N(0, σ²); because the Q_i’s involve disjoint sets of independent variables, they are independent. Furthermore, Q_1 = Y_1² + ··· + Y_{r_1}² ∼ σ²χ²_{r_1}. The other Q_i’s are handled the same way.
Applications:
Sample variance is independent of the sample mean. Let X_1, ..., X_n be i.i.d. N(0, σ²) (for a general mean µ, apply the argument to X_i − µ). Recall,

Σ_{i=1}^n X_i² = Σ_{i=1}^n (X_i − X̄)² + nX̄².
Rearranging the terms and expressing them in matrix form:

X′X = X′(I − (1/n)J)X + X′((1/n)J)X,

where J = 11′ is the n × n matrix of all ones; here Q_1 = X′(I − (1/n)J)X = Σ_{i=1}^n (X_i − X̄)² and Q_2 = X′((1/n)J)X = nX̄².
We know rank(I − (1/n)J) = n − 1 and rank((1/n)J) = 1, so the ranks sum to n. As a result, by Cochran’s theorem, Σ_{i=1}^n (X_i − X̄)² ∼ σ²χ²_{n−1} and nX̄² ∼ σ²χ²_1, and the two forms are independent; hence the sample variance is independent of the sample mean.
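A simulation sketch of this conclusion (σ = 2 and the sample sizes are arbitrary choices, not from the notes):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, sigma, reps = 8, 2.0, 100_000
X = sigma * rng.standard_normal((reps, n))      # i.i.d. N(0, sigma^2)

xbar = X.mean(axis=1)
Q1 = ((X - xbar[:, None]) ** 2).sum(axis=1)     # sum of (X_i - Xbar)^2
Q2 = n * xbar ** 2                              # n * Xbar^2

print(stats.kstest(Q1 / sigma**2, stats.chi2(df=n - 1).cdf))  # ~ chi2_{n-1}
print(stats.kstest(Q2 / sigma**2, stats.chi2(df=1).cdf))      # ~ chi2_1
print(np.corrcoef(Q1, Q2)[0, 1])                # ~ 0: consistent with independence
```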
Calculate rank(I − (1/n)J). First of all, we have

trace(I − (1/n)J) = n − (1/n)trace(J) = n − 1.

On the other hand, since I − (1/n)J is idempotent and symmetric (using J² = nJ, we get (I − (1/n)J)² = I − (2/n)J + (1/n²)J² = I − (1/n)J), its eigenvalues are 0 or 1, so we have rank(I − (1/n)J) = trace(I − (1/n)J).

Therefore, we have rank(I − (1/n)J) = n − 1.
Another argument: noticing that (1/n)J is also idempotent and symmetric, we get rank((1/n)J) = trace((1/n)J) = 1; since (I − (1/n)J) + (1/n)J = I and rank equals trace for both, the ranks must sum to n, giving rank(I − (1/n)J) = n − 1 again.
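The trace-equals-rank argument is easy to confirm numerically (a sketch; n = 6 is arbitrary):

```python
import numpy as np

n = 6
J = np.ones((n, n))
M = np.eye(n) - J / n

print(np.allclose(M @ M, M))                          # True: M is idempotent
print(np.trace(M), np.linalg.matrix_rank(M))          # n - 1 and n - 1
print(np.trace(J / n), np.linalg.matrix_rank(J / n))  # 1 and 1
```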
ANOVA:
SSTO = Y′(I − (1/n)J)Y; SSE = Y′(I − H)Y; SSR = Y′(H − (1/n)J)Y, where H = X(X′X)^{−1}X′ is the hat matrix, so that SSTO = SSE + SSR.
Under the null hypothesis, when β_1 = β_2 = ··· = β_{p−1} = 0, the responses Y_i are i.i.d. N(β_0, σ²).
From linear algebra: rank(I − H) = n − p and rank(H − (1/n)J) = p − 1 (shown below), with (n − p) + (p − 1) = n − 1 = rank(I − (1/n)J), so by Cochran’s theorem SSE/σ² ∼ χ²_{n−p} and SSR/σ² ∼ χ²_{p−1}, independently. Then we have: F = MSR/MSE = (SSR/(p − 1))/(SSE/(n − p)) ∼ F(p − 1, n − p).
As a byproduct, MSE = SSE/(n − p) is an unbiased estimator of the variance σ², since the mean of a χ²_{n−p} random variable is n − p.
We have trace(I − H) = n − trace(X(X′X)^{−1}X′) = n − trace((X′X)^{−1}X′X) = n − trace(I_p) = n − p, using trace(AB) = trace(BA).
Then, since I − H is idempotent and symmetric, rank(I − H) = trace(I − H) = n − p.
Next, since we have H1 = 1 (this amounts to doing a multiple linear regression with the response always equal to 1, and therefore the fitted value is still 1, because we can just use the constant to perfectly fit the model), it is straightforward to check that H − (1/n)J is an idempotent and symmetric matrix. Then, we have rank(H − (1/n)J) = trace(H − (1/n)J) = trace(H) − 1 = p − 1.
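Putting the ANOVA pieces together, here is an end-to-end simulation sketch under the null hypothesis (the random design matrix, n = 20, p = 4, and σ = 1 are made-up test inputs, not from the notes):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, p, reps = 20, 4, 50_000
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
H = X @ np.linalg.inv(X.T @ X) @ X.T            # hat matrix
J = np.ones((n, n))

Y = 1.0 + rng.standard_normal((reps, n))        # null model: intercept only, sigma = 1
SSE = np.einsum('ri,ij,rj->r', Y, np.eye(n) - H, Y)   # Y'(I - H)Y per row
SSR = np.einsum('ri,ij,rj->r', Y, H - J / n, Y)       # Y'(H - J/n)Y per row
F = (SSR / (p - 1)) / (SSE / (n - p))

print(stats.kstest(SSE, stats.chi2(df=n - p).cdf))    # ~ chi2_{n-p}
print(stats.kstest(SSR, stats.chi2(df=p - 1).cdf))    # ~ chi2_{p-1}
print(stats.kstest(F, stats.f(dfn=p - 1, dfd=n - p).cdf))  # ~ F(p-1, n-p)
```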