How would a composite variable be strongly correlated with one variable but not the other?
I have two variables x1 and x2 which measure relatively similar things (r ~ 0.6), with x2 slightly larger than x1 on average. I then created a new variable x3 by subtracting the two: x3 = x1 - x2.
However, when I ran the Pearson correlations, x3 is strongly negatively correlated with x2 as expected (r ~ -0.6), but x3 is not very correlated with x1 (r ~ 0.1). How is this possible?
correlation
edited 14 hours ago
Nick Cox
asked 20 hours ago
hlinee
2
A scatter plot matrix should help.
– Nick Cox
14 hours ago
1
Possible duplicate of When A and B are positively related variables, can they have opposite effect on their outcome variable C?
– sds
14 hours ago
4 Answers
Here's a simple example. Suppose $ε_1$ and $ε_2$ are independent standard normal random variables. Define $X_1 = ε_1$, $X_2 = X_1 + ε_2$, and $X_3 = X_1 - X_2$. The correlation of $X_1$ with $X_2$ is then $\tfrac{1}{\sqrt{2}} \approx .71$. Likewise, the correlation of $X_2$ with $X_3$ is $-\tfrac{1}{\sqrt{2}}$. But the correlation of $X_1$ with $X_3$ is the correlation of $ε_1$ with $ε_1 - (ε_1 + ε_2) = -ε_2$, which is 0 since the $ε_i$s are independent.
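A quick simulation can make this concrete. The following is only a sketch using the Python standard library; the sample size, the seed, and the helper name `pearson` are arbitrary choices for illustration, not part of the example above.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
n = 50_000              # sample size (arbitrary)

# Independent standard normal draws.
eps1 = [rng.gauss(0, 1) for _ in range(n)]
eps2 = [rng.gauss(0, 1) for _ in range(n)]

x1 = eps1
x2 = [a + b for a, b in zip(x1, eps2)]  # X2 = X1 + eps2
x3 = [a - b for a, b in zip(x1, x2)]    # X3 = X1 - X2 = -eps2

r12 = pearson(x1, x2)
r13 = pearson(x1, x3)
r23 = pearson(x2, x3)
print(f"r12={r12:.3f}  r13={r13:.3f}  r23={r23:.3f}")
```

Up to sampling noise, `r12` and `r23` should land near $\pm\tfrac{1}{\sqrt{2}}$ while `r13` stays near zero.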
edited 18 hours ago
answered 19 hours ago
Kodiologist
This is by construction of $x_3$. Given that $x_2$ and $x_1$ are closely related in terms of their Pearson correlation, subtracting one from the other reduces the correlation. The best way to see that is to consider the extreme scenario of complete correlation, i.e., $x_2=x_1$, in which case $x_3=x_1-x_2=0$ is a constant that cannot covary with anything (strictly, the correlation is then undefined; for nearly identical variables with equal spread, $r \approx 0$).
You can make a more formal argument using the definition of the Pearson correlation by looking at the covariation between $x_3$ and $x_1$. You will see that the covariation is reduced. By how much depends on the correlation between $x_1$ and $x_2$, i.e., $r_{12}$, and on their standard deviations. Everything else being equal, the larger $r_{12}$, the smaller $r_{13}$.
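As a worked version of that argument, assume purely for illustration that $x_1$ and $x_2$ share the same standard deviation $\sigma$ and that $r_{12} < 1$ (neither is stated in the question). Then:

$$\operatorname{Cov}(x_1, x_3) = \operatorname{Cov}(x_1, x_1 - x_2) = \sigma^2 - r_{12}\sigma^2 = \sigma^2(1 - r_{12})$$

$$\operatorname{Var}(x_3) = \sigma^2 + \sigma^2 - 2 r_{12}\sigma^2 = 2\sigma^2(1 - r_{12})$$

$$r_{13} = \frac{\sigma^2(1 - r_{12})}{\sigma \cdot \sigma\sqrt{2(1 - r_{12})}} = \sqrt{\frac{1 - r_{12}}{2}}$$

Under this equal-spread assumption, $r_{12} = 0.6$ gives $r_{13} = \sqrt{0.2} \approx 0.45$, and $r_{13}$ shrinks toward 0 as $r_{12}$ approaches 1. When the standard deviations differ, the result changes, so this is one special case rather than the general rule.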
answered 19 hours ago
Gkhan Cebs
New contributor
By "covariation", do you mean "covariance"?
– Kodiologist
18 hours ago
You can rewrite your equation $x_3=x_1-x_2$ as $x_2=x_1-x_3$. Then regardless of what you pick as $x_1$ and $x_3$, $x_2$ will generally be correlated with both $x_1$ and $x_3$, but there is no reason to expect $x_1$ and $x_3$ to be correlated with each other. For instance, if $x_1$ = number of letters in the title of the Best Picture Oscar winner, $x_3$ = number of named hurricanes, and $x_2$ = number of letters in the title of the Best Picture Oscar winner minus number of named hurricanes, then you will have that $x_3=x_1-x_2$, but that doesn't mean that $x_3$ will be correlated with $x_1$.
answered 15 hours ago
Acccumulation
Let $Var(X_1) = \sigma_1^2$, $Var(X_2) = \sigma_2^2$, and $Cov(X_1,X_2)=\sigma_{12} = \rho\sigma_1\sigma_2$.
Then $Var(X_3) = Var(X_1-X_2)=\sigma_1^2+\sigma_2^2 - 2\sigma_{12}$
$Cov(X_1,X_3)=\sigma_1^2-\sigma_{12}$
$Cov(X_2,X_3) =\sigma_{12}-\sigma_2^2$
$Corr(X_1,X_3) =\frac{\sigma_1^2-\sigma_{12}}{\sqrt{\sigma_1^2(\sigma_1^2+\sigma_2^2 - 2\sigma_{12})}}$
$Corr(X_2,X_3) =\frac{-\sigma_2^2+\sigma_{12}}{\sqrt{\sigma_2^2(\sigma_1^2+\sigma_2^2 - 2\sigma_{12})}}$
So whether $|Corr(X_1,X_3)| \lt \text{or} = \text{or} \gt |Corr(X_2,X_3)|$ depends on $\sigma_1^2$ and $\sigma_2^2$.
This relation cannot be determined by the correlation coefficient alone.
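To see how the two standard deviations drive the result, the formulas can be evaluated numerically. This is a minimal sketch; the function name `corr_with_difference` and the input values are made up for illustration and are not taken from the question's data.

```python
import math


def corr_with_difference(sigma1, sigma2, rho):
    """Return (Corr(X1, X3), Corr(X2, X3)) for X3 = X1 - X2.

    sigma1, sigma2: standard deviations of X1 and X2 (both > 0)
    rho:            Pearson correlation between X1 and X2
    """
    sigma12 = rho * sigma1 * sigma2               # Cov(X1, X2)
    var3 = sigma1**2 + sigma2**2 - 2 * sigma12    # Var(X3)
    if var3 <= 0:
        raise ValueError("X3 has zero variance; correlations are undefined")
    sd3 = math.sqrt(var3)
    r13 = (sigma1**2 - sigma12) / (sigma1 * sd3)
    r23 = (sigma12 - sigma2**2) / (sigma2 * sd3)
    return r13, r23


# Equal spread: the two correlations have the same magnitude.
print([round(r, 3) for r in corr_with_difference(1.0, 1.0, 0.6)])  # [0.447, -0.447]

# X2 more variable than X1 (hypothetical values): r13 shrinks, r23 grows in magnitude.
print([round(r, 3) for r in corr_with_difference(1.0, 1.5, 0.6)])  # [0.083, -0.747]
```

With the same $\rho = 0.6$, changing only the ratio $\sigma_2/\sigma_1$ moves $Corr(X_1,X_3)$ from about 0.45 to about 0.08, which is the kind of asymmetry the question describes. In particular, $Corr(X_1,X_3)$ is exactly zero whenever $\sigma_1 = \rho\sigma_2$.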
edited 9 hours ago
answered 11 hours ago
user158565