Two sample z test: Applicability
























$begingroup$


The following example exercise is taken from Statistics by Freedman:




A geography test was given to a simple random sample of 250 high-school students
in a certain large school district. One question involved an outline map of Europe,
with the countries identified only by number. The students were asked to pick out
Great Britain and France. As it turned out, 65.6% could find France, compared to
70.4% for Great Britain. Is the difference statistically significant? Or can this be
determined from the information given?




The author says:




Exercise 5 on p. 515 (the geography test)
is an example of when not to use the formulas. Each subject makes two responses,
by answering (i) the question on Great Britain, and (ii) the question on France.
Both responses are observed, because each subject answers both questions. And
the responses are correlated, because a geography whiz is likely to be able to
answer both questions correctly, while someone who does not pay attention to
maps is likely to get both of them wrong. By contrast, if you took two independent
samples - asking one group about France and the other about Great Britain - the
formula would be fine. (That would be an inefficient way to do the study.)




The author is talking about the two-sample z-test, and the formula he refers to is



$$\operatorname{Var}(\bar{X}-\bar{Y})=\operatorname{Var}(\bar{X})+\operatorname{Var}(\bar{Y})$$



I understand that the two responses are correlated, so a $\operatorname{Cov}(\bar{X},\bar{Y})$ term should also be present in the formula.
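For reference, the general identity can be written out as below. This is only a sketch under added assumptions: the symbols $p_X$, $p_Y$ and $p_{11}$ (the population proportions who find Great Britain, who find France, and who find both) and the sample size $n$ are notation introduced here, not taken from the book, and the sample is treated as $n$ independent draws.

$$\operatorname{Var}(\bar{X}-\bar{Y})=\operatorname{Var}(\bar{X})+\operatorname{Var}(\bar{Y})-2\operatorname{Cov}(\bar{X},\bar{Y}),\qquad \operatorname{Cov}(\bar{X},\bar{Y})=\frac{p_{11}-p_X\,p_Y}{n}$$

The second expression applies when both proportions come from the same $n$ students. It is positive whenever students who find one country are more likely to find the other, so dropping it would overstate the variance of the difference.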



What I don't understand is this passage:




By contrast, if you took two independent samples - asking one group about France and the other about Great Britain - the formula would be fine. (That would be an inefficient way to do the study.)





  1. Why don't we need to consider the covariance in this case, but only in the case of a single sample?


Geography whizzes will be present in the second independent sample in approximately the same proportion as in the first sample.





  2. The author says that the example problem can be solved using more advanced mathematics if we have the percentages in each of the following categories:



    1 1 found Great Britain and France on the map



    1 0 found Great Britain; could not find France



    0 1 could not find Great Britain; found France



    0 0 could not find either country




I would like to know what this advanced mathematics is.



Thanks.
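One standard way to use the four categories is a paired comparison of the two proportions, closely related to McNemar's test; this is a minimal sketch of that idea, not necessarily what the book has in mind. The four cell counts below are hypothetical: the exercise reports only the margins (70.4% and 65.6% of 250), and the split shown is merely one table consistent with them. All variable names are illustrative.

```python
import math

# Hypothetical cell counts (NOT from the book), chosen only to match the
# reported margins: 176/250 = 70.4% found Great Britain, 164/250 = 65.6% found France.
both = 150         # category "1 1"
gb_only = 26       # category "1 0"
france_only = 14   # category "0 1"
neither = 60       # category "0 0"
n = both + gb_only + france_only + neither

p_gb = (both + gb_only) / n
p_fr = (both + france_only) / n
diff = p_gb - p_fr

# Paired standard error: the per-student difference D = X - Y takes the
# values -1, 0, 1, so Var(D) = p10 + p01 - (p10 - p01)^2.
p10 = gb_only / n
p01 = france_only / n
se_paired = math.sqrt((p10 + p01 - (p10 - p01) ** 2) / n)

# Standard error the two-independent-samples formula would give instead.
se_independent = math.sqrt(p_gb * (1 - p_gb) / n + p_fr * (1 - p_fr) / n)

z_paired = diff / se_paired
z_independent = diff / se_independent
print(f"difference = {diff:.3f}")
print(f"paired SE = {se_paired:.4f}, z = {z_paired:.2f}")
print(f"independent-samples SE = {se_independent:.4f}, z = {z_independent:.2f}")
```

With this invented split the paired standard error is smaller than the independent-samples one. A different split with the same margins would give a different z, which is why the margins alone do not settle the question.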



















$endgroup$








  • $begingroup$
    My guess is that they have a $2 \times 2$ contingency table in mind, with rows for France (Yes and No) and columns for GB (Yes and No). With enough subjects, the test statistic would have approximately a chi-squared distribution with 1 degree of freedom. Alternatively, one might use a Fisher exact test.
    $endgroup$
    – BruceET
    Dec 1 '18 at 8:48























statistics hypothesis-testing
















asked Dec 1 '18 at 7:36









q126y


















1 Answer












$begingroup$

Here is Minitab output for fake data in such a table.
I did not try to match the percentages you give in your problem. The null hypothesis is that recognition of GB and of France are independent abilities. The small p-value indicates the null hypothesis is rejected.



    Chi-Square Test for Association: France, GB

    Rows: France   Columns: GB

             Yes      No   All

    Yes       43      21    64
           33.58   30.42
           2.640   2.915

    No        10      27    37
           19.42   17.58
           4.566   5.042

    All       53      48   101

    Cell Contents:  Count
                    Expected count
                    Contribution to Chi-square

    Pearson Chi-Square = 15.163, DF = 1, P-Value = 0.000


Computations:



The observed count for the upper-left cell
is $X_{11} = 43.$



The expected count for the upper-left cell
is $E_{11} = 64(53)/101 = 33.58.$



The contribution for that cell is
$(X_{11} - E_{11})^2/E_{11} = 2.64.$



The chi-squared statistic $15.163$ is the sum of
the 'contributions' from all four cells.
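The computations above can be reproduced in a few lines. This is a minimal sketch in plain Python rather than Minitab; the variable names are illustrative, and the counts are the fake data from the table (rows: France Yes/No, columns: GB Yes/No).

```python
# Observed counts from the table: rows are France (Yes, No), columns are GB (Yes, No).
observed = [[43, 21],
            [10, 27]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total) * (column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Each cell contributes (observed - expected)^2 / expected.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
chi_square = sum(sum(row) for row in contributions)

print(f"E11 = {expected[0][0]:.2f}")
print(f"contribution of cell (1,1) = {contributions[0][0]:.3f}")
print(f"Pearson chi-square = {chi_square:.3f}")
```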



From the row with DF=1 in a printed table
of chi-squared distributions, you can see that the value $3.8415$ cuts 5% from the upper
tail of the distribution $\mathsf{Chisq}(1),$
so that any value of the chi-squared statistic
above 3.8415 would lead you to believe that
identification of GB and identification of France are not independent abilities (at the 5% level of significance). The
chi-squared statistic here is $15.16 > 3.84.$
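Instead of a printed table, the upper-tail probability can be computed directly. The sketch below relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal, so its upper tail is `erfc(sqrt(x / 2))`; the helper name `chi2_upper_tail_df1` is made up for this illustration and works only for DF = 1.

```python
import math

def chi2_upper_tail_df1(x):
    """P(chi-squared with 1 degree of freedom exceeds x), valid for DF = 1 only."""
    return math.erfc(math.sqrt(x / 2.0))

p_at_critical = chi2_upper_tail_df1(3.8415)   # should be close to 5%
p_at_observed = chi2_upper_tail_df1(15.163)   # tail beyond the observed statistic

print(f"tail beyond 3.8415: {p_at_critical:.4f}")
print(f"tail beyond 15.163: {p_at_observed:.6f}")
```

The second value is well below 0.0005, which is consistent with Minitab printing the P-value as 0.000 after rounding to three decimals.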



Perhaps you can find a more complete discussion of this kind
of test later in your text.



Addendum. Suppose my data are real. The 43 + 27 = 70 students who got both countries right, or neither, give no information about whether GB or France is easier to identify on a map. Of the other 31, who got exactly one country right, only 10 got only GB wrong.



Those 10 are in the lower tail of $\mathsf{Binom}(31, .5).$ That is, assuming both countries are equally easy to identify, there is only probability 0.0354 < 5% that 10 or fewer get only GB wrong. I would hesitate to draw strong conclusions from only 31 useful responses, but there does seem to be evidence that more people recognize GB than France on a map. (That wouldn't be surprising, because many people know GB is an island nation, and there aren't many big islands on a map of Europe.)



In R:



    pbinom(10, 31, .5)  # P(X <= 10) for X ~ Binom(31, 0.5)
    [1] 0.03537777
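For readers without R, the same lower-tail probability can be computed with the Python standard library. This is a sketch; `binom_cdf` is a helper written for this illustration, not a library function. A test that uses only the students who got exactly one country right is commonly called an exact McNemar test or sign test. The two-sided value computed at the end (twice the one-sided tail) is an addition here and is not part of the calculation above.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# 31 students got exactly one country right; 10 of them fall in the smaller group.
lower_tail = binom_cdf(10, 31, 0.5)

# Two-sided version for a symmetric null distribution: double the smaller tail.
two_sided = min(1.0, 2 * lower_tail)

print(f"one-sided tail = {lower_tail:.8f}")
print(f"two-sided      = {two_sided:.8f}")
```

The one-sided tail is below 5%, while the doubled value is about 0.07, so the strength of the evidence depends on whether a one-sided or two-sided alternative is intended.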
















$endgroup$













  • $begingroup$
    Thanks Mr Bruce. Can you also throw some light on Question 1?
    $endgroup$
    – q126y
    Dec 1 '18 at 15:30










  • $begingroup$
    Don't understand what Q#1 is asking, but see addendum.
    $endgroup$
    – BruceET
    Dec 1 '18 at 17:59










  • $begingroup$
    The author says that if we had two groups of students, with one group asked to identify France and the other asked to identify GB, we could use the two-sample z-test to decide whether the difference is significant, and this relation would hold: $$\operatorname{Var}(\bar{X}-\bar{Y})=\operatorname{Var}(\bar{X})+\operatorname{Var}(\bar{Y})$$ But I think that, even if we take two samples, the presence of geography whizzes in both samples will mean $$\operatorname{Var}(\bar{X}-\bar{Y})=\operatorname{Var}(\bar{X})+\operatorname{Var}(\bar{Y})-2\operatorname{Cov}(\bar{X},\bar{Y})$$ So why can we ignore the covariance in the case of two samples, but not when we ask the same sample to identify both countries?
    $endgroup$
    – q126y
    Dec 4 '18 at 18:08








  • $begingroup$
    If you have two groups chosen independently at random, then there is no covariance to ignore.
    $endgroup$
    – BruceET
    Dec 4 '18 at 18:14










  • $begingroup$
    Ah, yes! Thanks.
    $endgroup$
    – q126y
    Dec 4 '18 at 18:37
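The point settled in these comments can be checked with a small simulation. This is a sketch under an invented model, not anything from the book: 60% of students are "whizzes" who answer each question correctly with probability 0.9, and the rest answer correctly with probability 0.3. It estimates the covariance between the two sample proportions when the same students answer both questions and when two independent samples are used.

```python
import random

def draw_student(rng):
    """Return (found_gb, found_france) for one hypothetical student."""
    is_whiz = rng.random() < 0.6          # assumed share of geography whizzes
    p_correct = 0.9 if is_whiz else 0.3   # assumed success probabilities
    return (rng.random() < p_correct, rng.random() < p_correct)

def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equally long lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)

def simulate(paired, n=250, reps=2000, seed=1):
    """Covariance of the GB and France sample proportions across repeated studies."""
    rng = random.Random(seed)
    gb_props, france_props = [], []
    for _ in range(reps):
        gb_sample = [draw_student(rng) for _ in range(n)]
        # Paired design: the same students answer the France question.
        # Independent design: a fresh sample answers the France question.
        france_sample = gb_sample if paired else [draw_student(rng) for _ in range(n)]
        gb_props.append(sum(s[0] for s in gb_sample) / n)
        france_props.append(sum(s[1] for s in france_sample) / n)
    return sample_covariance(gb_props, france_props)

cov_paired = simulate(paired=True)
cov_independent = simulate(paired=False)
print(f"paired design:      Cov = {cov_paired:.6f}")
print(f"independent design: Cov = {cov_independent:.6f}")
```

Both independent samples contain whizzes in about the same proportion, but the chance fluctuation of that proportion in one sample tells you nothing about the fluctuation in the other, so the estimated covariance stays near zero. In the paired design the same fluctuation moves both proportions together.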



















edited Dec 1 '18 at 18:18

























answered Dec 1 '18 at 9:00









BruceET











