Scaling normal distributions inconsistency
























$begingroup$


Scaling a normally distributed variable does not seem to be a problem. The variance of the scaled variable can be derived as follows:
VAR[Y] = VAR[sX] = s^2 * VAR[X] = s^2 * sigma_squared



I understand up until this point, but the implications seem counterintuitive to me.
Suppose X is a normally distributed variable with variance sigma_squared. I halve it and then sum up two of the resulting distributions. Doing this, I expect to arrive at the original distribution, but I do not:




  • VAR[Y] = VAR[0.5 X] = 0.25 * sigma_squared

  • VAR[Y] + VAR[Y] = 0.25 * sigma_squared + 0.25 * sigma_squared = 0.5 * sigma_squared


A concrete example that I would like to apply this to is the following.
I have the distribution of the number of people admitted to an amusement park per weekend. It is normally distributed with mean AVG and variance sigma_sq. Assuming that admissions on Saturday and on Sunday follow the same distribution, I derive the following:




  • AVG_SATURDAY = AVG_SUNDAY = 0.5 * AVG

  • VAR_SATURDAY = VAR_SUNDAY = 0.25 * sigma_sq


Now, suppose that I had different measurements. I did not measure the total number of people admitted per weekend, but the number admitted on Saturdays and on Sundays separately. If I wanted to know the distribution of people admitted per weekend, I would sum up the two distributions:




  • AVG_WEEKEND = AVG_SATURDAY + AVG_SUNDAY

  • VAR_WEEKEND = VAR_SATURDAY + VAR_SUNDAY


As a result, I have conflicting variances (sigma_sq versus 0.5 * sigma_sq) for the same distribution, which I measured in two different ways.
What am I doing that is not allowed?



















$endgroup$












  • $begingroup$
    You have to consider the correlation between variables if you sum them up.
    $endgroup$
    – Karl
    Dec 14 '18 at 17:47










  • $begingroup$
    Yes, but let us assume that there is no correlation between the two in this case.
    $endgroup$
    – Brawling_Bear
    Dec 16 '18 at 9:37










  • $begingroup$
    @Brawling_Bear if you have Y/2 and Y/2 and add them, there IS correlation between them, i.e. $\operatorname{corr}(Y/2, Y/2) = 1$.
    $endgroup$
    – Karl
    Dec 16 '18 at 13:16












  • $begingroup$
    Hmm, yes. I can see that. Thank you. By scaling the distribution twice, I've created two identical variables, and thus they are perfectly correlated. Consequently, adding the distributions up yields the original variance. I take it, then, that if I assume the distributions on Saturday and Sunday are identical and uncorrelated, both will have a variance of 0.5 times the original one.
    $endgroup$
    – Brawling_Bear
    Dec 16 '18 at 16:54
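
The correlation point raised in these comments can be made explicit with the general rule for the variance of a sum. What follows is only a brief sketch: $\sigma^2$ stands for the question's sigma_squared, and the names $A$, $B$ (the two halved variables) and $\rho$ (their correlation) are introduced here for illustration and do not appear in the original post.

$$
\operatorname{Var}(A + B) = \operatorname{Var}(A) + \operatorname{Var}(B) + 2\operatorname{Cov}(A, B),
\qquad
\operatorname{Cov}(A, B) = \rho \sqrt{\operatorname{Var}(A)\operatorname{Var}(B)}
$$

With $\operatorname{Var}(A) = \operatorname{Var}(B) = \sigma^2/4$ this gives

$$
\operatorname{Var}(A + B) = \frac{\sigma^2}{4} + \frac{\sigma^2}{4} + 2\rho\,\frac{\sigma^2}{4} = \frac{(1 + \rho)\,\sigma^2}{2}.
$$

For $\rho = 1$ (the same variable halved and added to itself) the result is $\sigma^2$, the original variance. For $\rho = 0$ (two uncorrelated halves) it is $\sigma^2/2$. Conversely, if Saturday and Sunday are uncorrelated, identically distributed, and sum to a weekend total with variance $\sigma^2$, then each day must have variance $\sigma^2/2$, not $\sigma^2/4$.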
















normal-distribution






asked Dec 14 '18 at 17:34









Brawling_Bear












1 Answer
$begingroup$

This should not be surprising. When you divide a random variable by $2$ you reduce the variance by a factor of $4$, as you say. When you add two independent copies of the divided variable together, you recover the original mean, but the variance is smaller than that of the original variable. This reflects the fact that the more independent variables you add, the more the sum tends to cluster around the mean, because some are above and some below.















$endgroup$













  • $begingroup$
    Thank you for your answer. I think it makes sense. Nonetheless, would not the same have to occur if we measured the variability of the total number of people admitted on the weekend? There is, as you mention, compensation between people being admitted on Saturday and Sunday, but that is the case regardless of whether we measure the weekend total directly or measure Saturday and Sunday separately. I do not see why the distribution of this variable, the weekend total, would differ based on the way it is measured, provided that both methods are correct.
    $endgroup$
    – Brawling_Bear
    Dec 16 '18 at 9:35










  • $begingroup$
    The above applies specifically to people/weekend compared to people/day. The variance of the weekend is twice the variance per day. The variance would be four times the variance per day if you just took Saturday's result and doubled it.
    $endgroup$
    – Ross Millikan
    Dec 16 '18 at 15:32
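
The factor of two versus four mentioned in this last comment can be checked numerically. Below is a minimal simulation sketch, assuming NumPy is available and that Saturday and Sunday are independent normal variables. The per-day mean and variance (`day_mean`, `day_var`), the sample size, and all variable names are hypothetical values chosen for illustration, not figures from the question.

```python
import numpy as np

# Hypothetical per-day parameters, chosen only for illustration.
day_mean = 500.0
day_var = 100.0
n_weekends = 200_000

rng = np.random.default_rng(0)

# Independent draws for Saturday and Sunday admissions.
saturday = rng.normal(day_mean, np.sqrt(day_var), n_weekends)
sunday = rng.normal(day_mean, np.sqrt(day_var), n_weekends)

# Weekend total built from two independent days.
weekend_independent = saturday + sunday

# "Weekend total" built by doubling Saturday (perfectly correlated halves).
weekend_doubled = 2 * saturday

# Ratios of the weekend variance to the per-day variance.
ratio_independent = weekend_independent.var() / day_var
ratio_doubled = weekend_doubled.var() / day_var

print(f"independent days: variance ratio ~ {ratio_independent:.2f}")
print(f"doubled Saturday: variance ratio ~ {ratio_doubled:.2f}")
```

Under these assumptions the first ratio should come out close to 2 and the second close to 4, which is the distinction between summing two separate days and scaling a single one.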











answered Dec 14 '18 at 17:46









Ross Millikan











