MLE for $\Sigma$ based on sequences of observations
Let $\mathbf{x}_1,\ldots,\mathbf{x}_n$ be a sequence of random vectors, independent and identically distributed as $N_p(\mu_0,\Sigma)$, where $\mu_0$ is known.



(i) Show that the MLE for $\Sigma$ is



$$
\widehat{\Sigma}=\frac{1}{n}\sum\limits_{i=1}^n(\mathbf{x}_i-\mu_0)(\mathbf{x}_i-\mu_0)'
$$



(ii) Now let $\mathbf{y}_1,\ldots,\mathbf{y}_m$ be another sequence of random vectors, independent and identically distributed as $N_p(\mu_1,\Sigma)$, where $\mu_1$ is known. Calculate the MLE for $\Sigma$ based on the sequences of observations $\{\mathbf{x}_1,\ldots,\mathbf{x}_n\}$ and $\{\mathbf{y}_1,\ldots,\mathbf{y}_m\}$. What happens to $\Sigma$ if both $\mu_i$ for $i=0,1$ are assumed to be unknown?



I do not have any questions on part (i); I have already shown that the MLE for $\Sigma$ is indeed $\widehat{\Sigma}$. My concerns are with part (ii). I don't quite understand how $\widehat{\Sigma}$ will change in this scenario, or how to represent it notationally. Also, when $\mu_i$ for $i=0,1$ are assumed to be unknown, do we simply replace $\mu_0$ and $\mu_1$ in the new MLE for $\Sigma$ with $\hat{\mu}_0$ and $\hat{\mu}_1$?



Any form of help is much appreciated.
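
For anyone wanting to sanity-check the part (i) formula numerically, here is a minimal sketch in Python (NumPy only; the setup and variable names are my own illustration, not part of the problem). It simulates from $N_p(\mu_0,\Sigma)$ with $\mu_0$ known and compares the closed-form $\widehat{\Sigma}$ against the true $\Sigma$:

    import numpy as np

    rng = np.random.default_rng(0)
    p, n = 3, 200_000

    # Known mean mu_0 and an arbitrary positive-definite "true" Sigma.
    mu0 = np.array([1.0, -2.0, 0.5])
    A = rng.normal(size=(p, p))
    Sigma = A @ A.T + np.eye(p)

    # Simulate x_1, ..., x_n i.i.d. from N_p(mu_0, Sigma).
    X = rng.multivariate_normal(mu0, Sigma, size=n)

    # Closed-form MLE from part (i): (1/n) * sum_i (x_i - mu0)(x_i - mu0)'.
    D = X - mu0                      # row i holds (x_i - mu0)'
    Sigma_hat = (D.T @ D) / n

    # By consistency of the MLE, this discrepancy shrinks as n grows.
    print(np.max(np.abs(Sigma_hat - Sigma)))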

statistics normal-distribution maximum-likelihood

asked Nov 26 at 2:28 by Nelly

  • I think you need to find the joint probability $$p(\mathbf{x}_1,\ldots,\mathbf{x}_n,\mathbf{y}_1,\ldots,\mathbf{y}_m)$$ and extract the covariance matrix from that.
    – BlackMath
    Nov 26 at 3:31












  • Thus, would claiming $(\mathbf{x}_1,\ldots,\mathbf{x}_n,\mathbf{y}_1,\ldots,\mathbf{y}_m) \sim N_p(n\mu_0+m\mu_1,(n+m)\Sigma)$ and then considering the MLE for $(n+m)\Sigma$ be the correct approach?
    – Nelly
    Nov 26 at 5:39










  • How did you find the covariance in part (i)? I would assume that you found the joint PDF $p(\mathbf{x}_1,\ldots,\mathbf{x}_n)$, which is a multivariate Gaussian distribution, and from there you found the covariance matrix. Here I think it's the same thing, but you need to find the joint PDF $p(\mathbf{x}_1,\ldots,\mathbf{x}_n,\mathbf{y}_1,\ldots,\mathbf{y}_m)$ instead. Since these are independent, this can be written as $$\prod_i p(\mathbf{x}_i)\prod_j p(\mathbf{y}_j)$$
    – BlackMath
    Nov 26 at 6:00












  • In part (i) I derived $\widehat{\Sigma}$ using the maximum likelihood method: finding the log-likelihood, taking the partial derivative with respect to $\Sigma^{-1}$, and setting the result equal to $0$. I didn't necessarily find $\Sigma$ as you're suggesting. Maybe I'm just not following what you're saying?
    – Nelly
    Nov 26 at 7:02








  • I'm describing the same thing, I think. You need first to find the joint PDF, find the log-likelihood of that function, and then do the derivation. The joint PDF in the first case is $$\prod_i p(\mathbf{x}_i)$$ while in the second case it will be $$\prod_i p(\mathbf{x}_i)\prod_j p(\mathbf{y}_j)$$ where $$p(\mathbf{x}_i)=\det\left(2\pi\Sigma\right)^{-1/2}\exp\left(-\frac{1}{2}(\mathbf{x}_i-\mu_0)^T\Sigma^{-1}(\mathbf{x}_i-\mu_0)\right)$$ and $$p(\mathbf{y}_j)=\det\left(2\pi\Sigma\right)^{-1/2}\exp\left(-\frac{1}{2}(\mathbf{y}_j-\mu_1)^T\Sigma^{-1}(\mathbf{y}_j-\mu_1)\right)$$
    – BlackMath
    Nov 26 at 7:13
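
Following the approach outlined in the comments above, here is a sketch of where the algebra leads (my own working, offered as a sketch rather than a verified solution). By independence, the log-likelihood based on both sequences is

$$
\ell(\Sigma)=-\frac{n+m}{2}\log\det(2\pi\Sigma)-\frac{1}{2}\sum_{i=1}^{n}(\mathbf{x}_i-\mu_0)'\Sigma^{-1}(\mathbf{x}_i-\mu_0)-\frac{1}{2}\sum_{j=1}^{m}(\mathbf{y}_j-\mu_1)'\Sigma^{-1}(\mathbf{y}_j-\mu_1).
$$

Differentiating with respect to $\Sigma^{-1}$ and setting the result to zero, exactly as in part (i), gives

$$
\widehat{\Sigma}=\frac{1}{n+m}\left[\sum_{i=1}^{n}(\mathbf{x}_i-\mu_0)(\mathbf{x}_i-\mu_0)'+\sum_{j=1}^{m}(\mathbf{y}_j-\mu_1)(\mathbf{y}_j-\mu_1)'\right].
$$

When $\mu_0$ and $\mu_1$ are also unknown, maximizing the likelihood over the means first replaces them by the sample means $\bar{\mathbf{x}}$ and $\bar{\mathbf{y}}$, so the estimator above holds with $\mu_0,\mu_1$ swapped for $\bar{\mathbf{x}},\bar{\mathbf{y}}$; this follows from profiling the likelihood over the means, not merely substituting notation.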

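And a quick numerical check of that pooled formula, in the same spirit as the earlier sketch (again, the setup is my own illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    p, n, m = 3, 150_000, 100_000

    # Two known means and one shared positive-definite covariance.
    mu0 = np.array([1.0, -2.0, 0.5])
    mu1 = np.array([0.0, 3.0, -1.0])
    A = rng.normal(size=(p, p))
    Sigma = A @ A.T + np.eye(p)

    X = rng.multivariate_normal(mu0, Sigma, size=n)
    Y = rng.multivariate_normal(mu1, Sigma, size=m)

    # Pooled MLE: combine both sums of outer products, divide by n + m.
    DX, DY = X - mu0, Y - mu1
    Sigma_hat = (DX.T @ DX + DY.T @ DY) / (n + m)

    # Discrepancy shrinks as n + m grows.
    print(np.max(np.abs(Sigma_hat - Sigma)))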