Questions Leading From Application of Orthogonal Change of Coordinates to Transform a General Gaussian PDF























My textbook says the following:




Given a vector $\mathbf{x}$ of random variables $x_i$ for $i = 1, \dots, N$, with mean $\bar{\mathbf{x}} = E[\mathbf{x}]$, where $E[\cdot]$ represents the expected value, and $\Delta \mathbf{x} = \mathbf{x} - \bar{\mathbf{x}}$, the covariance matrix $\Sigma$ is an $N \times N$ matrix given by



$$\Sigma = E[\Delta \mathbf{x}\, \Delta \mathbf{x}^T]$$



so that $\Sigma_{ij} = E[\Delta x_i \Delta x_j]$. The diagonal entries of the matrix $\Sigma$ are the variances of the individual variables $x_i$, whereas the off-diagonal entries are the cross-covariance values.



The variables $x_i$ are said to conform to a joint Gaussian distribution if the probability distribution of $\mathbf{x}$ is of the form



$$P(\bar{\mathbf{x}} + \Delta \mathbf{x}) = (2\pi)^{-N/2} \det(\Sigma^{-1})^{1/2} \exp\left(-(\Delta\mathbf{x})^T \Sigma^{-1} (\Delta\mathbf{x})/2\right) \tag{A2.1}$$



for some positive-semidefinite matrix $\Sigma^{-1}$.



$\vdots$



Change of coordinates. Since $\Sigma$ is symmetric and positive-definite, it may be written as $\Sigma = U^T D U$, where $U$ is an orthogonal matrix and $D = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_N^2)$ is diagonal. Writing $\mathbf{x}' = U\mathbf{x}$ and $\bar{\mathbf{x}}' = U\bar{\mathbf{x}}$, and substituting in (A2.1), leads to



$$\begin{align*}\exp\left(-(\mathbf{x} - \bar{\mathbf{x}})^T \Sigma^{-1} (\mathbf{x} - \bar{\mathbf{x}})/2\right) &= \exp\left(-(\mathbf{x}' - \bar{\mathbf{x}}')^T U \Sigma^{-1} U^T (\mathbf{x}' - \bar{\mathbf{x}}')/2\right) \\ &= \exp\left(-(\mathbf{x}' - \bar{\mathbf{x}}')^T D^{-1} (\mathbf{x}' - \bar{\mathbf{x}}')/2\right)\end{align*}$$



Thus, the orthogonal change of coordinates from $\mathbf{x}$ to $\mathbf{x}' = U\mathbf{x}$ transforms a general Gaussian PDF into one with diagonal covariance matrix. A further scaling by $\sigma_i$ in each coordinate direction may be applied to transform it to an isotropic Gaussian distribution. Equivalently stated, a change of coordinates may be applied to transform Mahalanobis distance to ordinary Euclidean distance.




Appendix 2, Multiple View Geometry in Computer Vision by Hartley and Zisserman.
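
For concreteness, the decomposition described above is easy to check numerically. The following is a minimal sketch (NumPy, with a made-up $2 \times 2$ covariance matrix, not an example from the book) verifying that $\Sigma = U^T D U$ with $U$ orthogonal, and that the matrix in the exponent becomes $U \Sigma^{-1} U^T = D^{-1}$:

```python
# Minimal numerical sketch of the decomposition above (NumPy; the matrix below
# is made up for illustration, not taken from the book).
import numpy as np

Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])          # symmetric positive-definite covariance

# Eigendecomposition: Sigma = U^T D U with U orthogonal, D diagonal.
eigvals, eigvecs = np.linalg.eigh(Sigma)
D = np.diag(eigvals)                    # diag(sigma_1^2, ..., sigma_N^2)
U = eigvecs.T                           # rows are eigenvectors, so Sigma = U^T D U

print(np.allclose(U.T @ D @ U, Sigma))           # True: Sigma = U^T D U
print(np.allclose(U @ U.T, np.eye(2)))           # True: U is orthogonal
print(np.allclose(U @ np.linalg.inv(Sigma) @ U.T,
                  np.linalg.inv(D)))             # True: U Sigma^{-1} U^T = D^{-1}
```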



I'm having trouble understanding the following section:




Thus, the orthogonal change of coordinates from $\mathbf{x}$ to $\mathbf{x}' = U\mathbf{x}$ transforms a general Gaussian PDF into one with diagonal covariance matrix. A further scaling by $\sigma_i$ in each coordinate direction may be applied to transform it to an isotropic Gaussian distribution. Equivalently stated, a change of coordinates may be applied to transform Mahalanobis distance to ordinary Euclidean distance.





  1. It says that the orthogonal change of coordinates from $\mathbf{x}$ to $\mathbf{x}' = U\mathbf{x}$ transforms a general Gaussian PDF into one with a diagonal covariance matrix. But in the final expression we have $D^{-1}$, whereas, if I'm not mistaken, the diagonal covariance matrix is $D = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_N^2)$; so $D^{-1}$ is not the diagonal covariance matrix but its inverse. So how is it that the orthogonal change of coordinates transforms a general Gaussian PDF into one with a diagonal covariance matrix? Isn't it rather one with the inverse of the diagonal covariance matrix?


  2. It says that a further scaling by $\sigma_i$ in each coordinate direction may be applied to transform it to an isotropic Gaussian distribution. My search for what an isotropic Gaussian distribution is led me to this question, where it is stated that an isotropic Gaussian distribution is one whose covariance matrix has the simplified form $\Sigma = \sigma^2 I$. Again, how does scaling $\exp(-(\mathbf{x}' - \bar{\mathbf{x}}')^T D^{-1} (\mathbf{x}' - \bar{\mathbf{x}}')/2)$ by $\sigma_i$ transform the general Gaussian PDF into an isotropic Gaussian distribution? I don't see where the $\Sigma = \sigma^2 I$ would come from.


  3. I know that the Mahalanobis distance is $\| \mathbf{X} - \mathbf{Y}\|_{\Sigma} = ((\mathbf{X} - \mathbf{Y})^T \Sigma^{-1}(\mathbf{X} - \mathbf{Y}))^{1/2}$, but this doesn't seem to be the same as any of the expressions above (although it is obviously similar). And where is the Euclidean distance that is mentioned? My research turned up the Euclidean distance matrix, but I do not see how that is a part of any of the above expressions either. (A small sketch of the computation I mean follows below.)
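
For concreteness, this is the quantity I mean by the Mahalanobis distance (a minimal NumPy sketch with made-up vectors and a made-up $\Sigma$; the second distance is simply the same formula with $\Sigma = I$):

```python
# Minimal sketch of the Mahalanobis distance in question (NumPy; the vectors
# and Sigma below are made up for illustration).
import numpy as np

Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])
X = np.array([1.0, 2.0])
Y = np.array([0.5, -1.0])

diff = X - Y
d_mahalanobis = np.sqrt(diff @ np.linalg.inv(Sigma) @ diff)   # ||X - Y||_Sigma
d_euclidean = np.sqrt(diff @ diff)                            # same formula with Sigma = I

print(d_mahalanobis, d_euclidean)
```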



I would greatly appreciate it if people could please take the time to clarify these points.










Tags: linear-algebra probability statistics metric-spaces normal-distribution

asked Nov 22 at 15:36 by The Pointer; edited Nov 22 at 17:00




1 Answer (accepted; answered Nov 22 at 15:50 by J.G., edited Nov 22 at 16:43)


1. You're misinterpreting what they're saying about the covariance matrix. They're not saying whether it's $D$ or $D^{-1}$, nor which shows up in the middle of the exponent in the pdf. The statement is simply that the covariance matrix is now diagonal.

2. The point is that suitable choices of scaling ensure the quadratic function reduces to a sum of squares, all with the same coefficient (which is easiest to consider when it gives $\Sigma=I$). Explicitly, if $y_i := \sqrt{(D^{-1})_{ii}}\,(x_i' - \bar{x}_i')$, the $y$-space pdf is proportional to $\exp(-y^T y/2)$.

3. The pdf is proportional to $\exp(-\frac{1}{2}\Vert X - Y\Vert_\Sigma^2)$. The Euclidean distance is the $\Sigma = I$ special case of the Mahalanobis distance.
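
To illustrate points 2 and 3 numerically, here is a minimal sketch (NumPy, with a made-up $\Sigma$, mean, and sample points; not part of the original answer): after rotating by $U$ and rescaling coordinate $i$ by $1/\sigma_i$, samples have approximately identity covariance, and the Mahalanobis distance between two points in the original coordinates equals the ordinary Euclidean distance between their rescaled images.

```python
# Minimal sketch of points 2 and 3 (NumPy; Sigma, the mean and the samples are
# made up for illustration).
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])
mean = np.array([1.0, -2.0])

# Sigma = U^T D U with U orthogonal, D = diag(sigma_1^2, sigma_2^2).
eigvals, eigvecs = np.linalg.eigh(Sigma)
U = eigvecs.T
W = np.diag(eigvals ** -0.5) @ U      # whitening map: y = W (x - mean)

# Point 2: rotate by U, scale coordinate i by 1/sigma_i -> identity covariance.
x = rng.multivariate_normal(mean, Sigma, size=100_000)
y = (x - mean) @ W.T
print(np.cov(y.T))                    # approximately the 2x2 identity matrix

# Point 3: Mahalanobis distance in x equals Euclidean distance in y.
x1, x2 = x[0], x[1]
d_mahalanobis = np.sqrt((x1 - x2) @ np.linalg.inv(Sigma) @ (x1 - x2))
d_euclidean = np.linalg.norm(W @ (x1 - x2))
print(np.isclose(d_mahalanobis, d_euclidean))   # True
```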





























• 1. Ahh, I think I understand: They're saying that the covariance matrix for ("belonging to") the transformed PDF $f(\mathbf{x}) = \exp(-(\mathbf{x}' - \bar{\mathbf{x}}')^T D^{-1} (\mathbf{x}' - \bar{\mathbf{x}}')/2)$ is diagonal, assuming one were to calculate it (which is not done here)?
            – The Pointer
            Nov 22 at 15:56












• @ThePointer Right. $D^{-1}$ just generalises $\frac{1}{\sigma^2}$, so it's kind of like saying the univariate case doesn't calculate the variance, but rather its reciprocal.
            – J.G.
            Nov 22 at 15:57










          • Ok, thanks for that. Can you please elaborate on 2? I'm struggling to understand this.
            – The Pointer
            Nov 22 at 16:40










          • @ThePointer See my edit.
            – J.G.
            Nov 22 at 16:43










          • I think I understand it all now. The key was to interpret 2 and 3 within the context of the correct interpretation of 1. Thank you for the assistance. :)
            – The Pointer
            Nov 22 at 16:57












