What is the scalar derivative?












I quote paragraph 2.5 of The Matrix Cookbook: "Assume $F(X)$ to be a differentiable function of each of the elements of $X$... $f(\cdot)$ is the scalar derivative of $F(\cdot)$." Here $X$ is a matrix.

What is the scalar derivative? It is not defined in that document, and I have trouble finding a definition with Mister Google.

By the way, I'm puzzled by formula (100) of that document:
$$\frac{\partial}{\partial X} \mathsf{Tr}(XA) = A^T$$

$X \mapsto \mathsf{Tr}(XA)$ is a linear form defined on the vector space of matrices, and therefore its derivative is itself everywhere:

$$\frac{\partial}{\partial X} \mathsf{Tr}(XA).H = \mathsf{Tr}(HA)$$

What is the link with $A^T$?







matrices derivatives
















asked Jan 9 '18 at 18:35









mathcounterexamples.net
















Mister Wikipedia provides some help.
– Paul Sinclair, Jan 10 '18 at 0:33










2 Answers
In applied matrix calculus for deep learning, the term "scalar derivative" is used to make explicit that the partial derivative of the function with respect to a single variable is a scalar and not a vector.

@mathcounterexamples.net See:

  • http://cs231n.stanford.edu/vecDerivs.pdf
  • https://arxiv.org/pdf/1802.01528.pdf







answered Dec 26 '18 at 14:30
Matthew Arthur
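As a rough numerical illustration of that reading, the sketch below treats each partial derivative of $\mathsf{Tr}(XA)$ with respect to a single entry $X_{ij}$ as a scalar and stacks those scalars in the shape of $X$. It assumes NumPy is available; the helper names `trace_xa` and `elementwise_partials` and the random test matrices are illustrative choices, not something taken from the thread or from The Matrix Cookbook.

```python
import numpy as np


def trace_xa(X, A):
    """Scalar-valued function X -> Tr(XA)."""
    return np.trace(X @ A)


def elementwise_partials(func, X, eps=1e-6):
    """Central finite differences: one scalar partial per entry X[i, j]."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps  # perturb a single entry of X
            G[i, j] = (func(X + E) - func(X - E)) / (2 * eps)
    return G


# Illustrative random inputs (assumption: any shapes with XA square work).
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
A = rng.standard_normal((4, 3))
H = rng.standard_normal((3, 4))

G = elementwise_partials(lambda M: trace_xa(M, A), X)

# Each G[i, j] is a scalar; stacked in the shape of X they reproduce A^T.
grad_matches_A_T = np.allclose(G, A.T, atol=1e-6)

# The differential Tr(HA) equals the entrywise pairing of A^T with H.
pairing_matches = np.isclose(np.trace(H @ A), np.sum(A.T * H))

print(grad_matches_A_T, pairing_matches)
```

Both checks should come out true if formula (100) is read entrywise, i.e. $\partial\,\mathsf{Tr}(XA)/\partial X_{ij} = A_{ji}$. Finite differences are used only to keep the sketch dependency-free; an automatic differentiation library would serve equally well.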























The simplest explanation is that the word "scalar" is a typo.

The formula itself seems correct. For instance, let
$$\eqalign{
F(x) &= \sin(x) \cr
f(x) &= \frac{dF}{dx} = \cos(x) \cr
}$$
Then, for a matrix argument $A$, one has the result
$$\eqalign{
\frac{\partial\,{\rm Tr}(\sin(A))}{\partial A} &= \cos(A)^T \cr
}$$
...or $\cos(A)$, depending on which layout convention you prefer.







answered Jan 10 '18 at 3:20
greg
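As a hedged aside on the question's formula (100) and on why the layout convention matters: the sketch below assumes the gradient is identified with the differential through the Frobenius inner product $\langle G, H \rangle_F = \mathsf{Tr}(G^T H)$, a common convention that the thread itself does not state.

$$\mathsf{Tr}(HA) = \sum_{i,j} H_{ij} A_{ji} = \sum_{i,j} (A^T)_{ij} H_{ij} = \langle A^T, H \rangle_F$$

Under that assumption the linear form $H \mapsto \mathsf{Tr}(HA)$ and the matrix $A^T$ describe the same derivative, and entrywise $\partial\,\mathsf{Tr}(XA)/\partial X_{ij} = A_{ji}$. The same argument applied to a monomial $F(x) = x^k$ on square matrices, with $f(x) = k x^{k-1}$, gives

$$d\,\mathsf{Tr}(X^k) = \sum_{m=0}^{k-1} \mathsf{Tr}\left(X^m \, dX \, X^{k-1-m}\right) = k\,\mathsf{Tr}\left(X^{k-1} \, dX\right) = \left\langle \left(k X^{k-1}\right)^T, dX \right\rangle_F$$

so $\partial\,\mathsf{Tr}(X^k)/\partial X = f(X)^T$. Summing the power series of $\sin$ term by term is consistent with the $\cos(A)^T$ result quoted in the answer; dropping the transpose corresponds to the other layout convention.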





























