Distance between two Random Variables by comparing Cumulative Distribution Functions























Suppose $X$ and $Y$ are two random variables. Define the distance between $X$ and $Y$, $d(X, Y)$, as
$$d(X, Y) = \int_{-\infty}^{\infty}\left|\mathbb{P}(X < t) - \mathbb{P}(Y < t)\right|\,dt$$
whenever this integral makes sense. Does this distance have a name? (Or do you know of any similar constructions?) I am interested in examples for which the total variation distance is large but this distance is not so large.
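
For concreteness, here is a minimal numerical sketch of the quantity defined above, assuming NumPy and SciPy are available; the grid bounds, the two example distributions, and the helper name cdf_distance are illustrative choices, not part of the question.

    import numpy as np
    from scipy import integrate, stats

    def cdf_distance(cdf_x, cdf_y, lo, hi, num=200_001):
        """Approximate the integral of |F_X(t) - F_Y(t)| over [lo, hi] on a uniform grid."""
        t = np.linspace(lo, hi, num)
        return integrate.trapezoid(np.abs(cdf_x(t) - cdf_y(t)), t)

    # Two normals differing only by a location shift; for a pure shift the integral equals the shift.
    X = stats.norm(loc=0.0, scale=1.0)
    Y = stats.norm(loc=0.5, scale=1.0)
    print(cdf_distance(X.cdf, Y.cdf, lo=-10.0, hi=10.0))   # approximately 0.5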










      probability probability-distributions random-variables






      edited Jun 18 '13 at 14:27

























      asked Jun 18 '13 at 4:07 by AbleArcher
          4 Answers



























          It seems that, for any distributions $\mu$ and $\nu$,
          $$
          d(\mu,\nu)=\inf\{\mathbb E(|X-Y|) \mid \mathbb P_X=\mu,\ \mathbb P_Y=\nu\}.
          $$
          This is called the Wasserstein distance (for the $L^1$ distance), or the Monge-Kantorovich-Rubinstein metric, or some other name.



          By comparison, the total variation distance $d_{TV}$ is defined as
          $$
          d_{TV}(\mu,\nu)=\inf\{\mathbb P(X\ne Y) \mid \mathbb P_X=\mu,\ \mathbb P_Y=\nu\}.
          $$
          If $\mu$ and $\nu$ are measures on the integers, the inequality $\mathbb 1_{x\ne y}\leqslant|x-y|$ for integers $(x,y)$ shows that $d_{TV}\leqslant d$ (but no inequality $d\leqslant c\cdot d_{TV}$ can hold).



          For measures on the real line, no inequality $d_{TV}\leqslant c\cdot d$ can hold either, as the example of Dirac masses at $x$ and $y$ shows when $x-y\to0$.
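
The last point is easy to check numerically. A minimal sketch, assuming SciPy is available; the helper tv_discrete and the particular offsets eps are illustrative choices. For Dirac masses at $x$ and $y$, the total variation distance stays at $1$ while the $L^1$ Wasserstein distance equals $|x-y|$, which can be made as small as you like.

    from scipy.stats import wasserstein_distance

    def tv_discrete(p, q):
        """Total variation between discrete distributions given as {point: mass} dicts."""
        support = set(p) | set(q)
        return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)

    x = 0.0
    for eps in (1.0, 0.1, 0.01):
        y = x + eps
        w1 = wasserstein_distance([x], [y])     # W_1 between Dirac masses at x and y: equals |x - y|
        tv = tv_discrete({x: 1.0}, {y: 1.0})    # two distinct point masses: always 1
        print(f"eps={eps}: W1={w1:.3g}, TV={tv:.3g}")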






          answered Jun 19 '13 at 13:26 by Did (accepted answer)





















          • Thanks for the info! I am unsure whether the Wasserstein distance is equivalent to the distance I alluded to (is this what you were suggesting by "it seems that"?)
            – AbleArcher
            Jun 20 '13 at 3:55










          • If I were you, I would try hard to show that your $d$ is indeed the Wasserstein distance.
            – Did
            Jun 20 '13 at 5:28










          • They may be equivalent, but I don't think they are the same, as evaluating them on a uniform distribution on $[0, 1)$ and $[1, 2)$ shows.
            – AbleArcher
            Jun 20 '13 at 14:43










          • For Uni(0,1) and Uni(1,2), one gets distance 1 with both formulas...
            – Did
            Jun 20 '13 at 16:24










          • You're right. My error.
            – AbleArcher
            Jun 20 '13 at 16:31

































          It looks very close to what is called the total variation distance between two probability measures.






          answered Jun 18 '13 at 6:16 by Caran-d'Ache





















          • Total variation distance seems similar, but may be quite different. Take the example of two random variables that have disjoint, but interlaced, supports. Total variation will be as large as possible, but the distance described above will be relatively small (compared to supports that are widely separated on R).
            – AbleArcher
            Jun 18 '13 at 14:36










          • @thethuthinnang What do you mean by "disjoint but interlaced" supports?
            – Avitus
            Jun 18 '13 at 14:47










          • @thethuthinnang Do you mean that the supports of the two random variables are disjoint but interlace with each other? In any case, the total variation distance is the closest thing to this that I have seen in the literature.
            – Caran-d'Ache
            Jun 18 '13 at 16:27








          • I think he means something like a situation where the supports of $X$ and $Y$ are $\bigcup_{n=1}^\infty (2n+1,2n+2)$ and $\bigcup_{n=1}^\infty (2n,2n+1)$ respectively.
            – Rookatu
            Jun 18 '13 at 19:01










          • I mean exactly the sort of situation Rookatu describes. If you add further that X is n on (2n + 1, 2n + 2), and Y is n on (2n, 2n + 1), with the measures of those intervals being the same under the corresponding probability measures, then we have a situation where two random variables are very similar, but total variation sees them as being as different as possible.
            – AbleArcher
            Jun 18 '13 at 21:14
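
Here is a minimal sketch of the kind of example described in this thread, assuming NumPy is available; the particular atoms, weights, and grid are illustrative choices. Two discrete distributions with disjoint, interlaced supports have total variation distance $1$, while the CDF-based distance $d$ from the question only sees the small offset between the supports.

    import numpy as np

    atoms_x = np.arange(10, dtype=float)      # X uniform on {0, 1, ..., 9}
    atoms_y = atoms_x + 0.1                   # Y uniform on {0.1, 1.1, ..., 9.1}: disjoint, interlaced
    w = np.full(10, 0.1)                      # mass 1/10 on each atom

    # Disjoint supports mean no cancellation between the two pmfs, so TV is maximal:
    tv = 0.5 * (w.sum() + w.sum())            # = 1.0

    # d(X, Y): integrate |F_X(t) - F_Y(t)| with the step CDFs evaluated on a fine uniform grid.
    t = np.linspace(-1.0, 11.0, 240_001)
    F_x = (t[:, None] >= atoms_x[None, :]).astype(float) @ w
    F_y = (t[:, None] >= atoms_y[None, :]).astype(float) @ w
    d = np.sum(np.abs(F_x - F_y)) * (t[1] - t[0])

    print(tv, round(d, 3))                    # 1.0 versus roughly 0.1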































          I do not know any special name for that function; in any case, in order to satisfy the property

          $$d(X,Y)=0 \Leftrightarrow X=Y$$

          one needs to work with the equivalence classes of (continuous) random variables that are equal in distribution. On other constructions: divergences are commonly used to introduce distance-like measures between random variables. You can check "Methods of Information Geometry" by Amari and Nagaoka for the constructions and definitions. If you are searching for a distance in the strict mathematical sense, you should probably have a look at the Information Value and the Hellinger distance.
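
Since the Hellinger distance comes up here, a minimal sketch of it for the discrete case may help, assuming NumPy is available; the two example pmfs are illustrative. For probability vectors $p$ and $q$, $H(p,q)=\frac{1}{\sqrt 2}\lVert \sqrt p-\sqrt q\rVert_2$, and it always lies in $[0,1]$.

    import numpy as np

    def hellinger(p, q):
        """Hellinger distance between two discrete distributions given as probability vectors."""
        p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
        return np.sqrt(0.5) * np.linalg.norm(np.sqrt(p) - np.sqrt(q))

    p = np.array([0.5, 0.3, 0.2, 0.0])
    q = np.array([0.4, 0.3, 0.2, 0.1])
    print(hellinger(p, q))        # strictly between 0 and 1
    print(hellinger(p, p))        # 0.0 for identical distributions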






          answered Jun 18 '13 at 7:46 by Avitus





















          • This is a distance, but between distributions, not between random variables.
            – Did
            Jun 18 '13 at 9:51










          • @Did: sorry, but I can not see that point; in the question it is stated that $X$, $Y$ are random variables, and the distance is $d(X,Y)$.
            – Avitus
            Jun 18 '13 at 9:56










          • I know what is stated in the question (since I can read). My comment explains what should have been written.
            – Did
            Jun 18 '13 at 9:58










          • @Did happy to $read$ that!
            – Avitus
            Jun 18 '13 at 10:04










          • I am interested in comparing distributions, not random variables. Feel free to edit the question to reflect that. I didn't know how to word it.
            – AbleArcher
            Jun 18 '13 at 14:43































          In the paper "Calculation of the Wasserstein Distance Between Probability Distributions on the Line", it is shown that your distance is precisely the Wasserstein metric with $p=1$, if your space is the real line.

          In particular, given two probability measures $P_X$ and $P_Y$ on $\mathbb R$ (with corresponding CDFs $F_X$ and $F_Y$), the Wasserstein metric (with $p=1$) becomes
          $$
          W_1(P_X,P_Y)=\int_{-\infty}^{+\infty}|F_X(t)-F_Y(t)|\,dt,
          $$
          which is exactly your measure. The result can be extended to probability measures defined on $\mathbb R^n$.

          Moreover, if your space is bounded in $\mathbb R$, $W_1$ metrizes weak convergence. That is, letting $\rightharpoonup$ denote weak convergence, we have
          $$
          P_n \rightharpoonup P \qquad\text{if and only if}\qquad \lim_{n\to\infty} W_1(P_n,P)=0.
          $$
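
A quick numerical check of this identity, assuming NumPy and SciPy are available; the grid and sample sizes are illustrative choices. For Uniform(0,1) versus Uniform(1,2), both the CDF integral and scipy.stats.wasserstein_distance applied to large samples give a value close to $1$, consistent with the comment under the accepted answer.

    import numpy as np
    from scipy import integrate, stats

    X = stats.uniform(loc=0.0, scale=1.0)     # Uniform(0, 1)
    Y = stats.uniform(loc=1.0, scale=1.0)     # Uniform(1, 2)

    # Left-hand side: the integral of |F_X - F_Y| over a grid covering both supports.
    t = np.linspace(-1.0, 3.0, 400_001)
    lhs = integrate.trapezoid(np.abs(X.cdf(t) - Y.cdf(t)), t)

    # Right-hand side: SciPy's W_1 between large i.i.d. samples from the two laws.
    rhs = stats.wasserstein_distance(X.rvs(size=200_000, random_state=0),
                                     Y.rvs(size=200_000, random_state=1))

    print(round(lhs, 3), round(rhs, 3))       # both close to 1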






          answered Nov 22 at 11:02 by Coralio




















