The Median Minimizes the Sum of Absolute Deviations (The $ {L}_{1} $ Norm)











up vote
73
down vote

favorite
60













Suppose we have a set $S$ of real numbers. Show that
$$sum_{sin S}|s-x| $$
is minimal if $x$ is equal to the median.




This is a sample exam question of one of the exams that I need to take and I don't know how to proceed.










share|cite|improve this question




















  • 13




    Please replace THE median by ANY median.
    – Did
    Feb 25 '12 at 18:47

















up vote
73
down vote

favorite
60













Suppose we have a set $S$ of real numbers. Show that
$$sum_{sin S}|s-x| $$
is minimal if $x$ is equal to the median.




This is a sample exam question of one of the exams that I need to take and I don't know how to proceed.










share|cite|improve this question




















  • 13




    Please replace THE median by ANY median.
    – Did
    Feb 25 '12 at 18:47















up vote
73
down vote

favorite
60









up vote
73
down vote

favorite
60






60






Suppose we have a set $S$ of real numbers. Show that
$$sum_{sin S}|s-x| $$
is minimal if $x$ is equal to the median.




This is a sample exam question of one of the exams that I need to take and I don't know how to proceed.










share|cite|improve this question
















Suppose we have a set $S$ of real numbers. Show that
$$sum_{sin S}|s-x| $$
is minimal if $x$ is equal to the median.




This is a sample exam question of one of the exams that I need to take and I don't know how to proceed.







optimization absolute-value median






share|cite|improve this question















share|cite|improve this question













share|cite|improve this question




share|cite|improve this question








edited Jun 1 at 17:16









Royi

3,30012351




3,30012351










asked Feb 25 '12 at 16:48









hattenn

5941611




5941611








  • 13




    Please replace THE median by ANY median.
    – Did
    Feb 25 '12 at 18:47
















  • 13




    Please replace THE median by ANY median.
    – Did
    Feb 25 '12 at 18:47










13




13




Please replace THE median by ANY median.
– Did
Feb 25 '12 at 18:47






Please replace THE median by ANY median.
– Did
Feb 25 '12 at 18:47












8 Answers
8






active

oldest

votes

















up vote
57
down vote



accepted










Introduction: The solution below is essentially the same as the solution given by Brian M. Scott, but it will take a lot longer. You are expected to assume that $S$ is a finite set. with say $k$ elements. Line them up in order, as $s_1<s_2<cdots <s_k$.



The situation is a little different when $k$ is odd than when $k$ is even. In particular, if $k$ is even there are (depending on the exact definition of median) many medians. We tell the story first for $k$ odd.

Recall that $|x-s_i|$ is the distance between $x$ and $s_i$, so we are trying to minimize the sum of the distances. For example, we have $k$ people who live at various points on the $x$-axis. We want to find the point(s) $x$ such that the sum of the travel distances of the $k$ people to $x$ is a minimum.



The story: Imagine that the $s_i$ are points on the $x$-axis. For clarity, take $k=7$. Start from well to the left of all the $s_i$, and take a tiny step, say of length $epsilon$, to the right. Then you have gotten $epsilon$ closer to every one of the $s_i$, so the sum of the distances has decreased by $7epsilon$.



Keep taking tiny steps to the right, each time getting a decrease of $7epsilon$. This continues until you hit $s_1$. If you now take a tiny step to the right, then your distance from $s_1$ increases by $epsilon$, and your distance from each of the remaining $s_i$ decreases by $epsilon$. What has happened to the sum of the distances? There is a decrease of $6epsilon$, and an increase of $epsilon$, for a net decrease of $5epsilon$ in the sum.



This continues until you hit $s_2$. Now, when you take a tiny step to the right, your distance from each of $s_1$ and $s_2$ increases by $epsilon$, and your distance from each of the five others decreases by $epsilon$, for a

net decrease of $3epsilon$.



This continues until you hit $s_3$. The next tiny step gives an increase of $3epsilon$, and a decrease of $4epsilon$, for a net decrease of $epsilon$.



This continues until you hit $s_4$. The next little step brings a total increase of $4epsilon$, and a total decrease of $3epsilon$, for an increase of $epsilon$. Things get even worse when you travel further to the right. So the minimum sum of distances is reached at $s_4$, the median.



The situation is quite similar if $k$ is even, say $k=6$. As you travel to the right, there is a net decrease at every step, until you hit $s_3$. When you are between $s_3$ and $s_4$, a tiny step of $epsilon$ increases your distance from each of $s_1$, $s_2$, and $s_3$ by $epsilon$. But it decreases your distance from each of the three others, for no net gain. Thus any $x$ in the interval from $s_3$ to $s_4$, including the endpoints, minimizes the sum of the distances. In the even case, I prefer to say that any point between the two "middle" points is a median. So the conclusion is that the points that minimize the sum are the medians. But some people prefer to define the median in the even case to be the average of the two "middle" points. Then the median does minimize the sum of the distances, but some other points also do.






share|cite|improve this answer






























    up vote
    37
    down vote













    We're basically after:
    $$ arg min_{x} sum_{i = 1}^{N} left| {s}_{i} - x right| $$



    One should notice that $ frac{mathrm{d} left | x right | }{mathrm{d} x} = operatorname{sign} left( x right) $ (Being more rigorous would say it is a Sub Gradient of the non smooth $ {L}_{1} $ Norm function).

    Hence, deriving the sum above yields $ sum_{i = 1}^{N} operatorname{sign} left( {s}_{i} - x right) $.

    This equals to zero only when the number of positive items equals the number of negative which happens when $ x = operatorname{median} left{ {s}_{1}, {s}_{2}, cdots, {s}_{N} right} $.



    One should notice that the median of a discrete group is not uniquely defined.

    Moreover, it is not necessarily an item within the group.






    share|cite|improve this answer



















    • 3




      Using derivatives here is overkill; the problem can be done by more elementary methods. $qquad$
      – Michael Hardy
      Oct 28 '16 at 17:06


















    up vote
    29
    down vote













    Suppose that the set $S$ has $n$ elements, $s_1<s_2<dots<s_n$. If $x<s_1$, then $$f(x)=sum_{sin S}|s-x|=sum_{sin S}(s-x)=sum_{k=1}^n(s_k-x);.tag{1}$$ As $x$ increases, each term of $(1)$ decreases until $x$ reaches $s_1$, therefore $f(s_1)<f(x)$ for all $x<s_1$.



    Now suppose that $s_kle xle x+dle s_{k+1}$. Then



    $$begin{align*}f(x+d)&=sum_{i=1}^kBig(x+d-s_iBig)+sum_{i=k+1}^nBig(s_i-(x+d)Big)\
    &=dk+sum_{i=1}^k(x-s_i)-d(n-k)+sum_{i=k+1}^n(s_i-x)\
    &=d(2k-n)+sum_{i=1}^k(x-s_i)+sum_{i=k+1}^n(s_i-x)\
    &=d(2k-n)+f(x);,
    end{align*}$$



    so $f(x+d)-f(x)=d(2k-n)$. This is negative if $2k<n$, zero if $2k=n$, and positive if $2k>n$. Thus, on the interval $[s_k,s_{k+1}]$



    $$f(x)text{ is }begin{cases}
    text{decreasing},&text{if }2k<n\
    text{constant},&text{if }2k=n\
    text{increasing},&text{if }2k>n;.
    end{cases}$$



    From here it shouldn’t be too hard to show that $f(x)$ is minimal when $x$ is the median of $S$.






    share|cite|improve this answer



















    • 3




      But a small typo. In "As x increases, each term of (1) decreases until x reaches x_1, so" You intended to say s_1 instead of x_1.
      – Neo M Hacker
      Sep 28 '17 at 3:23




















    up vote
    10
    down vote













    You want the median of $n$ numbers. Say $x$ is bigger than $12$ of them and smaller than $8$ of them (so $n=20$). If $x$ increases, it's getting closer to $8$ of the numbers and farther from $12$ of them, so the sum of the distances gets greater. And if $x$ decreases, it's getting closer to $12$ of them and farther from $8$ of them, so the sum of the distances gets smaller.



    A similar thing happens if $x$ is smaller than more of the $n$ numbers than $x$ is bigger than.



    But if $x$ is smaller than $10$ of them and bigger than $10$ of them, then when $x$ moves, it's getting farther from $10$ of them and closer to just as many of them, so the sum of the distances is not changing.



    So the sum of the distances is smallest when the number of data points less than $x$ is the same as the number of data points bigger than $x$.






    share|cite|improve this answer






























      up vote
      6
      down vote













      Starting with $$f(x)=sum_{i=1}^n |s_i-x|$$



      Assume we rearranged our terms such that $s_1<s_2<cdots<s_n$



      We first proceed by making the following observation $$sum_{i=1}^n |s_i-x| = sum_{i=2}^{n-1} |s_i-x| +(s_n -s_1) quad text{when} quad x in [s_1,s_n]$$



      Now suppose that $n$ is odd, then by applying the above identity repeatedly we get $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n+1}2}-x|+(s_n -s_1)+(s_{n-1}-s_2)+cdots+(s_{frac{n+3}2}-s_{frac{n-1}2})$$ or in other words $$f(x)=|s_{frac{n+1}{2}}-x|+text{constant}$$



      This is just the absolute value function with its vertex being at $(s_{frac{n+1}{2}},text{constant})$, the minimum of the absolute value function occurs at its vertex, therefore $s_{frac{n+1}{2}}$(median) minmizes $f(x)$.



      Now suppose $n$ is even, again by using our identity, we have $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n}2}-x|+|s_{frac{n+2}2}-x| + text{constant}$$



      Where the minimum occurs at $f'(x)=0$(or when not defined), therefore by differentiating and setting $f'(x)$ to zero we get $$dfrac{|s_{frac{n}{2}}-x|}{s_{frac{n}{2}}-x}+dfrac{|s_{frac{n+2}{2}}-x|}{s_{frac{n+2}{2}}-x}=0$$



      Observe that $s:=dfrac{s_{frac{n+2}{2}}+s_{frac{n}{2}}}{2}$(median) satisfies the above equation, since $s$ is halfway between $s_{frac{n}{2}}$ and $s_{frac{n+2}{2}}$ $$s_{frac{n}{2}}-s=-(s_{frac{n+2}{2}}-s)$$ that is by setting $x=s$ we get $$dfrac{|s_{frac{n}{2}}-s|}{s_{frac{n}{2}}-s}+dfrac{|s_{frac{n}{2}}-s|}{-(s_{frac{n}{2}}-s)}=0$$



      Therefore $s$ is a minimum.






      share|cite|improve this answer























      • I think some theory about minimum being where $f'(x) = 0$ for non differentiable functions is needed here
        – Guerlando OCs
        Aug 27 at 0:56




















      up vote
      3
      down vote













      Consider two $x_i$'s $x_1$ and $x_2$,



      For $x_1leq aleq x_2$, $sum_{i=1}^{2}|x_i-a|=|x_1-a|+|x_2-a|=a-x_1+x_2-a=x_2-x_1$



      For $alt x_1$, $sum_{i=1}^{2}|x_i-a|=x_1-a+x_2-a=x_1+x_2-2agt x_1+x_2-2x_1=x_2-x_1$



      For $agt x_2$,$sum_{i=1}^{2}|x_i-a|=-x_1+a-x_2+a=-x_1-x_2+2agt -x_1-x_2+2x_2=x_2-x_1$



      $implies$for any two $x_i$'s the sum of the absolute values of the deviations is minimum when $x_1leq aleq x_2$ or $ain[x_1,x_2]$.



      When $n$ is odd,
      $$
      sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n-1}{2}}-a|+|x_{tfrac{n+1}{2}}-a|+|x_{tfrac{n+3}{2}}-a|+....+|x_{n-1}-a|+|x_n-a|
      $$
      consider the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$. If $a$ is a member of all these intervals. i.e,$[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$,



      using the above theorem, we can say that all the terms in the sum except $|x_{tfrac{n+1}{2}}-a|$ are minimized. So
      $$
      sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n+3}{2}}-x_{tfrac{n-1}{2}})+|x_{tfrac{n+1}{2}}-a|=|x_{tfrac{n+1}{2}}-a|+text{costant}
      $$
      Now since the derivative of modulus function is signum function, $f'(a)=text{sgn}(x_{tfrac{n+1}{2}}-a)=0$ for $a=x_{tfrac{n+1}{2}}=text{Median}$



      $implies$ When $n$ is odd,the median minimizes the sum of absolute values of the deviations.



      When $n$ is even,
      $$
      sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n}{2}}-a|+|x_{tfrac{n}{2}+1}-a|+....+|x_{n-1}-a|+|x_n-a|\
      $$
      If $a$ is a member of all the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, $ain[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$,



      $$
      sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n}{2}+1}-x_{tfrac{n}{2}})
      $$



      $implies$ When $n$ is even, any number in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, including the median, minimizes the sum of absolute values of the deviations. For example consider the series:$2, 4, 5, 10$, median, $M=4.5$.



      $$
      sum_{i=1}^4|x_i-M|=2.5+0.5+0.5+5.5=9
      $$
      If you take any other value in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 4.1
      $$
      sum_{i=1}^4|x_i-4.1|=2.1+0.1+0.9+5.9=9
      $$
      For any value outside the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 5.2
      $$
      sum_{i=1}^4|x_i-5.2|=3.2+1.2+0.2+4.8=9.4
      $$






      share|cite|improve this answer






























        up vote
        1
        down vote













        Suppose $S$ is finite (with cardinal $s$), without repetitions, and ordered. Then the sum of absolute values is continuous (sum of continuous functions), and piecewise linear (hence differentiable), with left-most slope $-s$. By induction, the slope increases by 2 for each interval from left to right, with right-most slope $+s$. Hence the piece-wise slope first reaches either $-1$ or $0$ at index $leftlfloor frac{s+1}{2}rightrfloor$, and $0$ or $+1$ at index $leftlceil frac{s+1}{2}rightrceil$.



        Hence the function attains its minima in the interval $left[leftlfloor frac{s+1}{2}rightrfloor, leftlceil frac{s+1}{2}rightrceilright]$, which reduces to a singleton when $s$ is odd.



        The notion of median for continuous functions is detailed in Sunny Garlang Noah, The Median of a Continuous Function, Real Anal. Exchange, 2007






        share|cite|improve this answer






























          up vote
          1
          down vote













          Consider two real numbers $a<b$. Then the objective becomes



          $$dist(a,b) = |x-a|+|x-b|$$



          This expression is minimum when $aleq x leq b$. It can be proved by calculating the objective on 3 cases ($x<a, aleq xleq b, x>b$).



          Now consider the general case where $S$ has $n$ elements. Sort them in increasing order as $S_1, S_2, ldots, S_n$.



          Pair the smallest and the largest numbers. As explained above, $dist(S_1,S_n)$ is minimum when $S_1leq xleq S_n$. Remove these two elements from the list and continue this procedure until there is at most one element left in the set.



          If there is an element $S_i$ left, then $x=S_i$ minimizes $dist(x-S_i)$. It also lies between all the pairs.



          In the case of even elements, finally the sequence will be empty. As in the case above, median lies between all the pairs.






          share|cite|improve this answer























            Your Answer





            StackExchange.ifUsing("editor", function () {
            return StackExchange.using("mathjaxEditing", function () {
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
            });
            });
            }, "mathjax-editing");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "69"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            noCode: true, onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f113270%2fthe-median-minimizes-the-sum-of-absolute-deviations-the-l-1-norm%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            8 Answers
            8






            active

            oldest

            votes








            8 Answers
            8






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            57
            down vote



            accepted










            Introduction: The solution below is essentially the same as the solution given by Brian M. Scott, but it will take a lot longer. You are expected to assume that $S$ is a finite set. with say $k$ elements. Line them up in order, as $s_1<s_2<cdots <s_k$.



            The situation is a little different when $k$ is odd than when $k$ is even. In particular, if $k$ is even there are (depending on the exact definition of median) many medians. We tell the story first for $k$ odd.

            Recall that $|x-s_i|$ is the distance between $x$ and $s_i$, so we are trying to minimize the sum of the distances. For example, we have $k$ people who live at various points on the $x$-axis. We want to find the point(s) $x$ such that the sum of the travel distances of the $k$ people to $x$ is a minimum.



            The story: Imagine that the $s_i$ are points on the $x$-axis. For clarity, take $k=7$. Start from well to the left of all the $s_i$, and take a tiny step, say of length $epsilon$, to the right. Then you have gotten $epsilon$ closer to every one of the $s_i$, so the sum of the distances has decreased by $7epsilon$.



            Keep taking tiny steps to the right, each time getting a decrease of $7epsilon$. This continues until you hit $s_1$. If you now take a tiny step to the right, then your distance from $s_1$ increases by $epsilon$, and your distance from each of the remaining $s_i$ decreases by $epsilon$. What has happened to the sum of the distances? There is a decrease of $6epsilon$, and an increase of $epsilon$, for a net decrease of $5epsilon$ in the sum.



            This continues until you hit $s_2$. Now, when you take a tiny step to the right, your distance from each of $s_1$ and $s_2$ increases by $epsilon$, and your distance from each of the five others decreases by $epsilon$, for a

            net decrease of $3epsilon$.



            This continues until you hit $s_3$. The next tiny step gives an increase of $3epsilon$, and a decrease of $4epsilon$, for a net decrease of $epsilon$.



            This continues until you hit $s_4$. The next little step brings a total increase of $4epsilon$, and a total decrease of $3epsilon$, for an increase of $epsilon$. Things get even worse when you travel further to the right. So the minimum sum of distances is reached at $s_4$, the median.



            The situation is quite similar if $k$ is even, say $k=6$. As you travel to the right, there is a net decrease at every step, until you hit $s_3$. When you are between $s_3$ and $s_4$, a tiny step of $epsilon$ increases your distance from each of $s_1$, $s_2$, and $s_3$ by $epsilon$. But it decreases your distance from each of the three others, for no net gain. Thus any $x$ in the interval from $s_3$ to $s_4$, including the endpoints, minimizes the sum of the distances. In the even case, I prefer to say that any point between the two "middle" points is a median. So the conclusion is that the points that minimize the sum are the medians. But some people prefer to define the median in the even case to be the average of the two "middle" points. Then the median does minimize the sum of the distances, but some other points also do.






            share|cite|improve this answer



























              up vote
              57
              down vote



              accepted










              Introduction: The solution below is essentially the same as the solution given by Brian M. Scott, but it will take a lot longer. You are expected to assume that $S$ is a finite set. with say $k$ elements. Line them up in order, as $s_1<s_2<cdots <s_k$.



              The situation is a little different when $k$ is odd than when $k$ is even. In particular, if $k$ is even there are (depending on the exact definition of median) many medians. We tell the story first for $k$ odd.

              Recall that $|x-s_i|$ is the distance between $x$ and $s_i$, so we are trying to minimize the sum of the distances. For example, we have $k$ people who live at various points on the $x$-axis. We want to find the point(s) $x$ such that the sum of the travel distances of the $k$ people to $x$ is a minimum.



              The story: Imagine that the $s_i$ are points on the $x$-axis. For clarity, take $k=7$. Start from well to the left of all the $s_i$, and take a tiny step, say of length $epsilon$, to the right. Then you have gotten $epsilon$ closer to every one of the $s_i$, so the sum of the distances has decreased by $7epsilon$.



              Keep taking tiny steps to the right, each time getting a decrease of $7epsilon$. This continues until you hit $s_1$. If you now take a tiny step to the right, then your distance from $s_1$ increases by $epsilon$, and your distance from each of the remaining $s_i$ decreases by $epsilon$. What has happened to the sum of the distances? There is a decrease of $6epsilon$, and an increase of $epsilon$, for a net decrease of $5epsilon$ in the sum.



              This continues until you hit $s_2$. Now, when you take a tiny step to the right, your distance from each of $s_1$ and $s_2$ increases by $epsilon$, and your distance from each of the five others decreases by $epsilon$, for a

              net decrease of $3epsilon$.



              This continues until you hit $s_3$. The next tiny step gives an increase of $3epsilon$, and a decrease of $4epsilon$, for a net decrease of $epsilon$.



              This continues until you hit $s_4$. The next little step brings a total increase of $4epsilon$, and a total decrease of $3epsilon$, for an increase of $epsilon$. Things get even worse when you travel further to the right. So the minimum sum of distances is reached at $s_4$, the median.



              The situation is quite similar if $k$ is even, say $k=6$. As you travel to the right, there is a net decrease at every step, until you hit $s_3$. When you are between $s_3$ and $s_4$, a tiny step of $epsilon$ increases your distance from each of $s_1$, $s_2$, and $s_3$ by $epsilon$. But it decreases your distance from each of the three others, for no net gain. Thus any $x$ in the interval from $s_3$ to $s_4$, including the endpoints, minimizes the sum of the distances. In the even case, I prefer to say that any point between the two "middle" points is a median. So the conclusion is that the points that minimize the sum are the medians. But some people prefer to define the median in the even case to be the average of the two "middle" points. Then the median does minimize the sum of the distances, but some other points also do.






              share|cite|improve this answer

























                up vote
                57
                down vote



                accepted







                up vote
                57
                down vote



                accepted






                Introduction: The solution below is essentially the same as the solution given by Brian M. Scott, but it will take a lot longer. You are expected to assume that $S$ is a finite set. with say $k$ elements. Line them up in order, as $s_1<s_2<cdots <s_k$.



                The situation is a little different when $k$ is odd than when $k$ is even. In particular, if $k$ is even there are (depending on the exact definition of median) many medians. We tell the story first for $k$ odd.

                Recall that $|x-s_i|$ is the distance between $x$ and $s_i$, so we are trying to minimize the sum of the distances. For example, we have $k$ people who live at various points on the $x$-axis. We want to find the point(s) $x$ such that the sum of the travel distances of the $k$ people to $x$ is a minimum.



                The story: Imagine that the $s_i$ are points on the $x$-axis. For clarity, take $k=7$. Start from well to the left of all the $s_i$, and take a tiny step, say of length $epsilon$, to the right. Then you have gotten $epsilon$ closer to every one of the $s_i$, so the sum of the distances has decreased by $7epsilon$.



                Keep taking tiny steps to the right, each time getting a decrease of $7epsilon$. This continues until you hit $s_1$. If you now take a tiny step to the right, then your distance from $s_1$ increases by $epsilon$, and your distance from each of the remaining $s_i$ decreases by $epsilon$. What has happened to the sum of the distances? There is a decrease of $6epsilon$, and an increase of $epsilon$, for a net decrease of $5epsilon$ in the sum.



                This continues until you hit $s_2$. Now, when you take a tiny step to the right, your distance from each of $s_1$ and $s_2$ increases by $epsilon$, and your distance from each of the five others decreases by $epsilon$, for a

                net decrease of $3epsilon$.



                This continues until you hit $s_3$. The next tiny step gives an increase of $3epsilon$, and a decrease of $4epsilon$, for a net decrease of $epsilon$.



                This continues until you hit $s_4$. The next little step brings a total increase of $4epsilon$, and a total decrease of $3epsilon$, for an increase of $epsilon$. Things get even worse when you travel further to the right. So the minimum sum of distances is reached at $s_4$, the median.



                The situation is quite similar if $k$ is even, say $k=6$. As you travel to the right, there is a net decrease at every step, until you hit $s_3$. When you are between $s_3$ and $s_4$, a tiny step of $epsilon$ increases your distance from each of $s_1$, $s_2$, and $s_3$ by $epsilon$. But it decreases your distance from each of the three others, for no net gain. Thus any $x$ in the interval from $s_3$ to $s_4$, including the endpoints, minimizes the sum of the distances. In the even case, I prefer to say that any point between the two "middle" points is a median. So the conclusion is that the points that minimize the sum are the medians. But some people prefer to define the median in the even case to be the average of the two "middle" points. Then the median does minimize the sum of the distances, but some other points also do.






                share|cite|improve this answer














                Introduction: The solution below is essentially the same as the solution given by Brian M. Scott, but it will take a lot longer. You are expected to assume that $S$ is a finite set. with say $k$ elements. Line them up in order, as $s_1<s_2<cdots <s_k$.



                The situation is a little different when $k$ is odd than when $k$ is even. In particular, if $k$ is even there are (depending on the exact definition of median) many medians. We tell the story first for $k$ odd.

                Recall that $|x-s_i|$ is the distance between $x$ and $s_i$, so we are trying to minimize the sum of the distances. For example, we have $k$ people who live at various points on the $x$-axis. We want to find the point(s) $x$ such that the sum of the travel distances of the $k$ people to $x$ is a minimum.



                The story: Imagine that the $s_i$ are points on the $x$-axis. For clarity, take $k=7$. Start from well to the left of all the $s_i$, and take a tiny step, say of length $epsilon$, to the right. Then you have gotten $epsilon$ closer to every one of the $s_i$, so the sum of the distances has decreased by $7epsilon$.



                Keep taking tiny steps to the right, each time getting a decrease of $7epsilon$. This continues until you hit $s_1$. If you now take a tiny step to the right, then your distance from $s_1$ increases by $epsilon$, and your distance from each of the remaining $s_i$ decreases by $epsilon$. What has happened to the sum of the distances? There is a decrease of $6epsilon$, and an increase of $epsilon$, for a net decrease of $5epsilon$ in the sum.



                This continues until you hit $s_2$. Now, when you take a tiny step to the right, your distance from each of $s_1$ and $s_2$ increases by $epsilon$, and your distance from each of the five others decreases by $epsilon$, for a

                net decrease of $3epsilon$.



                This continues until you hit $s_3$. The next tiny step gives an increase of $3epsilon$, and a decrease of $4epsilon$, for a net decrease of $epsilon$.



                This continues until you hit $s_4$. The next little step brings a total increase of $4epsilon$, and a total decrease of $3epsilon$, for an increase of $epsilon$. Things get even worse when you travel further to the right. So the minimum sum of distances is reached at $s_4$, the median.



                The situation is quite similar if $k$ is even, say $k=6$. As you travel to the right, there is a net decrease at every step, until you hit $s_3$. When you are between $s_3$ and $s_4$, a tiny step of $epsilon$ increases your distance from each of $s_1$, $s_2$, and $s_3$ by $epsilon$. But it decreases your distance from each of the three others, for no net gain. Thus any $x$ in the interval from $s_3$ to $s_4$, including the endpoints, minimizes the sum of the distances. In the even case, I prefer to say that any point between the two "middle" points is a median. So the conclusion is that the points that minimize the sum are the medians. But some people prefer to define the median in the even case to be the average of the two "middle" points. Then the median does minimize the sum of the distances, but some other points also do.







                share|cite|improve this answer














                share|cite|improve this answer



                share|cite|improve this answer








                edited Feb 25 '12 at 22:32









                Michael Hardy

                1




                1










                answered Feb 25 '12 at 20:37









                André Nicolas

                450k36421805




                450k36421805






















                    up vote
                    37
                    down vote













                    We're basically after:
                    $$ arg min_{x} sum_{i = 1}^{N} left| {s}_{i} - x right| $$



                    One should notice that $ frac{mathrm{d} left | x right | }{mathrm{d} x} = operatorname{sign} left( x right) $ (Being more rigorous would say it is a Sub Gradient of the non smooth $ {L}_{1} $ Norm function).

                    Hence, deriving the sum above yields $ sum_{i = 1}^{N} operatorname{sign} left( {s}_{i} - x right) $.

                    This equals to zero only when the number of positive items equals the number of negative which happens when $ x = operatorname{median} left{ {s}_{1}, {s}_{2}, cdots, {s}_{N} right} $.



                    One should notice that the median of a discrete group is not uniquely defined.

                    Moreover, it is not necessarily an item within the group.






                    share|cite|improve this answer



















                    • 3




                      Using derivatives here is overkill; the problem can be done by more elementary methods. $qquad$
                      – Michael Hardy
                      Oct 28 '16 at 17:06















                    up vote
                    37
                    down vote













                    We're basically after:
                    $$ arg min_{x} sum_{i = 1}^{N} left| {s}_{i} - x right| $$



                    One should notice that $ frac{mathrm{d} left | x right | }{mathrm{d} x} = operatorname{sign} left( x right) $ (Being more rigorous would say it is a Sub Gradient of the non smooth $ {L}_{1} $ Norm function).

                    Hence, deriving the sum above yields $ sum_{i = 1}^{N} operatorname{sign} left( {s}_{i} - x right) $.

                    This equals to zero only when the number of positive items equals the number of negative which happens when $ x = operatorname{median} left{ {s}_{1}, {s}_{2}, cdots, {s}_{N} right} $.



                    One should notice that the median of a discrete group is not uniquely defined.

                    Moreover, it is not necessarily an item within the group.






                    share|cite|improve this answer



















                    • 3




                      Using derivatives here is overkill; the problem can be done by more elementary methods. $qquad$
                      – Michael Hardy
                      Oct 28 '16 at 17:06













                    up vote
                    37
                    down vote










                    up vote
                    37
                    down vote









                    We're basically after:
                    $$ arg min_{x} sum_{i = 1}^{N} left| {s}_{i} - x right| $$



                    One should notice that $ frac{mathrm{d} left | x right | }{mathrm{d} x} = operatorname{sign} left( x right) $ (Being more rigorous would say it is a Sub Gradient of the non smooth $ {L}_{1} $ Norm function).

                    Hence, deriving the sum above yields $ sum_{i = 1}^{N} operatorname{sign} left( {s}_{i} - x right) $.

                    This equals to zero only when the number of positive items equals the number of negative which happens when $ x = operatorname{median} left{ {s}_{1}, {s}_{2}, cdots, {s}_{N} right} $.



                    One should notice that the median of a discrete group is not uniquely defined.

                    Moreover, it is not necessarily an item within the group.






                    share|cite|improve this answer














                    We're basically after:
                    $$ arg min_{x} sum_{i = 1}^{N} left| {s}_{i} - x right| $$



                    One should notice that $ frac{mathrm{d} left | x right | }{mathrm{d} x} = operatorname{sign} left( x right) $ (Being more rigorous would say it is a Sub Gradient of the non smooth $ {L}_{1} $ Norm function).

                    Hence, deriving the sum above yields $ sum_{i = 1}^{N} operatorname{sign} left( {s}_{i} - x right) $.

                    This equals to zero only when the number of positive items equals the number of negative which happens when $ x = operatorname{median} left{ {s}_{1}, {s}_{2}, cdots, {s}_{N} right} $.



                    One should notice that the median of a discrete group is not uniquely defined.

                    Moreover, it is not necessarily an item within the group.







                    share|cite|improve this answer














                    share|cite|improve this answer



                    share|cite|improve this answer








                    edited Jul 31 at 5:49

























                    answered Nov 16 '14 at 16:45









                    Royi

                    3,30012351




                    3,30012351








                    • 3




                      Using derivatives here is overkill; the problem can be done by more elementary methods. $qquad$
                      – Michael Hardy
                      Oct 28 '16 at 17:06














                    • 3




                      Using derivatives here is overkill; the problem can be done by more elementary methods. $qquad$
                      – Michael Hardy
                      Oct 28 '16 at 17:06








                    3




                    3




                    Using derivatives here is overkill; the problem can be done by more elementary methods. $qquad$
                    – Michael Hardy
                    Oct 28 '16 at 17:06




                    Using derivatives here is overkill; the problem can be done by more elementary methods. $qquad$
                    – Michael Hardy
                    Oct 28 '16 at 17:06










                    up vote
                    29
                    down vote













                    Suppose that the set $S$ has $n$ elements, $s_1<s_2<dots<s_n$. If $x<s_1$, then $$f(x)=sum_{sin S}|s-x|=sum_{sin S}(s-x)=sum_{k=1}^n(s_k-x);.tag{1}$$ As $x$ increases, each term of $(1)$ decreases until $x$ reaches $s_1$, therefore $f(s_1)<f(x)$ for all $x<s_1$.



                    Now suppose that $s_kle xle x+dle s_{k+1}$. Then



                    $$begin{align*}f(x+d)&=sum_{i=1}^kBig(x+d-s_iBig)+sum_{i=k+1}^nBig(s_i-(x+d)Big)\
                    &=dk+sum_{i=1}^k(x-s_i)-d(n-k)+sum_{i=k+1}^n(s_i-x)\
                    &=d(2k-n)+sum_{i=1}^k(x-s_i)+sum_{i=k+1}^n(s_i-x)\
                    &=d(2k-n)+f(x);,
                    end{align*}$$



                    so $f(x+d)-f(x)=d(2k-n)$. This is negative if $2k<n$, zero if $2k=n$, and positive if $2k>n$. Thus, on the interval $[s_k,s_{k+1}]$



                    $$f(x)text{ is }begin{cases}
                    text{decreasing},&text{if }2k<n\
                    text{constant},&text{if }2k=n\
                    text{increasing},&text{if }2k>n;.
                    end{cases}$$



                    From here it shouldn’t be too hard to show that $f(x)$ is minimal when $x$ is the median of $S$.






                    share|cite|improve this answer



















                    • 3




                      But a small typo. In "As x increases, each term of (1) decreases until x reaches x_1, so" You intended to say s_1 instead of x_1.
                      – Neo M Hacker
                      Sep 28 '17 at 3:23

















                    up vote
                    29
                    down vote













                    Suppose that the set $S$ has $n$ elements, $s_1<s_2<dots<s_n$. If $x<s_1$, then $$f(x)=sum_{sin S}|s-x|=sum_{sin S}(s-x)=sum_{k=1}^n(s_k-x);.tag{1}$$ As $x$ increases, each term of $(1)$ decreases until $x$ reaches $s_1$, therefore $f(s_1)<f(x)$ for all $x<s_1$.



                    Now suppose that $s_kle xle x+dle s_{k+1}$. Then



                    $$begin{align*}f(x+d)&=sum_{i=1}^kBig(x+d-s_iBig)+sum_{i=k+1}^nBig(s_i-(x+d)Big)\
                    &=dk+sum_{i=1}^k(x-s_i)-d(n-k)+sum_{i=k+1}^n(s_i-x)\
                    &=d(2k-n)+sum_{i=1}^k(x-s_i)+sum_{i=k+1}^n(s_i-x)\
                    &=d(2k-n)+f(x);,
                    end{align*}$$



                    so $f(x+d)-f(x)=d(2k-n)$. This is negative if $2k<n$, zero if $2k=n$, and positive if $2k>n$. Thus, on the interval $[s_k,s_{k+1}]$



                    $$f(x)text{ is }begin{cases}
                    text{decreasing},&text{if }2k<n\
                    text{constant},&text{if }2k=n\
                    text{increasing},&text{if }2k>n;.
                    end{cases}$$



                    From here it shouldn’t be too hard to show that $f(x)$ is minimal when $x$ is the median of $S$.






                    share|cite|improve this answer



















                    • 3




                      But a small typo. In "As x increases, each term of (1) decreases until x reaches x_1, so" You intended to say s_1 instead of x_1.
                      – Neo M Hacker
                      Sep 28 '17 at 3:23















                    up vote
                    29
                    down vote










                    up vote
                    29
                    down vote









                    Suppose that the set $S$ has $n$ elements, $s_1<s_2<dots<s_n$. If $x<s_1$, then $$f(x)=sum_{sin S}|s-x|=sum_{sin S}(s-x)=sum_{k=1}^n(s_k-x);.tag{1}$$ As $x$ increases, each term of $(1)$ decreases until $x$ reaches $s_1$, therefore $f(s_1)<f(x)$ for all $x<s_1$.



                    Now suppose that $s_kle xle x+dle s_{k+1}$. Then



                    $$begin{align*}f(x+d)&=sum_{i=1}^kBig(x+d-s_iBig)+sum_{i=k+1}^nBig(s_i-(x+d)Big)\
                    &=dk+sum_{i=1}^k(x-s_i)-d(n-k)+sum_{i=k+1}^n(s_i-x)\
                    &=d(2k-n)+sum_{i=1}^k(x-s_i)+sum_{i=k+1}^n(s_i-x)\
                    &=d(2k-n)+f(x);,
                    end{align*}$$



                    so $f(x+d)-f(x)=d(2k-n)$. This is negative if $2k<n$, zero if $2k=n$, and positive if $2k>n$. Thus, on the interval $[s_k,s_{k+1}]$



                    $$f(x)text{ is }begin{cases}
                    text{decreasing},&text{if }2k<n\
                    text{constant},&text{if }2k=n\
                    text{increasing},&text{if }2k>n;.
                    end{cases}$$



                    From here it shouldn’t be too hard to show that $f(x)$ is minimal when $x$ is the median of $S$.






                    share|cite|improve this answer














                    Suppose that the set $S$ has $n$ elements, $s_1<s_2<dots<s_n$. If $x<s_1$, then $$f(x)=sum_{sin S}|s-x|=sum_{sin S}(s-x)=sum_{k=1}^n(s_k-x);.tag{1}$$ As $x$ increases, each term of $(1)$ decreases until $x$ reaches $s_1$, therefore $f(s_1)<f(x)$ for all $x<s_1$.



                    Now suppose that $s_kle xle x+dle s_{k+1}$. Then



                    $$begin{align*}f(x+d)&=sum_{i=1}^kBig(x+d-s_iBig)+sum_{i=k+1}^nBig(s_i-(x+d)Big)\
                    &=dk+sum_{i=1}^k(x-s_i)-d(n-k)+sum_{i=k+1}^n(s_i-x)\
                    &=d(2k-n)+sum_{i=1}^k(x-s_i)+sum_{i=k+1}^n(s_i-x)\
                    &=d(2k-n)+f(x);,
                    end{align*}$$



                    so $f(x+d)-f(x)=d(2k-n)$. This is negative if $2k<n$, zero if $2k=n$, and positive if $2k>n$. Thus, on the interval $[s_k,s_{k+1}]$



                    $$f(x)text{ is }begin{cases}
                    text{decreasing},&text{if }2k<n\
                    text{constant},&text{if }2k=n\
                    text{increasing},&text{if }2k>n;.
                    end{cases}$$



                    From here it shouldn’t be too hard to show that $f(x)$ is minimal when $x$ is the median of $S$.







                    share|cite|improve this answer














                    share|cite|improve this answer



                    share|cite|improve this answer








                    edited Nov 21 at 12:01









                    pgmank

                    1034




                    1034










                    answered Feb 25 '12 at 17:22









                    Brian M. Scott

                    454k38505906




                    454k38505906








                    • 3




                      But a small typo. In "As x increases, each term of (1) decreases until x reaches x_1, so" You intended to say s_1 instead of x_1.
                      – Neo M Hacker
                      Sep 28 '17 at 3:23
















                    • 3




                      But a small typo. In "As x increases, each term of (1) decreases until x reaches x_1, so" You intended to say s_1 instead of x_1.
                      – Neo M Hacker
                      Sep 28 '17 at 3:23










                    3




                    3




                    But a small typo. In "As x increases, each term of (1) decreases until x reaches x_1, so" You intended to say s_1 instead of x_1.
                    – Neo M Hacker
                    Sep 28 '17 at 3:23






                    But a small typo. In "As x increases, each term of (1) decreases until x reaches x_1, so" You intended to say s_1 instead of x_1.
                    – Neo M Hacker
                    Sep 28 '17 at 3:23












                    up vote
                    10
                    down vote













                    You want the median of $n$ numbers. Say $x$ is bigger than $12$ of them and smaller than $8$ of them (so $n=20$). If $x$ increases, it's getting closer to $8$ of the numbers and farther from $12$ of them, so the sum of the distances gets greater. And if $x$ decreases, it's getting closer to $12$ of them and farther from $8$ of them, so the sum of the distances gets smaller.



                    A similar thing happens if $x$ is smaller than more of the $n$ numbers than $x$ is bigger than.



                    But if $x$ is smaller than $10$ of them and bigger than $10$ of them, then when $x$ moves, it's getting farther from $10$ of them and closer to just as many of them, so the sum of the distances is not changing.



                    So the sum of the distances is smallest when the number of data points less than $x$ is the same as the number of data points bigger than $x$.






                    share|cite|improve this answer



























                      up vote
                      10
                      down vote













                      You want the median of $n$ numbers. Say $x$ is bigger than $12$ of them and smaller than $8$ of them (so $n=20$). If $x$ increases, it's getting closer to $8$ of the numbers and farther from $12$ of them, so the sum of the distances gets greater. And if $x$ decreases, it's getting closer to $12$ of them and farther from $8$ of them, so the sum of the distances gets smaller.



                      A similar thing happens if $x$ is smaller than more of the $n$ numbers than $x$ is bigger than.



                      But if $x$ is smaller than $10$ of them and bigger than $10$ of them, then when $x$ moves, it's getting farther from $10$ of them and closer to just as many of them, so the sum of the distances is not changing.



                      So the sum of the distances is smallest when the number of data points less than $x$ is the same as the number of data points bigger than $x$.






                      share|cite|improve this answer

























                        up vote
                        10
                        down vote










                        up vote
                        10
                        down vote









                        You want the median of $n$ numbers. Say $x$ is bigger than $12$ of them and smaller than $8$ of them (so $n=20$). If $x$ increases, it's getting closer to $8$ of the numbers and farther from $12$ of them, so the sum of the distances gets greater. And if $x$ decreases, it's getting closer to $12$ of them and farther from $8$ of them, so the sum of the distances gets smaller.



                        A similar thing happens if $x$ is smaller than more of the $n$ numbers than $x$ is bigger than.



                        But if $x$ is smaller than $10$ of them and bigger than $10$ of them, then when $x$ moves, it's getting farther from $10$ of them and closer to just as many of them, so the sum of the distances is not changing.



                        So the sum of the distances is smallest when the number of data points less than $x$ is the same as the number of data points bigger than $x$.






                        share|cite|improve this answer














                        You want the median of $n$ numbers. Say $x$ is bigger than $12$ of them and smaller than $8$ of them (so $n=20$). If $x$ increases, it's getting closer to $8$ of the numbers and farther from $12$ of them, so the sum of the distances gets greater. And if $x$ decreases, it's getting closer to $12$ of them and farther from $8$ of them, so the sum of the distances gets smaller.



                        A similar thing happens if $x$ is smaller than more of the $n$ numbers than $x$ is bigger than.



                        But if $x$ is smaller than $10$ of them and bigger than $10$ of them, then when $x$ moves, it's getting farther from $10$ of them and closer to just as many of them, so the sum of the distances is not changing.



                        So the sum of the distances is smallest when the number of data points less than $x$ is the same as the number of data points bigger than $x$.







                        share|cite|improve this answer














                        share|cite|improve this answer



                        share|cite|improve this answer








                        edited Apr 12 '15 at 6:45









                        Lord_Farin

                        15.5k636108




                        15.5k636108










                        answered Feb 25 '12 at 22:37









                        Michael Hardy

                        1




                        1






















                            up vote
                            6
                            down vote













                            Starting with $$f(x)=sum_{i=1}^n |s_i-x|$$



                            Assume we rearranged our terms such that $s_1<s_2<cdots<s_n$



                            We first proceed by making the following observation $$sum_{i=1}^n |s_i-x| = sum_{i=2}^{n-1} |s_i-x| +(s_n -s_1) quad text{when} quad x in [s_1,s_n]$$



                            Now suppose that $n$ is odd, then by applying the above identity repeatedly we get $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n+1}2}-x|+(s_n -s_1)+(s_{n-1}-s_2)+cdots+(s_{frac{n+3}2}-s_{frac{n-1}2})$$ or in other words $$f(x)=|s_{frac{n+1}{2}}-x|+text{constant}$$



                            This is just the absolute value function with its vertex being at $(s_{frac{n+1}{2}},text{constant})$, the minimum of the absolute value function occurs at its vertex, therefore $s_{frac{n+1}{2}}$(median) minmizes $f(x)$.



                            Now suppose $n$ is even, again by using our identity, we have $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n}2}-x|+|s_{frac{n+2}2}-x| + text{constant}$$



                            Where the minimum occurs at $f'(x)=0$(or when not defined), therefore by differentiating and setting $f'(x)$ to zero we get $$dfrac{|s_{frac{n}{2}}-x|}{s_{frac{n}{2}}-x}+dfrac{|s_{frac{n+2}{2}}-x|}{s_{frac{n+2}{2}}-x}=0$$



                            Observe that $s:=dfrac{s_{frac{n+2}{2}}+s_{frac{n}{2}}}{2}$(median) satisfies the above equation, since $s$ is halfway between $s_{frac{n}{2}}$ and $s_{frac{n+2}{2}}$ $$s_{frac{n}{2}}-s=-(s_{frac{n+2}{2}}-s)$$ that is by setting $x=s$ we get $$dfrac{|s_{frac{n}{2}}-s|}{s_{frac{n}{2}}-s}+dfrac{|s_{frac{n}{2}}-s|}{-(s_{frac{n}{2}}-s)}=0$$



                            Therefore $s$ is a minimum.






                            share|cite|improve this answer























                            • I think some theory about minimum being where $f'(x) = 0$ for non differentiable functions is needed here
                              – Guerlando OCs
                              Aug 27 at 0:56

















                            up vote
                            6
                            down vote













                            Starting with $$f(x)=sum_{i=1}^n |s_i-x|$$



                            Assume we rearranged our terms such that $s_1<s_2<cdots<s_n$



                            We first proceed by making the following observation $$sum_{i=1}^n |s_i-x| = sum_{i=2}^{n-1} |s_i-x| +(s_n -s_1) quad text{when} quad x in [s_1,s_n]$$



                            Now suppose that $n$ is odd, then by applying the above identity repeatedly we get $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n+1}2}-x|+(s_n -s_1)+(s_{n-1}-s_2)+cdots+(s_{frac{n+3}2}-s_{frac{n-1}2})$$ or in other words $$f(x)=|s_{frac{n+1}{2}}-x|+text{constant}$$



                            This is just the absolute value function with its vertex being at $(s_{frac{n+1}{2}},text{constant})$, the minimum of the absolute value function occurs at its vertex, therefore $s_{frac{n+1}{2}}$(median) minmizes $f(x)$.



                            Now suppose $n$ is even, again by using our identity, we have $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n}2}-x|+|s_{frac{n+2}2}-x| + text{constant}$$



                            Where the minimum occurs at $f'(x)=0$(or when not defined), therefore by differentiating and setting $f'(x)$ to zero we get $$dfrac{|s_{frac{n}{2}}-x|}{s_{frac{n}{2}}-x}+dfrac{|s_{frac{n+2}{2}}-x|}{s_{frac{n+2}{2}}-x}=0$$



                            Observe that $s:=dfrac{s_{frac{n+2}{2}}+s_{frac{n}{2}}}{2}$(median) satisfies the above equation, since $s$ is halfway between $s_{frac{n}{2}}$ and $s_{frac{n+2}{2}}$ $$s_{frac{n}{2}}-s=-(s_{frac{n+2}{2}}-s)$$ that is by setting $x=s$ we get $$dfrac{|s_{frac{n}{2}}-s|}{s_{frac{n}{2}}-s}+dfrac{|s_{frac{n}{2}}-s|}{-(s_{frac{n}{2}}-s)}=0$$



                            Therefore $s$ is a minimum.






                            share|cite|improve this answer























                            • I think some theory about minimum being where $f'(x) = 0$ for non differentiable functions is needed here
                              – Guerlando OCs
                              Aug 27 at 0:56















                            up vote
                            6
                            down vote










                            up vote
                            6
                            down vote









                            Starting with $$f(x)=sum_{i=1}^n |s_i-x|$$



                            Assume we rearranged our terms such that $s_1<s_2<cdots<s_n$



                            We first proceed by making the following observation $$sum_{i=1}^n |s_i-x| = sum_{i=2}^{n-1} |s_i-x| +(s_n -s_1) quad text{when} quad x in [s_1,s_n]$$



                            Now suppose that $n$ is odd, then by applying the above identity repeatedly we get $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n+1}2}-x|+(s_n -s_1)+(s_{n-1}-s_2)+cdots+(s_{frac{n+3}2}-s_{frac{n-1}2})$$ or in other words $$f(x)=|s_{frac{n+1}{2}}-x|+text{constant}$$



                            This is just the absolute value function with its vertex being at $(s_{frac{n+1}{2}},text{constant})$, the minimum of the absolute value function occurs at its vertex, therefore $s_{frac{n+1}{2}}$(median) minmizes $f(x)$.



                            Now suppose $n$ is even, again by using our identity, we have $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n}2}-x|+|s_{frac{n+2}2}-x| + text{constant}$$



                            Where the minimum occurs at $f'(x)=0$(or when not defined), therefore by differentiating and setting $f'(x)$ to zero we get $$dfrac{|s_{frac{n}{2}}-x|}{s_{frac{n}{2}}-x}+dfrac{|s_{frac{n+2}{2}}-x|}{s_{frac{n+2}{2}}-x}=0$$



                            Observe that $s:=dfrac{s_{frac{n+2}{2}}+s_{frac{n}{2}}}{2}$(median) satisfies the above equation, since $s$ is halfway between $s_{frac{n}{2}}$ and $s_{frac{n+2}{2}}$ $$s_{frac{n}{2}}-s=-(s_{frac{n+2}{2}}-s)$$ that is by setting $x=s$ we get $$dfrac{|s_{frac{n}{2}}-s|}{s_{frac{n}{2}}-s}+dfrac{|s_{frac{n}{2}}-s|}{-(s_{frac{n}{2}}-s)}=0$$



                            Therefore $s$ is a minimum.






                            share|cite|improve this answer














                            Starting with $$f(x)=sum_{i=1}^n |s_i-x|$$



                            Assume we rearranged our terms such that $s_1<s_2<cdots<s_n$



                            We first proceed by making the following observation $$sum_{i=1}^n |s_i-x| = sum_{i=2}^{n-1} |s_i-x| +(s_n -s_1) quad text{when} quad x in [s_1,s_n]$$



                            Now suppose that $n$ is odd, then by applying the above identity repeatedly we get $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n+1}2}-x|+(s_n -s_1)+(s_{n-1}-s_2)+cdots+(s_{frac{n+3}2}-s_{frac{n-1}2})$$ or in other words $$f(x)=|s_{frac{n+1}{2}}-x|+text{constant}$$



                            This is just the absolute value function with its vertex being at $(s_{frac{n+1}{2}},text{constant})$, the minimum of the absolute value function occurs at its vertex, therefore $s_{frac{n+1}{2}}$(median) minmizes $f(x)$.



                            Now suppose $n$ is even, again by using our identity, we have $$f(x)=sum_{i=1}^n |s_i-x|=|s_{frac{n}2}-x|+|s_{frac{n+2}2}-x| + text{constant}$$



                            Where the minimum occurs at $f'(x)=0$(or when not defined), therefore by differentiating and setting $f'(x)$ to zero we get $$dfrac{|s_{frac{n}{2}}-x|}{s_{frac{n}{2}}-x}+dfrac{|s_{frac{n+2}{2}}-x|}{s_{frac{n+2}{2}}-x}=0$$



                            Observe that $s:=dfrac{s_{frac{n+2}{2}}+s_{frac{n}{2}}}{2}$(median) satisfies the above equation, since $s$ is halfway between $s_{frac{n}{2}}$ and $s_{frac{n+2}{2}}$ $$s_{frac{n}{2}}-s=-(s_{frac{n+2}{2}}-s)$$ that is by setting $x=s$ we get $$dfrac{|s_{frac{n}{2}}-s|}{s_{frac{n}{2}}-s}+dfrac{|s_{frac{n}{2}}-s|}{-(s_{frac{n}{2}}-s)}=0$$



                            Therefore $s$ is a minimum.







                            share|cite|improve this answer














                            share|cite|improve this answer



                            share|cite|improve this answer








                            edited Mar 25 '17 at 1:16









                            Michael Hardy

                            1




                            1










                            answered Jul 17 '16 at 22:01









                            Omar Nagib

                            603514




                            603514












                            • I think some theory about minimum being where $f'(x) = 0$ for non differentiable functions is needed here
                              – Guerlando OCs
                              Aug 27 at 0:56




















                            • I think some theory about minimum being where $f'(x) = 0$ for non differentiable functions is needed here
                              – Guerlando OCs
                              Aug 27 at 0:56


















                            I think some theory about minimum being where $f'(x) = 0$ for non differentiable functions is needed here
                            – Guerlando OCs
                            Aug 27 at 0:56






                            I think some theory about minimum being where $f'(x) = 0$ for non differentiable functions is needed here
                            – Guerlando OCs
                            Aug 27 at 0:56












                            up vote
                            3
                            down vote













                            Consider two $x_i$'s $x_1$ and $x_2$,



                            For $x_1leq aleq x_2$, $sum_{i=1}^{2}|x_i-a|=|x_1-a|+|x_2-a|=a-x_1+x_2-a=x_2-x_1$



                            For $alt x_1$, $sum_{i=1}^{2}|x_i-a|=x_1-a+x_2-a=x_1+x_2-2agt x_1+x_2-2x_1=x_2-x_1$



                            For $agt x_2$,$sum_{i=1}^{2}|x_i-a|=-x_1+a-x_2+a=-x_1-x_2+2agt -x_1-x_2+2x_2=x_2-x_1$



                            $implies$for any two $x_i$'s the sum of the absolute values of the deviations is minimum when $x_1leq aleq x_2$ or $ain[x_1,x_2]$.



                            When $n$ is odd,
                            $$
                            sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n-1}{2}}-a|+|x_{tfrac{n+1}{2}}-a|+|x_{tfrac{n+3}{2}}-a|+....+|x_{n-1}-a|+|x_n-a|
                            $$
                            consider the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$. If $a$ is a member of all these intervals. i.e,$[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$,



                            using the above theorem, we can say that all the terms in the sum except $|x_{tfrac{n+1}{2}}-a|$ are minimized. So
                            $$
                            sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n+3}{2}}-x_{tfrac{n-1}{2}})+|x_{tfrac{n+1}{2}}-a|=|x_{tfrac{n+1}{2}}-a|+text{costant}
                            $$
                            Now since the derivative of modulus function is signum function, $f'(a)=text{sgn}(x_{tfrac{n+1}{2}}-a)=0$ for $a=x_{tfrac{n+1}{2}}=text{Median}$



                            $implies$ When $n$ is odd,the median minimizes the sum of absolute values of the deviations.



                            When $n$ is even,
                            $$
                            sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n}{2}}-a|+|x_{tfrac{n}{2}+1}-a|+....+|x_{n-1}-a|+|x_n-a|\
                            $$
                            If $a$ is a member of all the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, $ain[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$,



                            $$
                            sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n}{2}+1}-x_{tfrac{n}{2}})
                            $$



                            $implies$ When $n$ is even, any number in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, including the median, minimizes the sum of absolute values of the deviations. For example consider the series:$2, 4, 5, 10$, median, $M=4.5$.



                            $$
                            sum_{i=1}^4|x_i-M|=2.5+0.5+0.5+5.5=9
                            $$
                            If you take any other value in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 4.1
                            $$
                            sum_{i=1}^4|x_i-4.1|=2.1+0.1+0.9+5.9=9
                            $$
                            For any value outside the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 5.2
                            $$
                            sum_{i=1}^4|x_i-5.2|=3.2+1.2+0.2+4.8=9.4
                            $$






                            share|cite|improve this answer



























                              up vote
                              3
                              down vote













                              Consider two $x_i$'s $x_1$ and $x_2$,



                              For $x_1leq aleq x_2$, $sum_{i=1}^{2}|x_i-a|=|x_1-a|+|x_2-a|=a-x_1+x_2-a=x_2-x_1$



                              For $alt x_1$, $sum_{i=1}^{2}|x_i-a|=x_1-a+x_2-a=x_1+x_2-2agt x_1+x_2-2x_1=x_2-x_1$



                              For $agt x_2$,$sum_{i=1}^{2}|x_i-a|=-x_1+a-x_2+a=-x_1-x_2+2agt -x_1-x_2+2x_2=x_2-x_1$



                              $implies$for any two $x_i$'s the sum of the absolute values of the deviations is minimum when $x_1leq aleq x_2$ or $ain[x_1,x_2]$.



                              When $n$ is odd,
                              $$
                              sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n-1}{2}}-a|+|x_{tfrac{n+1}{2}}-a|+|x_{tfrac{n+3}{2}}-a|+....+|x_{n-1}-a|+|x_n-a|
                              $$
                              consider the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$. If $a$ is a member of all these intervals. i.e,$[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$,



                              using the above theorem, we can say that all the terms in the sum except $|x_{tfrac{n+1}{2}}-a|$ are minimized. So
                              $$
                              sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n+3}{2}}-x_{tfrac{n-1}{2}})+|x_{tfrac{n+1}{2}}-a|=|x_{tfrac{n+1}{2}}-a|+text{costant}
                              $$
                              Now since the derivative of modulus function is signum function, $f'(a)=text{sgn}(x_{tfrac{n+1}{2}}-a)=0$ for $a=x_{tfrac{n+1}{2}}=text{Median}$



                              $implies$ When $n$ is odd,the median minimizes the sum of absolute values of the deviations.



                              When $n$ is even,
                              $$
                              sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n}{2}}-a|+|x_{tfrac{n}{2}+1}-a|+....+|x_{n-1}-a|+|x_n-a|\
                              $$
                              If $a$ is a member of all the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, $ain[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$,



                              $$
                              sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n}{2}+1}-x_{tfrac{n}{2}})
                              $$



                              $implies$ When $n$ is even, any number in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, including the median, minimizes the sum of absolute values of the deviations. For example consider the series:$2, 4, 5, 10$, median, $M=4.5$.



                              $$
                              sum_{i=1}^4|x_i-M|=2.5+0.5+0.5+5.5=9
                              $$
                              If you take any other value in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 4.1
                              $$
                              sum_{i=1}^4|x_i-4.1|=2.1+0.1+0.9+5.9=9
                              $$
                              For any value outside the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 5.2
                              $$
                              sum_{i=1}^4|x_i-5.2|=3.2+1.2+0.2+4.8=9.4
                              $$






                              share|cite|improve this answer

























                                up vote
                                3
                                down vote










                                up vote
                                3
                                down vote









                                Consider two $x_i$'s $x_1$ and $x_2$,



                                For $x_1leq aleq x_2$, $sum_{i=1}^{2}|x_i-a|=|x_1-a|+|x_2-a|=a-x_1+x_2-a=x_2-x_1$



                                For $alt x_1$, $sum_{i=1}^{2}|x_i-a|=x_1-a+x_2-a=x_1+x_2-2agt x_1+x_2-2x_1=x_2-x_1$



                                For $agt x_2$,$sum_{i=1}^{2}|x_i-a|=-x_1+a-x_2+a=-x_1-x_2+2agt -x_1-x_2+2x_2=x_2-x_1$



                                $implies$for any two $x_i$'s the sum of the absolute values of the deviations is minimum when $x_1leq aleq x_2$ or $ain[x_1,x_2]$.



                                When $n$ is odd,
                                $$
                                sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n-1}{2}}-a|+|x_{tfrac{n+1}{2}}-a|+|x_{tfrac{n+3}{2}}-a|+....+|x_{n-1}-a|+|x_n-a|
                                $$
                                consider the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$. If $a$ is a member of all these intervals. i.e,$[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$,



                                using the above theorem, we can say that all the terms in the sum except $|x_{tfrac{n+1}{2}}-a|$ are minimized. So
                                $$
                                sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n+3}{2}}-x_{tfrac{n-1}{2}})+|x_{tfrac{n+1}{2}}-a|=|x_{tfrac{n+1}{2}}-a|+text{costant}
                                $$
                                Now since the derivative of modulus function is signum function, $f'(a)=text{sgn}(x_{tfrac{n+1}{2}}-a)=0$ for $a=x_{tfrac{n+1}{2}}=text{Median}$



                                $implies$ When $n$ is odd,the median minimizes the sum of absolute values of the deviations.



                                When $n$ is even,
                                $$
                                sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n}{2}}-a|+|x_{tfrac{n}{2}+1}-a|+....+|x_{n-1}-a|+|x_n-a|\
                                $$
                                If $a$ is a member of all the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, $ain[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$,



                                $$
                                sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n}{2}+1}-x_{tfrac{n}{2}})
                                $$



                                $implies$ When $n$ is even, any number in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, including the median, minimizes the sum of absolute values of the deviations. For example consider the series:$2, 4, 5, 10$, median, $M=4.5$.



                                $$
                                sum_{i=1}^4|x_i-M|=2.5+0.5+0.5+5.5=9
                                $$
                                If you take any other value in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 4.1
                                $$
                                sum_{i=1}^4|x_i-4.1|=2.1+0.1+0.9+5.9=9
                                $$
                                For any value outside the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 5.2
                                $$
                                sum_{i=1}^4|x_i-5.2|=3.2+1.2+0.2+4.8=9.4
                                $$






                                share|cite|improve this answer














                                Consider two $x_i$'s $x_1$ and $x_2$,



                                For $x_1leq aleq x_2$, $sum_{i=1}^{2}|x_i-a|=|x_1-a|+|x_2-a|=a-x_1+x_2-a=x_2-x_1$



                                For $alt x_1$, $sum_{i=1}^{2}|x_i-a|=x_1-a+x_2-a=x_1+x_2-2agt x_1+x_2-2x_1=x_2-x_1$



                                For $agt x_2$,$sum_{i=1}^{2}|x_i-a|=-x_1+a-x_2+a=-x_1-x_2+2agt -x_1-x_2+2x_2=x_2-x_1$



                                $implies$for any two $x_i$'s the sum of the absolute values of the deviations is minimum when $x_1leq aleq x_2$ or $ain[x_1,x_2]$.



                                When $n$ is odd,
                                $$
                                sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n-1}{2}}-a|+|x_{tfrac{n+1}{2}}-a|+|x_{tfrac{n+3}{2}}-a|+....+|x_{n-1}-a|+|x_n-a|
                                $$
                                consider the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$. If $a$ is a member of all these intervals. i.e,$[x_{tfrac{n-1}{2}},x_{tfrac{n+3}{2}}]$,



                                using the above theorem, we can say that all the terms in the sum except $|x_{tfrac{n+1}{2}}-a|$ are minimized. So
                                $$
                                sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n+3}{2}}-x_{tfrac{n-1}{2}})+|x_{tfrac{n+1}{2}}-a|=|x_{tfrac{n+1}{2}}-a|+text{costant}
                                $$
                                Now since the derivative of modulus function is signum function, $f'(a)=text{sgn}(x_{tfrac{n+1}{2}}-a)=0$ for $a=x_{tfrac{n+1}{2}}=text{Median}$



                                $implies$ When $n$ is odd,the median minimizes the sum of absolute values of the deviations.



                                When $n$ is even,
                                $$
                                sum_{i=1}^{n}|x_i-a|=|x_1-a|+|x_2-a|+....+|x_{tfrac{n}{2}}-a|+|x_{tfrac{n}{2}+1}-a|+....+|x_{n-1}-a|+|x_n-a|\
                                $$
                                If $a$ is a member of all the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}],....,[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, $ain[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$,



                                $$
                                sum_{i=1}^{n}|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+....+(x_{tfrac{n}{2}+1}-x_{tfrac{n}{2}})
                                $$



                                $implies$ When $n$ is even, any number in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]$, i.e, including the median, minimizes the sum of absolute values of the deviations. For example consider the series:$2, 4, 5, 10$, median, $M=4.5$.



                                $$
                                sum_{i=1}^4|x_i-M|=2.5+0.5+0.5+5.5=9
                                $$
                                If you take any other value in the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 4.1
                                $$
                                sum_{i=1}^4|x_i-4.1|=2.1+0.1+0.9+5.9=9
                                $$
                                For any value outside the interval $[x_{tfrac{n}{2}},x_{tfrac{n}{2}+1}]=[4,5]$, say 5.2
                                $$
                                sum_{i=1}^4|x_i-5.2|=3.2+1.2+0.2+4.8=9.4
                                $$







                                share|cite|improve this answer














                                share|cite|improve this answer



                                share|cite|improve this answer








                                edited Jul 20 '17 at 14:59

























                                answered Jul 20 '17 at 11:32









                                ss1729

                                1,753722




                                1,753722






















                                    up vote
                                    1
                                    down vote













                                    Suppose $S$ is finite (with cardinal $s$), without repetitions, and ordered. Then the sum of absolute values is continuous (sum of continuous functions), and piecewise linear (hence differentiable), with left-most slope $-s$. By induction, the slope increases by 2 for each interval from left to right, with right-most slope $+s$. Hence the piece-wise slope first reaches either $-1$ or $0$ at index $leftlfloor frac{s+1}{2}rightrfloor$, and $0$ or $+1$ at index $leftlceil frac{s+1}{2}rightrceil$.



                                    Hence the function attains its minima in the interval $left[leftlfloor frac{s+1}{2}rightrfloor, leftlceil frac{s+1}{2}rightrceilright]$, which reduces to a singleton when $s$ is odd.



                                    The notion of median for continuous functions is detailed in Sunny Garlang Noah, The Median of a Continuous Function, Real Anal. Exchange, 2007






                                    share|cite|improve this answer



























                                      up vote
                                      1
                                      down vote













                                      Suppose $S$ is finite (with cardinal $s$), without repetitions, and ordered. Then the sum of absolute values is continuous (sum of continuous functions), and piecewise linear (hence differentiable), with left-most slope $-s$. By induction, the slope increases by 2 for each interval from left to right, with right-most slope $+s$. Hence the piece-wise slope first reaches either $-1$ or $0$ at index $leftlfloor frac{s+1}{2}rightrfloor$, and $0$ or $+1$ at index $leftlceil frac{s+1}{2}rightrceil$.



                                      Hence the function attains its minima in the interval $left[leftlfloor frac{s+1}{2}rightrfloor, leftlceil frac{s+1}{2}rightrceilright]$, which reduces to a singleton when $s$ is odd.



                                      The notion of median for continuous functions is detailed in Sunny Garlang Noah, The Median of a Continuous Function, Real Anal. Exchange, 2007






                                      share|cite|improve this answer

























                                        up vote
                                        1
                                        down vote










                                        up vote
                                        1
                                        down vote









                                        Suppose $S$ is finite (with cardinal $s$), without repetitions, and ordered. Then the sum of absolute values is continuous (sum of continuous functions), and piecewise linear (hence differentiable), with left-most slope $-s$. By induction, the slope increases by 2 for each interval from left to right, with right-most slope $+s$. Hence the piece-wise slope first reaches either $-1$ or $0$ at index $leftlfloor frac{s+1}{2}rightrfloor$, and $0$ or $+1$ at index $leftlceil frac{s+1}{2}rightrceil$.



                                        Hence the function attains its minima in the interval $left[leftlfloor frac{s+1}{2}rightrfloor, leftlceil frac{s+1}{2}rightrceilright]$, which reduces to a singleton when $s$ is odd.



                                        The notion of median for continuous functions is detailed in Sunny Garlang Noah, The Median of a Continuous Function, Real Anal. Exchange, 2007






                                        share|cite|improve this answer














                                        Suppose $S$ is finite (with cardinal $s$), without repetitions, and ordered. Then the sum of absolute values is continuous (sum of continuous functions), and piecewise linear (hence differentiable), with left-most slope $-s$. By induction, the slope increases by 2 for each interval from left to right, with right-most slope $+s$. Hence the piece-wise slope first reaches either $-1$ or $0$ at index $leftlfloor frac{s+1}{2}rightrfloor$, and $0$ or $+1$ at index $leftlceil frac{s+1}{2}rightrceil$.



                                        Hence the function attains its minima in the interval $left[leftlfloor frac{s+1}{2}rightrfloor, leftlceil frac{s+1}{2}rightrceilright]$, which reduces to a singleton when $s$ is odd.



                                        The notion of median for continuous functions is detailed in Sunny Garlang Noah, The Median of a Continuous Function, Real Anal. Exchange, 2007







                                        share|cite|improve this answer














                                        share|cite|improve this answer



                                        share|cite|improve this answer








                                        edited Dec 1 '15 at 13:16

























                                        answered Dec 1 '15 at 11:42









                                        Laurent Duval

                                        5,26811239




                                        5,26811239






















                                            up vote
                                            1
                                            down vote













                                            Consider two real numbers $a<b$. Then the objective becomes



                                            $$dist(a,b) = |x-a|+|x-b|$$



                                            This expression is minimum when $aleq x leq b$. It can be proved by calculating the objective on 3 cases ($x<a, aleq xleq b, x>b$).



                                            Now consider the general case where $S$ has $n$ elements. Sort them in increasing order as $S_1, S_2, ldots, S_n$.



                                            Pair the smallest and the largest numbers. As explained above, $dist(S_1,S_n)$ is minimum when $S_1leq xleq S_n$. Remove these two elements from the list and continue this procedure until there is at most one element left in the set.



                                            If there is an element $S_i$ left, then $x=S_i$ minimizes $dist(x-S_i)$. It also lies between all the pairs.



                                            In the case of even elements, finally the sequence will be empty. As in the case above, median lies between all the pairs.






                                            share|cite|improve this answer



























                                              up vote
                                              1
                                              down vote













                                              Consider two real numbers $a<b$. Then the objective becomes



                                              $$dist(a,b) = |x-a|+|x-b|$$



                                              This expression is minimum when $aleq x leq b$. It can be proved by calculating the objective on 3 cases ($x<a, aleq xleq b, x>b$).



                                              Now consider the general case where $S$ has $n$ elements. Sort them in increasing order as $S_1, S_2, ldots, S_n$.



                                              Pair the smallest and the largest numbers. As explained above, $dist(S_1,S_n)$ is minimum when $S_1leq xleq S_n$. Remove these two elements from the list and continue this procedure until there is at most one element left in the set.



                                              If there is an element $S_i$ left, then $x=S_i$ minimizes $dist(x-S_i)$. It also lies between all the pairs.



                                              In the case of even elements, finally the sequence will be empty. As in the case above, median lies between all the pairs.






                                              share|cite|improve this answer

























                                                up vote
                                                1
                                                down vote










                                                up vote
                                                1
                                                down vote









                                                Consider two real numbers $a<b$. Then the objective becomes



                                                $$dist(a,b) = |x-a|+|x-b|$$



                                                This expression is minimum when $aleq x leq b$. It can be proved by calculating the objective on 3 cases ($x<a, aleq xleq b, x>b$).



                                                Now consider the general case where $S$ has $n$ elements. Sort them in increasing order as $S_1, S_2, ldots, S_n$.



                                                Pair the smallest and the largest numbers. As explained above, $dist(S_1,S_n)$ is minimum when $S_1leq xleq S_n$. Remove these two elements from the list and continue this procedure until there is at most one element left in the set.



                                                If there is an element $S_i$ left, then $x=S_i$ minimizes $dist(x-S_i)$. It also lies between all the pairs.



                                                In the case of even elements, finally the sequence will be empty. As in the case above, median lies between all the pairs.






                                                share|cite|improve this answer














                                                Consider two real numbers $a<b$. Then the objective becomes



                                                $$dist(a,b) = |x-a|+|x-b|$$



                                                This expression is minimum when $aleq x leq b$. It can be proved by calculating the objective on 3 cases ($x<a, aleq xleq b, x>b$).



                                                Now consider the general case where $S$ has $n$ elements. Sort them in increasing order as $S_1, S_2, ldots, S_n$.



                                                Pair the smallest and the largest numbers. As explained above, $dist(S_1,S_n)$ is minimum when $S_1leq xleq S_n$. Remove these two elements from the list and continue this procedure until there is at most one element left in the set.



                                                If there is an element $S_i$ left, then $x=S_i$ minimizes $dist(x-S_i)$. It also lies between all the pairs.



                                                In the case of even elements, finally the sequence will be empty. As in the case above, median lies between all the pairs.







                                                share|cite|improve this answer














                                                share|cite|improve this answer



                                                share|cite|improve this answer








                                                edited Oct 27 '17 at 5:08

























                                                answered Oct 27 '17 at 4:57









                                                foo

                                                113




                                                113






























                                                    draft saved

                                                    draft discarded




















































                                                    Thanks for contributing an answer to Mathematics Stack Exchange!


                                                    • Please be sure to answer the question. Provide details and share your research!

                                                    But avoid



                                                    • Asking for help, clarification, or responding to other answers.

                                                    • Making statements based on opinion; back them up with references or personal experience.


                                                    Use MathJax to format equations. MathJax reference.


                                                    To learn more, see our tips on writing great answers.





                                                    Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                                                    Please pay close attention to the following guidance:


                                                    • Please be sure to answer the question. Provide details and share your research!

                                                    But avoid



                                                    • Asking for help, clarification, or responding to other answers.

                                                    • Making statements based on opinion; back them up with references or personal experience.


                                                    To learn more, see our tips on writing great answers.




                                                    draft saved


                                                    draft discarded














                                                    StackExchange.ready(
                                                    function () {
                                                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f113270%2fthe-median-minimizes-the-sum-of-absolute-deviations-the-l-1-norm%23new-answer', 'question_page');
                                                    }
                                                    );

                                                    Post as a guest















                                                    Required, but never shown





















































                                                    Required, but never shown














                                                    Required, but never shown












                                                    Required, but never shown







                                                    Required, but never shown

































                                                    Required, but never shown














                                                    Required, but never shown












                                                    Required, but never shown







                                                    Required, but never shown







                                                    Popular posts from this blog

                                                    Mont Emei

                                                    Province de Neuquén

                                                    Journaliste