Split a set keeping the proportion of its elements on each subset











up vote
0
down vote

favorite












I have a set of elements. To simplify here, these elements are $+$ and $-$.



I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.



For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.



To solve this, I have done the following:




  1. Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$

  2. Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$


Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.



If $m$ is the number of elements on a subset, I calculate:



$mp = m * p$
$mn = m * n$



Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.



Those numbers always have decimals, so I need to round then.



If $mp > mn$ I do:



$mp = lceil mprceil$
$mn = lfloor mn rfloor $



And, if $mp < mn$ I do:



$mp = lfloor mp rfloor $
$mn = lceil mnrceil$



But it doesn't work in this example, because when I do:



$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$



I get than $mn > mp$, so:



$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$



I don't use any of the $+$ elements in four subsets. This is an error.



My problem is I don't know what to do with the decimals.



How can I do to keep the proportion and avoid these kind of problems?










share|cite|improve this question


















  • 1




    Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
    – hardmath
    Nov 21 at 2:33















up vote
0
down vote

favorite












I have a set of elements. To simplify here, these elements are $+$ and $-$.



I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.



For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.



To solve this, I have done the following:




  1. Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$

  2. Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$


Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.



If $m$ is the number of elements on a subset, I calculate:



$mp = m * p$
$mn = m * n$



Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.



Those numbers always have decimals, so I need to round then.



If $mp > mn$ I do:



$mp = lceil mprceil$
$mn = lfloor mn rfloor $



And, if $mp < mn$ I do:



$mp = lfloor mp rfloor $
$mn = lceil mnrceil$



But it doesn't work in this example, because when I do:



$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$



I get than $mn > mp$, so:



$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$



I don't use any of the $+$ elements in four subsets. This is an error.



My problem is I don't know what to do with the decimals.



How can I do to keep the proportion and avoid these kind of problems?










share|cite|improve this question


















  • 1




    Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
    – hardmath
    Nov 21 at 2:33













up vote
0
down vote

favorite









up vote
0
down vote

favorite











I have a set of elements. To simplify here, these elements are $+$ and $-$.



I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.



For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.



To solve this, I have done the following:




  1. Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$

  2. Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$


Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.



If $m$ is the number of elements on a subset, I calculate:



$mp = m * p$
$mn = m * n$



Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.



Those numbers always have decimals, so I need to round then.



If $mp > mn$ I do:



$mp = lceil mprceil$
$mn = lfloor mn rfloor $



And, if $mp < mn$ I do:



$mp = lfloor mp rfloor $
$mn = lceil mnrceil$



But it doesn't work in this example, because when I do:



$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$



I get than $mn > mp$, so:



$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$



I don't use any of the $+$ elements in four subsets. This is an error.



My problem is I don't know what to do with the decimals.



How can I do to keep the proportion and avoid these kind of problems?










share|cite|improve this question













I have a set of elements. To simplify here, these elements are $+$ and $-$.



I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.



For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.



To solve this, I have done the following:




  1. Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$

  2. Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$


Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.



If $m$ is the number of elements on a subset, I calculate:



$mp = m * p$
$mn = m * n$



Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.



Those numbers always have decimals, so I need to round then.



If $mp > mn$ I do:



$mp = lceil mprceil$
$mn = lfloor mn rfloor $



And, if $mp < mn$ I do:



$mp = lfloor mp rfloor $
$mn = lceil mnrceil$



But it doesn't work in this example, because when I do:



$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$



I get than $mn > mp$, so:



$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$



I don't use any of the $+$ elements in four subsets. This is an error.



My problem is I don't know what to do with the decimals.



How can I do to keep the proportion and avoid these kind of problems?







elementary-set-theory algorithms






share|cite|improve this question













share|cite|improve this question











share|cite|improve this question




share|cite|improve this question










asked Nov 20 at 15:23









VansFannel

122113




122113








  • 1




    Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
    – hardmath
    Nov 21 at 2:33














  • 1




    Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
    – hardmath
    Nov 21 at 2:33








1




1




Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33




Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33










1 Answer
1






active

oldest

votes

















up vote
1
down vote



accepted










As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.





First an example.
Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.





Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.



To do so, sort your elements in an array, say $A$ with:




  • for $0le i<P$, $Aleft[iright] = +$

  • for $Ple i<T$, $Aleft[iright] = -$


Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.





With this method, you're guaranteed that:




  • All elements are used

  • Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

  • A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

  • A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

  • Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

  • In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.


I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.



On a side note, this method simply generalizes to more than two types of elements.
Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.






share|cite|improve this answer





















    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "69"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    noCode: true, onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3006450%2fsplit-a-set-keeping-the-proportion-of-its-elements-on-each-subset%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote



    accepted










    As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.





    First an example.
    Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.





    Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.



    To do so, sort your elements in an array, say $A$ with:




    • for $0le i<P$, $Aleft[iright] = +$

    • for $Ple i<T$, $Aleft[iright] = -$


    Then for $0le k < N$, let the $k$-th bucket/subset be:
    $B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
    Basically each bucket gets one element every $N$ elements.





    With this method, you're guaranteed that:




    • All elements are used

    • Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

    • A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

    • A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

    • Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

    • In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.


    I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.



    On a side note, this method simply generalizes to more than two types of elements.
    Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.






    share|cite|improve this answer

























      up vote
      1
      down vote



      accepted










      As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.





      First an example.
      Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.





      Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.



      To do so, sort your elements in an array, say $A$ with:




      • for $0le i<P$, $Aleft[iright] = +$

      • for $Ple i<T$, $Aleft[iright] = -$


      Then for $0le k < N$, let the $k$-th bucket/subset be:
      $B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
      Basically each bucket gets one element every $N$ elements.





      With this method, you're guaranteed that:




      • All elements are used

      • Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

      • A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

      • A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

      • Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

      • In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.


      I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.



      On a side note, this method simply generalizes to more than two types of elements.
      Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.






      share|cite|improve this answer























        up vote
        1
        down vote



        accepted







        up vote
        1
        down vote



        accepted






        As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.





        First an example.
        Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.





        Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.



        To do so, sort your elements in an array, say $A$ with:




        • for $0le i<P$, $Aleft[iright] = +$

        • for $Ple i<T$, $Aleft[iright] = -$


        Then for $0le k < N$, let the $k$-th bucket/subset be:
        $B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
        Basically each bucket gets one element every $N$ elements.





        With this method, you're guaranteed that:




        • All elements are used

        • Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

        • A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

        • A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

        • Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

        • In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.


        I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.



        On a side note, this method simply generalizes to more than two types of elements.
        Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.






        share|cite|improve this answer












        As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.





        First an example.
        Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.





        Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.



        To do so, sort your elements in an array, say $A$ with:




        • for $0le i<P$, $Aleft[iright] = +$

        • for $Ple i<T$, $Aleft[iright] = -$


        Then for $0le k < N$, let the $k$-th bucket/subset be:
        $B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
        Basically each bucket gets one element every $N$ elements.





        With this method, you're guaranteed that:




        • All elements are used

        • Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

        • A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

        • A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

        • Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

        • In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.


        I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.



        On a side note, this method simply generalizes to more than two types of elements.
        Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.







        share|cite|improve this answer












        share|cite|improve this answer



        share|cite|improve this answer










        answered Nov 21 at 23:11









        N.Bach

        1,526310




        1,526310






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Mathematics Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            Use MathJax to format equations. MathJax reference.


            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3006450%2fsplit-a-set-keeping-the-proportion-of-its-elements-on-each-subset%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Quarter-circle Tiles

            build a pushdown automaton that recognizes the reverse language of a given pushdown automaton?

            Mont Emei