Split a set keeping the proportion of its elements on each subset

up vote
0
down vote

favorite

I have a set of elements. To simplify here, these elements are $+$ and $-$.

I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.

For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.

To solve this, I have done the following:

Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$

Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$

Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.

If $m$ is the number of elements on a subset, I calculate:

$mp = m * p$
$mn = m * n$

Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.

Those numbers always have decimals, so I need to round then.

If $mp > mn$ I do:

$mp = lceil mprceil$
$mn = lfloor mn rfloor $

And, if $mp < mn$ I do:

$mp = lfloor mp rfloor $
$mn = lceil mnrceil$

But it doesn't work in this example, because when I do:

$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$

I get than $mn > mp$, so:

$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$

I don't use any of the $+$ elements in four subsets. This is an error.

My problem is I don't know what to do with the decimals.

How can I do to keep the proportion and avoid these kind of problems?

asked Nov 20 at 15:23

VansFannel

122113

1

Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33

add a comment |

up vote
0
down vote

favorite

I have a set of elements. To simplify here, these elements are $+$ and $-$.

I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.

For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.

To solve this, I have done the following:

Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$

Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$

Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.

If $m$ is the number of elements on a subset, I calculate:

$mp = m * p$
$mn = m * n$

Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.

Those numbers always have decimals, so I need to round then.

If $mp > mn$ I do:

$mp = lceil mprceil$
$mn = lfloor mn rfloor $

And, if $mp < mn$ I do:

$mp = lfloor mp rfloor $
$mn = lceil mnrceil$

But it doesn't work in this example, because when I do:

$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$

I get than $mn > mp$, so:

$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$

I don't use any of the $+$ elements in four subsets. This is an error.

My problem is I don't know what to do with the decimals.

How can I do to keep the proportion and avoid these kind of problems?

asked Nov 20 at 15:23

VansFannel

122113

1

Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33

add a comment |

up vote
0
down vote

favorite

I have a set of elements. To simplify here, these elements are $+$ and $-$.

I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.

For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.

To solve this, I have done the following:

Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$

Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$

Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.

If $m$ is the number of elements on a subset, I calculate:

$mp = m * p$
$mn = m * n$

Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.

Those numbers always have decimals, so I need to round then.

If $mp > mn$ I do:

$mp = lceil mprceil$
$mn = lfloor mn rfloor $

And, if $mp < mn$ I do:

$mp = lfloor mp rfloor $
$mn = lceil mnrceil$

But it doesn't work in this example, because when I do:

$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$

I get than $mn > mp$, so:

$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$

I don't use any of the $+$ elements in four subsets. This is an error.

My problem is I don't know what to do with the decimals.

How can I do to keep the proportion and avoid these kind of problems?

asked Nov 20 at 15:23

VansFannel

122113

I have a set of elements. To simplify here, these elements are $+$ and $-$.

I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.

For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.

To solve this, I have done the following:

Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$

Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$

Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.

If $m$ is the number of elements on a subset, I calculate:

$mp = m * p$
$mn = m * n$

Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.

Those numbers always have decimals, so I need to round then.

If $mp > mn$ I do:

$mp = lceil mprceil$
$mn = lfloor mn rfloor $

And, if $mp < mn$ I do:

$mp = lfloor mp rfloor $
$mn = lceil mnrceil$

But it doesn't work in this example, because when I do:

$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$

I get than $mn > mp$, so:

$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$

I don't use any of the $+$ elements in four subsets. This is an error.

My problem is I don't know what to do with the decimals.

How can I do to keep the proportion and avoid these kind of problems?

elementary-set-theory algorithms

asked Nov 20 at 15:23

VansFannel

122113

asked Nov 20 at 15:23

VansFannel

122113

asked Nov 20 at 15:23

VansFannel

122113

asked Nov 20 at 15:23

VansFannel

122113

asked Nov 20 at 15:23

VansFannel

122113

1

Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33

add a comment |

1

Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33

Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33

add a comment |

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.

First an example.
Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.

Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.

To do so, sort your elements in an array, say $A$ with:

for $0le i<P$, $Aleft[iright] = +$

for $Ple i<T$, $Aleft[iright] = -$

Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.

With this method, you're guaranteed that:

All elements are used

Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.

I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.

On a side note, this method simply generalizes to more than two types of elements.
Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.

answered Nov 21 at 23:11

N.Bach

1,526310

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "69"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3006450%2fsplit-a-set-keeping-the-proportion-of-its-elements-on-each-subset%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.

To do so, sort your elements in an array, say $A$ with:

for $0le i<P$, $Aleft[iright] = +$

for $Ple i<T$, $Aleft[iright] = -$

Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.

With this method, you're guaranteed that:

All elements are used

Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.

answered Nov 21 at 23:11

N.Bach

1,526310

add a comment |

up vote
1
down vote

accepted

Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.

To do so, sort your elements in an array, say $A$ with:

for $0le i<P$, $Aleft[iright] = +$

for $Ple i<T$, $Aleft[iright] = -$

Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.

With this method, you're guaranteed that:

All elements are used

Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.

answered Nov 21 at 23:11

N.Bach

1,526310

add a comment |

up vote
1
down vote

accepted

Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.

To do so, sort your elements in an array, say $A$ with:

for $0le i<P$, $Aleft[iright] = +$

for $Ple i<T$, $Aleft[iright] = -$

Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.

With this method, you're guaranteed that:

All elements are used

Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.

answered Nov 21 at 23:11

N.Bach

1,526310

Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.

To do so, sort your elements in an array, say $A$ with:

for $0le i<P$, $Aleft[iright] = +$

for $Ple i<T$, $Aleft[iright] = -$

Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.

With this method, you're guaranteed that:

All elements are used

Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$

A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements

A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements

Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.

In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.

answered Nov 21 at 23:11

N.Bach

1,526310

answered Nov 21 at 23:11

N.Bach

1,526310

answered Nov 21 at 23:11

N.Bach

1,526310

answered Nov 21 at 23:11

N.Bach

1,526310

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Mathematics Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

eP,gq4QUuwH rF,ZeFOPqvqr4J0e4

搜尋此網誌

Krdytkyu