Radix sort function in R

up vote
1
down vote

favorite

I know that sort in R uses radix sort, and I played around in implementing it (mainly to be sure I understood the algorithm properly). However, my implementation is an order of magnitude slower than the native implementation in R.

I'm curious: is there a way to implement radix sort in R so that it wouldn't be so slow?

My code:

get_digit <- function(x, d) {

    # digits from the right

    # i.e.: first digit is the ones, second is the tens, etc.

    (x %% 10^d) %/% (10^(d-1))

}



radix_sort <- function(x) {

    # k is maximum number of digits

    k <- max(nchar(x))

    for(i in 1:k) {

        x_digit_i <- get_digit(x, i)

        # split numbers based on their i digit

        x_by_bucket <- split(x, x_digit_i)

        # recombine the vectors, now sorted to the i'th digit

        x <- unlist(x_by_bucket, use.names = FALSE)

    }

    # since each iteration is a stable sort, the final result

    # is a sorted array, yay!

    x

}

Checking the running time:

> library(microbenchmark)

> x <- sample(100)

> microbenchmark(radix_sort(x), sort(x))

Unit: microseconds

          expr     min      lq      mean   median      uq     max neval cld

 radix_sort(x) 459.378 465.895 485.58322 480.4835 496.779 649.956   100   b

       sort(x)  27.314  29.487  33.05064  31.9710  33.212  73.563   100  a 

> x <- sample(10000)

> microbenchmark(radix_sort(x), sort(x))

Unit: microseconds

          expr       min         lq       mean     median        uq       max

 radix_sort(x) 44317.123 44777.8965 46062.3446 45204.3715 45714.807 63838.148

       sort(x)   158.609   165.7485   198.3083   186.6995   206.099   750.832

edited Nov 7 at 14:58

200_success

127k15148410

asked Nov 7 at 14:19

Tal Galili

1114

4

Welcome, currently your title makes your question sound off-topic and so is likely the cause of your downvote. This is as asking us to write algorithms for you is off-topic. I'd recommend that you reword it like the body of your question.
– Peilonrayz
Nov 7 at 14:38

I don't know what kind of answer you expect. I already told you why I believe it impossible to come close to R's implementation which is in compiled code. If you really want something fast, you can't implement it in pure R. That's why we have Rcpp.
– Roland
Nov 7 at 15:31

1

I've told you before and I'll say it again. Profile your code! Hadley covers how to do that as well in his book. I bet you'll see that most of the time is spent in the split calls, probably followed by unlist. If you want to speed this up, you need to avoid these.
– Roland
Nov 8 at 7:45

1

Then here is another idea to avoid split: start with z <- outer(x_digit_i, 0:9, "=="). Then you you can do x <- x[row(z)[z]]. Or slightly faster, x <- x[1L + (which(z) - 1L) %% length(x)]
– flodel
Nov 9 at 0:24

1

And another small improvement: %%, %/%, and == are faster with integers, so use (x %% as.integer(10^d)) %/% as.integer(10^(d-1)) inside get_digit. It might also make the code more robust to floating point error issues when using == like I suggested.
– flodel
Nov 9 at 1:42

|
show 5 more comments

up vote
1
down vote

favorite

I'm curious: is there a way to implement radix sort in R so that it wouldn't be so slow?

My code:

get_digit <- function(x, d) {

    # digits from the right

    # i.e.: first digit is the ones, second is the tens, etc.

    (x %% 10^d) %/% (10^(d-1))

}



radix_sort <- function(x) {

    # k is maximum number of digits

    k <- max(nchar(x))

    for(i in 1:k) {

        x_digit_i <- get_digit(x, i)

        # split numbers based on their i digit

        x_by_bucket <- split(x, x_digit_i)

        # recombine the vectors, now sorted to the i'th digit

        x <- unlist(x_by_bucket, use.names = FALSE)

    }

    # since each iteration is a stable sort, the final result

    # is a sorted array, yay!

    x

}

Checking the running time:

> library(microbenchmark)

> x <- sample(100)

> microbenchmark(radix_sort(x), sort(x))

Unit: microseconds

          expr     min      lq      mean   median      uq     max neval cld

 radix_sort(x) 459.378 465.895 485.58322 480.4835 496.779 649.956   100   b

       sort(x)  27.314  29.487  33.05064  31.9710  33.212  73.563   100  a 

> x <- sample(10000)

> microbenchmark(radix_sort(x), sort(x))

Unit: microseconds

          expr       min         lq       mean     median        uq       max

 radix_sort(x) 44317.123 44777.8965 46062.3446 45204.3715 45714.807 63838.148

       sort(x)   158.609   165.7485   198.3083   186.6995   206.099   750.832

edited Nov 7 at 14:58

200_success

127k15148410

asked Nov 7 at 14:19

Tal Galili

1114

4

Welcome, currently your title makes your question sound off-topic and so is likely the cause of your downvote. This is as asking us to write algorithms for you is off-topic. I'd recommend that you reword it like the body of your question.
– Peilonrayz
Nov 7 at 14:38

I don't know what kind of answer you expect. I already told you why I believe it impossible to come close to R's implementation which is in compiled code. If you really want something fast, you can't implement it in pure R. That's why we have Rcpp.
– Roland
Nov 7 at 15:31

1

I've told you before and I'll say it again. Profile your code! Hadley covers how to do that as well in his book. I bet you'll see that most of the time is spent in the split calls, probably followed by unlist. If you want to speed this up, you need to avoid these.
– Roland
Nov 8 at 7:45

1

Then here is another idea to avoid split: start with z <- outer(x_digit_i, 0:9, "=="). Then you you can do x <- x[row(z)[z]]. Or slightly faster, x <- x[1L + (which(z) - 1L) %% length(x)]
– flodel
Nov 9 at 0:24

1

And another small improvement: %%, %/%, and == are faster with integers, so use (x %% as.integer(10^d)) %/% as.integer(10^(d-1)) inside get_digit. It might also make the code more robust to floating point error issues when using == like I suggested.
– flodel
Nov 9 at 1:42

|
show 5 more comments

up vote
1
down vote

favorite

I'm curious: is there a way to implement radix sort in R so that it wouldn't be so slow?

My code:

get_digit <- function(x, d) {

    # digits from the right

    # i.e.: first digit is the ones, second is the tens, etc.

    (x %% 10^d) %/% (10^(d-1))

}



radix_sort <- function(x) {

    # k is maximum number of digits

    k <- max(nchar(x))

    for(i in 1:k) {

        x_digit_i <- get_digit(x, i)

        # split numbers based on their i digit

        x_by_bucket <- split(x, x_digit_i)

        # recombine the vectors, now sorted to the i'th digit

        x <- unlist(x_by_bucket, use.names = FALSE)

    }

    # since each iteration is a stable sort, the final result

    # is a sorted array, yay!

    x

}

Checking the running time:

> library(microbenchmark)

> x <- sample(100)

> microbenchmark(radix_sort(x), sort(x))

Unit: microseconds

          expr     min      lq      mean   median      uq     max neval cld

 radix_sort(x) 459.378 465.895 485.58322 480.4835 496.779 649.956   100   b

       sort(x)  27.314  29.487  33.05064  31.9710  33.212  73.563   100  a 

> x <- sample(10000)

> microbenchmark(radix_sort(x), sort(x))

Unit: microseconds

          expr       min         lq       mean     median        uq       max

 radix_sort(x) 44317.123 44777.8965 46062.3446 45204.3715 45714.807 63838.148

       sort(x)   158.609   165.7485   198.3083   186.6995   206.099   750.832

edited Nov 7 at 14:58

200_success

127k15148410

asked Nov 7 at 14:19

Tal Galili

1114

I'm curious: is there a way to implement radix sort in R so that it wouldn't be so slow?

My code:

get_digit <- function(x, d) {

    # digits from the right

    # i.e.: first digit is the ones, second is the tens, etc.

    (x %% 10^d) %/% (10^(d-1))

}



radix_sort <- function(x) {

    # k is maximum number of digits

    k <- max(nchar(x))

    for(i in 1:k) {

        x_digit_i <- get_digit(x, i)

        # split numbers based on their i digit

        x_by_bucket <- split(x, x_digit_i)

        # recombine the vectors, now sorted to the i'th digit

        x <- unlist(x_by_bucket, use.names = FALSE)

    }

    # since each iteration is a stable sort, the final result

    # is a sorted array, yay!

    x

}

Checking the running time:

> library(microbenchmark)

> x <- sample(100)

> microbenchmark(radix_sort(x), sort(x))

Unit: microseconds

          expr     min      lq      mean   median      uq     max neval cld

 radix_sort(x) 459.378 465.895 485.58322 480.4835 496.779 649.956   100   b

       sort(x)  27.314  29.487  33.05064  31.9710  33.212  73.563   100  a 

> x <- sample(10000)

> microbenchmark(radix_sort(x), sort(x))

Unit: microseconds

          expr       min         lq       mean     median        uq       max

 radix_sort(x) 44317.123 44777.8965 46062.3446 45204.3715 45714.807 63838.148

       sort(x)   158.609   165.7485   198.3083   186.6995   206.099   750.832

performance r radix-sort

edited Nov 7 at 14:58

200_success

127k15148410

asked Nov 7 at 14:19

Tal Galili

1114

edited Nov 7 at 14:58

200_success

127k15148410

asked Nov 7 at 14:19

Tal Galili

1114

edited Nov 7 at 14:58

200_success

127k15148410

edited Nov 7 at 14:58

200_success

127k15148410

edited Nov 7 at 14:58

200_success

127k15148410

asked Nov 7 at 14:19

Tal Galili

1114

asked Nov 7 at 14:19

Tal Galili

1114

asked Nov 7 at 14:19

Tal Galili

1114

4

Welcome, currently your title makes your question sound off-topic and so is likely the cause of your downvote. This is as asking us to write algorithms for you is off-topic. I'd recommend that you reword it like the body of your question.
– Peilonrayz
Nov 7 at 14:38

I don't know what kind of answer you expect. I already told you why I believe it impossible to come close to R's implementation which is in compiled code. If you really want something fast, you can't implement it in pure R. That's why we have Rcpp.
– Roland
Nov 7 at 15:31

1

I've told you before and I'll say it again. Profile your code! Hadley covers how to do that as well in his book. I bet you'll see that most of the time is spent in the split calls, probably followed by unlist. If you want to speed this up, you need to avoid these.
– Roland
Nov 8 at 7:45

1

Then here is another idea to avoid split: start with z <- outer(x_digit_i, 0:9, "=="). Then you you can do x <- x[row(z)[z]]. Or slightly faster, x <- x[1L + (which(z) - 1L) %% length(x)]
– flodel
Nov 9 at 0:24

1

And another small improvement: %%, %/%, and == are faster with integers, so use (x %% as.integer(10^d)) %/% as.integer(10^(d-1)) inside get_digit. It might also make the code more robust to floating point error issues when using == like I suggested.
– flodel
Nov 9 at 1:42

|
show 5 more comments

4

Welcome, currently your title makes your question sound off-topic and so is likely the cause of your downvote. This is as asking us to write algorithms for you is off-topic. I'd recommend that you reword it like the body of your question.
– Peilonrayz
Nov 7 at 14:38

I don't know what kind of answer you expect. I already told you why I believe it impossible to come close to R's implementation which is in compiled code. If you really want something fast, you can't implement it in pure R. That's why we have Rcpp.
– Roland
Nov 7 at 15:31

1

I've told you before and I'll say it again. Profile your code! Hadley covers how to do that as well in his book. I bet you'll see that most of the time is spent in the split calls, probably followed by unlist. If you want to speed this up, you need to avoid these.
– Roland
Nov 8 at 7:45

1

Then here is another idea to avoid split: start with z <- outer(x_digit_i, 0:9, "=="). Then you you can do x <- x[row(z)[z]]. Or slightly faster, x <- x[1L + (which(z) - 1L) %% length(x)]
– flodel
Nov 9 at 0:24

1

And another small improvement: %%, %/%, and == are faster with integers, so use (x %% as.integer(10^d)) %/% as.integer(10^(d-1)) inside get_digit. It might also make the code more robust to floating point error issues when using == like I suggested.
– flodel
Nov 9 at 1:42

Welcome, currently your title makes your question sound off-topic and so is likely the cause of your downvote. This is as asking us to write algorithms for you is off-topic. I'd recommend that you reword it like the body of your question.
– Peilonrayz
Nov 7 at 14:38

I don't know what kind of answer you expect. I already told you why I believe it impossible to come close to R's implementation which is in compiled code. If you really want something fast, you can't implement it in pure R. That's why we have Rcpp.
– Roland
Nov 7 at 15:31

I've told you before and I'll say it again. Profile your code! Hadley covers how to do that as well in his book. I bet you'll see that most of the time is spent in the split calls, probably followed by unlist. If you want to speed this up, you need to avoid these.
– Roland
Nov 8 at 7:45

Then here is another idea to avoid split: start with z <- outer(x_digit_i, 0:9, "=="). Then you you can do x <- x[row(z)[z]]. Or slightly faster, x <- x[1L + (which(z) - 1L) %% length(x)]
– flodel
Nov 9 at 0:24

And another small improvement: %%, %/%, and == are faster with integers, so use (x %% as.integer(10^d)) %/% as.integer(10^(d-1)) inside get_digit. It might also make the code more robust to floating point error issues when using == like I suggested.
– flodel
Nov 9 at 1:42

|
show 5 more comments

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Putting these comments into an answer...

As pointed out, it will be a tall order to beat or even approach the computation times of the native sort function, as it is compiled from C. Often with R, the trick to make your code faster consists in composing with some of these fast compiled building blocks that are native functions. In particular, we would look to substitute for loops like the one you have with vectorized functions. Unfortunately, your use of a for loop here is particular in the sense that each iteration has a side-effect to the x vector. This means that the order of the operations is important and we cannot run the loop iterations independently in parallel or via vectorized functions.

If we cannot get rid of the for loop, we are left trying to optimize the code within the body of the loop. Let's start with a profile of your code:

x <- sample(10000)

Rprof(tmp <- tempfile())

for (i in 1:10) z <- radix_sort(x)

Rprof()

summaryRprof(tmp)$by.total

#                         total.time total.pct self.time self.pct

# "radix_sort"                  8.26     99.76      0.72     8.70

# "split"                       7.34     88.65      0.06     0.72

# "split.default"               7.28     87.92      0.54     6.52

# "as.factor"                   6.74     81.40      0.08     0.97

# "factor"                      6.64     80.19      1.72    20.77

# "as.character"                4.34     52.42      4.34    52.42

# "unique"                      0.42      5.07      0.04     0.48

# "unique.default"              0.38      4.59      0.38     4.59

# "%%"                          0.14      1.69      0.14     1.69

# "get_digit"                   0.14      1.69      0.00     0.00

# "sort.list"                   0.12      1.45      0.02     0.24

# "order"                       0.08      0.97      0.06     0.72

# "unlist"                      0.06      0.72      0.06     0.72

# [...]

We see that the main culprit here is the use of split. Another surprise here is the use of order (it seems to be applied when figuring out the levels of x_digit_i) which you could consider like cheating given your objective.

So what alternative do we have to your use of split/unsplit? Essentially, you have a vector x that you want to reorder based on a vector of digits x_digit_i. One way is to use outer to create a matrix of TRUE/FALSE where each column locates a different digit (for a total of ten columns):

z <- outer(x_digit_i, 0:9, "==")

Then, you want to turn this matrix into a vector of indices, such that the first few indices will locate the zeroes, then the ones, etc (the equivalent of idx <- order(x_digit_i). You can do so by doing:

idx <- row(z)[z]

or (harder to understand but a bit faster)

idx <- 1L + (which(z) - 1L) %% length(x)

Finally, you just have to do:

x <- x[idx]

Also note that since you are dealing with integers, your code might be a little faster (and more robust by avoiding floating point errors) if you make sure to use integers everywhere possible. In particular, your get_digit function could be rewritten as follows:

get_digit <- function(x, d) (x %% as.integer(10^d)) %/% as.integer(10^(d-1))

The benchmarks below show decent progress (where my suggestions are implemented under the name sort_radix2) bridging the gap with the native sort. I hope it helps!

x <- sample(100)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr     min       lq       mean   median       uq      max neval

#   radix_sort(x) 964.692 972.3675 1025.35180 984.3775 1012.178 2233.397   100

#  radix_sort2(x) 250.642 256.5720  282.58952 261.2910  282.449 1266.061   100

#         sort(x)  82.270  86.1605   92.22669  88.0230   90.943  223.249   100



x <- sample(10000)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr       min         lq       mean     median         uq        max neval

#   radix_sort(x) 71939.706 76147.1715 80028.7541 78389.8140 81512.4140 144632.484   100

#  radix_sort2(x) 24218.810 27613.3190 34841.8724 29477.7115 31772.9415 143283.337   100

#         sort(x)   411.691   454.4015   563.4825   492.6165   558.0925   3412.719   100

edited 2 days ago

answered Nov 18 at 2:02

flodel

3,2021915

Thanks Flodel, that was instructive :)
– Tal Galili
2 days ago

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f207144%2fradix-sort-function-in-r%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Putting these comments into an answer...

If we cannot get rid of the for loop, we are left trying to optimize the code within the body of the loop. Let's start with a profile of your code:

x <- sample(10000)

Rprof(tmp <- tempfile())

for (i in 1:10) z <- radix_sort(x)

Rprof()

summaryRprof(tmp)$by.total

#                         total.time total.pct self.time self.pct

# "radix_sort"                  8.26     99.76      0.72     8.70

# "split"                       7.34     88.65      0.06     0.72

# "split.default"               7.28     87.92      0.54     6.52

# "as.factor"                   6.74     81.40      0.08     0.97

# "factor"                      6.64     80.19      1.72    20.77

# "as.character"                4.34     52.42      4.34    52.42

# "unique"                      0.42      5.07      0.04     0.48

# "unique.default"              0.38      4.59      0.38     4.59

# "%%"                          0.14      1.69      0.14     1.69

# "get_digit"                   0.14      1.69      0.00     0.00

# "sort.list"                   0.12      1.45      0.02     0.24

# "order"                       0.08      0.97      0.06     0.72

# "unlist"                      0.06      0.72      0.06     0.72

# [...]

z <- outer(x_digit_i, 0:9, "==")

idx <- row(z)[z]

or (harder to understand but a bit faster)

idx <- 1L + (which(z) - 1L) %% length(x)

Finally, you just have to do:

x <- x[idx]

get_digit <- function(x, d) (x %% as.integer(10^d)) %/% as.integer(10^(d-1))

The benchmarks below show decent progress (where my suggestions are implemented under the name sort_radix2) bridging the gap with the native sort. I hope it helps!

x <- sample(100)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr     min       lq       mean   median       uq      max neval

#   radix_sort(x) 964.692 972.3675 1025.35180 984.3775 1012.178 2233.397   100

#  radix_sort2(x) 250.642 256.5720  282.58952 261.2910  282.449 1266.061   100

#         sort(x)  82.270  86.1605   92.22669  88.0230   90.943  223.249   100



x <- sample(10000)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr       min         lq       mean     median         uq        max neval

#   radix_sort(x) 71939.706 76147.1715 80028.7541 78389.8140 81512.4140 144632.484   100

#  radix_sort2(x) 24218.810 27613.3190 34841.8724 29477.7115 31772.9415 143283.337   100

#         sort(x)   411.691   454.4015   563.4825   492.6165   558.0925   3412.719   100

edited 2 days ago

answered Nov 18 at 2:02

flodel

3,2021915

Thanks Flodel, that was instructive :)
– Tal Galili
2 days ago

add a comment |

up vote
1
down vote

accepted

Putting these comments into an answer...

If we cannot get rid of the for loop, we are left trying to optimize the code within the body of the loop. Let's start with a profile of your code:

x <- sample(10000)

Rprof(tmp <- tempfile())

for (i in 1:10) z <- radix_sort(x)

Rprof()

summaryRprof(tmp)$by.total

#                         total.time total.pct self.time self.pct

# "radix_sort"                  8.26     99.76      0.72     8.70

# "split"                       7.34     88.65      0.06     0.72

# "split.default"               7.28     87.92      0.54     6.52

# "as.factor"                   6.74     81.40      0.08     0.97

# "factor"                      6.64     80.19      1.72    20.77

# "as.character"                4.34     52.42      4.34    52.42

# "unique"                      0.42      5.07      0.04     0.48

# "unique.default"              0.38      4.59      0.38     4.59

# "%%"                          0.14      1.69      0.14     1.69

# "get_digit"                   0.14      1.69      0.00     0.00

# "sort.list"                   0.12      1.45      0.02     0.24

# "order"                       0.08      0.97      0.06     0.72

# "unlist"                      0.06      0.72      0.06     0.72

# [...]

z <- outer(x_digit_i, 0:9, "==")

idx <- row(z)[z]

or (harder to understand but a bit faster)

idx <- 1L + (which(z) - 1L) %% length(x)

Finally, you just have to do:

x <- x[idx]

get_digit <- function(x, d) (x %% as.integer(10^d)) %/% as.integer(10^(d-1))

The benchmarks below show decent progress (where my suggestions are implemented under the name sort_radix2) bridging the gap with the native sort. I hope it helps!

x <- sample(100)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr     min       lq       mean   median       uq      max neval

#   radix_sort(x) 964.692 972.3675 1025.35180 984.3775 1012.178 2233.397   100

#  radix_sort2(x) 250.642 256.5720  282.58952 261.2910  282.449 1266.061   100

#         sort(x)  82.270  86.1605   92.22669  88.0230   90.943  223.249   100



x <- sample(10000)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr       min         lq       mean     median         uq        max neval

#   radix_sort(x) 71939.706 76147.1715 80028.7541 78389.8140 81512.4140 144632.484   100

#  radix_sort2(x) 24218.810 27613.3190 34841.8724 29477.7115 31772.9415 143283.337   100

#         sort(x)   411.691   454.4015   563.4825   492.6165   558.0925   3412.719   100

edited 2 days ago

answered Nov 18 at 2:02

flodel

3,2021915

Thanks Flodel, that was instructive :)
– Tal Galili
2 days ago

add a comment |

up vote
1
down vote

accepted

Putting these comments into an answer...

If we cannot get rid of the for loop, we are left trying to optimize the code within the body of the loop. Let's start with a profile of your code:

x <- sample(10000)

Rprof(tmp <- tempfile())

for (i in 1:10) z <- radix_sort(x)

Rprof()

summaryRprof(tmp)$by.total

#                         total.time total.pct self.time self.pct

# "radix_sort"                  8.26     99.76      0.72     8.70

# "split"                       7.34     88.65      0.06     0.72

# "split.default"               7.28     87.92      0.54     6.52

# "as.factor"                   6.74     81.40      0.08     0.97

# "factor"                      6.64     80.19      1.72    20.77

# "as.character"                4.34     52.42      4.34    52.42

# "unique"                      0.42      5.07      0.04     0.48

# "unique.default"              0.38      4.59      0.38     4.59

# "%%"                          0.14      1.69      0.14     1.69

# "get_digit"                   0.14      1.69      0.00     0.00

# "sort.list"                   0.12      1.45      0.02     0.24

# "order"                       0.08      0.97      0.06     0.72

# "unlist"                      0.06      0.72      0.06     0.72

# [...]

z <- outer(x_digit_i, 0:9, "==")

idx <- row(z)[z]

or (harder to understand but a bit faster)

idx <- 1L + (which(z) - 1L) %% length(x)

Finally, you just have to do:

x <- x[idx]

get_digit <- function(x, d) (x %% as.integer(10^d)) %/% as.integer(10^(d-1))

The benchmarks below show decent progress (where my suggestions are implemented under the name sort_radix2) bridging the gap with the native sort. I hope it helps!

x <- sample(100)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr     min       lq       mean   median       uq      max neval

#   radix_sort(x) 964.692 972.3675 1025.35180 984.3775 1012.178 2233.397   100

#  radix_sort2(x) 250.642 256.5720  282.58952 261.2910  282.449 1266.061   100

#         sort(x)  82.270  86.1605   92.22669  88.0230   90.943  223.249   100



x <- sample(10000)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr       min         lq       mean     median         uq        max neval

#   radix_sort(x) 71939.706 76147.1715 80028.7541 78389.8140 81512.4140 144632.484   100

#  radix_sort2(x) 24218.810 27613.3190 34841.8724 29477.7115 31772.9415 143283.337   100

#         sort(x)   411.691   454.4015   563.4825   492.6165   558.0925   3412.719   100

edited 2 days ago

answered Nov 18 at 2:02

flodel

3,2021915

Putting these comments into an answer...

If we cannot get rid of the for loop, we are left trying to optimize the code within the body of the loop. Let's start with a profile of your code:

x <- sample(10000)

Rprof(tmp <- tempfile())

for (i in 1:10) z <- radix_sort(x)

Rprof()

summaryRprof(tmp)$by.total

#                         total.time total.pct self.time self.pct

# "radix_sort"                  8.26     99.76      0.72     8.70

# "split"                       7.34     88.65      0.06     0.72

# "split.default"               7.28     87.92      0.54     6.52

# "as.factor"                   6.74     81.40      0.08     0.97

# "factor"                      6.64     80.19      1.72    20.77

# "as.character"                4.34     52.42      4.34    52.42

# "unique"                      0.42      5.07      0.04     0.48

# "unique.default"              0.38      4.59      0.38     4.59

# "%%"                          0.14      1.69      0.14     1.69

# "get_digit"                   0.14      1.69      0.00     0.00

# "sort.list"                   0.12      1.45      0.02     0.24

# "order"                       0.08      0.97      0.06     0.72

# "unlist"                      0.06      0.72      0.06     0.72

# [...]

z <- outer(x_digit_i, 0:9, "==")

idx <- row(z)[z]

or (harder to understand but a bit faster)

idx <- 1L + (which(z) - 1L) %% length(x)

Finally, you just have to do:

x <- x[idx]

get_digit <- function(x, d) (x %% as.integer(10^d)) %/% as.integer(10^(d-1))

The benchmarks below show decent progress (where my suggestions are implemented under the name sort_radix2) bridging the gap with the native sort. I hope it helps!

x <- sample(100)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr     min       lq       mean   median       uq      max neval

#   radix_sort(x) 964.692 972.3675 1025.35180 984.3775 1012.178 2233.397   100

#  radix_sort2(x) 250.642 256.5720  282.58952 261.2910  282.449 1266.061   100

#         sort(x)  82.270  86.1605   92.22669  88.0230   90.943  223.249   100



x <- sample(10000)

microbenchmark(radix_sort(x), radix_sort2(x), sort(x))

# Unit: microseconds

#            expr       min         lq       mean     median         uq        max neval

#   radix_sort(x) 71939.706 76147.1715 80028.7541 78389.8140 81512.4140 144632.484   100

#  radix_sort2(x) 24218.810 27613.3190 34841.8724 29477.7115 31772.9415 143283.337   100

#         sort(x)   411.691   454.4015   563.4825   492.6165   558.0925   3412.719   100

edited 2 days ago

answered Nov 18 at 2:02

flodel

3,2021915

edited 2 days ago

answered Nov 18 at 2:02

flodel

3,2021915

answered Nov 18 at 2:02

flodel

3,2021915

answered Nov 18 at 2:02

flodel

3,2021915

Thanks Flodel, that was instructive :)
– Tal Galili
2 days ago

add a comment |

Thanks Flodel, that was instructive :)
– Tal Galili
2 days ago

Thanks Flodel, that was instructive :)
– Tal Galili
2 days ago

add a comment |

draft saved

draft discarded

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

Mq eYZnlzs gxPwn3HCh

搜尋此網誌

Krdytkyu