Bash sort array according to length of elements?
9 votes
Given an array of strings, I would like to sort the array according to the length of each element.
For example...
array=(
"tiny string"
"the longest string in the list"
"middle string"
"medium string"
"also a medium string"
"short string"
)
Should sort to...
"the longest string in the list"
"also a medium string"
"medium string"
"middle string"
"short string"
"tiny string"
(As a bonus, it would be nice if strings of the same length were sorted alphabetically. In the above example, "medium string" was sorted before "middle string" even though they are the same length. But that's not a "hard" requirement if it overcomplicates the solution.)
It is OK if the array is sorted in-place (i.e. "array" is modified) or if a new sorted array is created.
bash shell-script sort array
asked Nov 17 at 20:11
PJ Singh
Some interesting answers over here; you should be able to adapt one to test for string length as well: stackoverflow.com/a/30576368/2876682
– frostschutz
Nov 17 at 20:20
6 Answers
10 votes, accepted
If the strings don't contain newlines, the following should work. It sorts the indices of the array by the length, using the strings themselves as the secondary sort criterion.
#!/bin/bash
array=(
"tiny string"
"the longest string in the list"
"middle string"
"medium string"
"also a medium string"
"short string"
)
expected=(
"the longest string in the list"
"also a medium string"
"medium string"
"middle string"
"short string"
"tiny string"
)
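# Emit "index length string" per element, then sort by length (numeric, descending).
# The nr modifiers apply to key 2 only, so the tie-break on key 3 stays alphabetical.
indexes=( $(
    for i in "${!array[@]}" ; do
        printf '%s %s %s\n' "$i" "${#array[i]}" "${array[i]}"
    done | sort -k2,2nr -k3 | cut -f1 -d' '
))
# Rebuild the array in sorted order from the indices.
for i in "${indexes[@]}" ; do
    sorted+=("${array[i]}")
done
# No output from diff means the result matches the expectation.
diff <(echo "${expected[@]}") \
     <(echo "${sorted[@]}")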
Note that moving to a real programming language can greatly simplify the solution, e.g. in Perl, you can just
sort { length $b <=> length $a or $a cmp $b } @array
In Python: sorted(array, key=lambda s: (len(s), s))
– wjandrea
2 days ago
In Ruby: array.sort_by { |a| a.size }
– Dmitry Kudriavtsev
2 days ago
8 votes
readarray -t array < <(
    for str in "${array[@]}"; do
        printf '%d\t%s\n' "${#str}" "$str"
    done | sort -k 1,1nr -k 2 | cut -f 2- )
This reads the values of the sorted array from a process substitution.
The process substitution contains a loop. The loop outputs each element of the array, prefixed by the element's length and a tab character.
The output of the loop is sorted numerically from largest to smallest (and alphabetically if the lengths are the same; use -k 2r in place of -k 2 to reverse the alphabetical order), and the result is sent to cut, which deletes the column with the string lengths.
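To make the three stages easier to see, here is a minimal sketch that prints the intermediate "decorated" lines before sorting. The variable names decorated and sorted are only illustrative and are not part of the answer; like the answer itself, it assumes the strings contain no newlines.

```shell
#!/bin/bash
array=("tiny string" "the longest string in the list" "middle string" "medium string")

# Stage 1: decorate each element with its length and a tab.
decorated=$(
    for str in "${array[@]}"; do
        printf '%d\t%s\n' "${#str}" "$str"
    done
)
printf '%s\n' "$decorated"

# Stage 2: sort by length (numeric, descending), then by the remaining text.
# Stage 3: cut away the length column again.
sorted=$(printf '%s\n' "$decorated" | sort -k 1,1nr -k 2 | cut -f 2-)
printf '%s\n' "$sorted"
```

The first printf shows lines such as "11<TAB>tiny string"; the second shows the same strings with the length column removed and the longest string first.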
Sort test script followed by a test run:
array=(
"tiny string"
"the longest string in the list"
"middle string"
"medium string"
"also a medium string"
"short string"
)
readarray -t array < <(
    for str in "${array[@]}"; do
        printf '%d\t%s\n' "${#str}" "$str"
    done | sort -k 1,1nr -k 2 | cut -f 2- )
printf '%s\n' "${array[@]}"
$ bash script.sh
the longest string in the list
also a medium string
medium string
middle string
short string
tiny string
This assumes that the strings do not contain newlines. On GNU systems with a recent bash, you can support embedded newlines in the data by using the nul-character as the record separator instead of newline:
readarray -d '' -t array < <(
    for str in "${array[@]}"; do
        printf '%d\t%s\0' "${#str}" "$str"
    done | sort -z -k 1,1nr -k 2 | cut -z -f 2- )
Here, the data is printed with a trailing \0 in the loop instead of newlines, sort and cut read nul-delimited lines through their -z GNU options, and readarray finally reads the nul-delimited data with -d ''.
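As a quick check of the nul-delimited variant, the following sketch feeds it an element with an embedded newline. It assumes bash 4.4+ and GNU sort and cut; the sample data and the name sorted are made up for illustration.

```shell
#!/bin/bash
array=(
    "tiny"
    $'two\nlines here'
    "medium one"
)

readarray -d '' -t sorted < <(
    for str in "${array[@]}"; do
        printf '%d\t%s\0' "${#str}" "$str"
    done | sort -z -k 1,1nr -k 2 | cut -z -f 2-
)

# declare -p shows each element unambiguously, including the embedded newline.
declare -p sorted
```

The two-line element should come out first as a single array element, which is something the newline-delimited version cannot do.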
Note that -d $'\0' is in fact -d '' as bash can't pass NUL characters to commands, even its builtins. But it does understand -d '' as meaning delimit on NUL. Note that you need bash 4.4+ for that.
– Stéphane Chazelas
2 days ago
That edit seems familiar, doesn't it?
– Isaac
2 days ago
@StéphaneChazelas No, it is not '', it is $'\0'. And yes, it converts (almost exactly) to ''. But that is a way to communicate to other readers the actual intent of using a NUL delimiter.
– Isaac
2 days ago
@Isaac Sorry, which edit?
– Kusalananda
2 days ago
I should have said, and I didn't, sorry: No, not at all, just that two .... etc. Having said that to ease my mind: I am still playing with it a little bit, hoping to expand solutions a little more forward. Maybe I will edit my answer (again), and thanks for confirming that the solution is useful as it stands (it looks similar to yours 😛 ).
– Isaac
2 days ago
4 votes
I won't completely repeat what I've already said about sorting in bash; in short, you can sort within bash, but maybe you shouldn't. Below is a bash-only implementation of a simple exchange sort (each element is compared with every later one and swapped if out of order), which is O(n^2) and so is only tolerable for small arrays. It sorts the array elements in-place by their length, in decreasing order. It does not do a secondary alphabetical sort.
array=(
"tiny string"
"the longest string in the list"
"middle string"
"medium string"
"also a medium string"
"short string"
)
function sort_inplace {
    local i j tmp ivalue jvalue
    for ((i=0; i <= ${#array[@]} - 2; i++))
    do
        for ((j=i + 1; j <= ${#array[@]} - 1; j++))
        do
            ivalue=${#array[i]}
            jvalue=${#array[j]}
            # Compare arithmetically; [[ a < b ]] compares as strings,
            # so a length of 9 would not count as less than 10.
            if (( ivalue < jvalue ))
            then
                tmp=${array[i]}
                array[i]=${array[j]}
                array[j]=$tmp
            fi
        done
    done
}
echo Initial:
declare -p array
sort_inplace
echo Sorted:
declare -p array
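If the alphabetical tie-break is wanted in a bash-only sort, one possible sketch is to move the comparison into a small helper. The name sorts_after is hypothetical and not part of the answer above, and the loop assumes a contiguous (non-sparse) array indexed from 0. Lengths are compared with (( )) because [[ a < b ]] compares strings rather than numbers.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if $1 belongs after $2, i.e. it is shorter,
# or has the same length and sorts later as a string.
sorts_after() {
    if (( ${#1} != ${#2} )); then
        (( ${#1} < ${#2} ))
    else
        [[ $1 > $2 ]]
    fi
}

array=("tiny string" "middle string" "medium string" "the longest string in the list")

# Same quadratic compare-and-swap loop, using the helper for the ordering.
for ((i = 0; i < ${#array[@]} - 1; i++)); do
    for ((j = i + 1; j < ${#array[@]}; j++)); do
        if sorts_after "${array[i]}" "${array[j]}"; then
            tmp=${array[i]}
            array[i]=${array[j]}
            array[j]=$tmp
        fi
    done
done

declare -p array
```

This is still O(n^2), so the same caveat about small arrays applies.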
As evidence that this is a specialized solution, consider the timings of the existing three answers on various size arrays:
# 6 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.018s ## already 4 times slower!
# 1000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.021s ## up to 5 times slower, now!
# 5000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.019s
# 10000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.006s
Jeff: 0m0.020s
# 99000 elements
Choroba: 0m0.015s
Kusalananda: 0m0.012s
Jeff: 0m0.119s
Choroba and Kusalananda have the right idea: compute the lengths once and use dedicated utilities for sorting and text processing.
4 votes
A hackish (complex) but fast one-line way to sort the array by length (safe for newlines and sparse arrays):
#!/bin/bash
in=(
"tiny string"
"the longest
string also containing
newlines"
"middle string"
"medium string"
"also a medium string"
"short string"
"test * string"
"*"
"?"
"[abc]"
)
readarray -td $'\0' sorted < <(
    for i in "${in[@]}"
    do printf '%s %s\0' "${#i}" "$i";
    done |
        sort -bz -k1,1rn -k2 |
        cut -zd " " -f2-
)
printf '%s\n' "${sorted[@]}"
On one line:
readarray -td $'\0' sorted < <(for i in "${in[@]}";do printf '%s %s\0' "${#i}" "$i"; done | sort -bz -k1,1rn -k2 | cut -zd " " -f2-)
On execution
$ ./script
the longest
string also containing
newlines
also a medium string
medium string
middle string
test * string
short string
tiny string
[abc]
?
*
4 votes
This also handles array elements with newlines in them; it works by passing through sort only the length and the index of each element. It should work with bash and ksh.
in=(
"tiny string"
"the longest
string also containing
newlines"
"middle string"
"medium string"
"also a medium string"
"short string"
)
out=()
unset IFS
for a in $(for i in ${!in[@]}; do echo ${#in[i]}/$i; done | sort -rn); do
    out+=("${in[${a#*/}]}")
done
for a in "${out[@]}"; do printf '"%s"\n' "$a"; done
If the elements of the same length also have to be sorted lexicographically, the loop could be changed like this:
IFS='
'
for a in $(for i in ${!in[@]}; do printf '%s\n' "$i ${#in[i]} ${in[i]//$IFS/ }"; done | sort -k 2,2nr -k 3 | cut -d' ' -f1); do
    out+=("${in[$a]}")
done
This will also pass the strings to sort (with newlines changed to spaces), but they would still be copied from the source to the destination array by their indexes. In both examples, the $(...) will see only lines containing numbers (and the / character in the first example), so it won't be tripped by globbing characters or spaces in the strings.
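To see what actually travels through the pipeline in the first example, here is a small sketch that captures the "length/index" pairs and prints them before rebuilding the array. The names pairs and idx are illustrative, and reading the pairs line by line with read is a swapped-in variation for this sketch, not what the loops above do.

```shell
#!/bin/bash
in=("tiny string" $'two\nlines' "test * string")

# Only "length/index" pairs pass through sort; the strings stay in the array,
# so newlines and glob characters in them never reach the pipeline.
pairs=$(for i in "${!in[@]}"; do echo "${#in[i]}/$i"; done | sort -rn)
printf '%s\n' "$pairs"

# Rebuild by index, splitting each pair on the slash.
out=()
while IFS=/ read -r _ idx; do
    out+=("${in[idx]}")
done <<< "$pairs"
declare -p out
```

The printed pairs contain nothing but digits and a slash, which is why unquoted word splitting is harmless in the original loops.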
@Isaac there's no need for quoting ${!in[@]}, because unset IFS resets it to space, tab, newline, and quoting those variables wouldn't suffice anyway, because the $(...) command substitution is itself split with IFS.
– mosvy
Nov 18 at 4:20
No, the strings are still copied from the in to the out array by their index. The ${//} substituted ones are built just for the sake of sort.
– mosvy
2 days ago
Cleaned comments. Now it breaks if in contains something like "testing * here" and shopt -s nullglob (and/or some others) get set at the script before the for loop. I'll insist: quote your expansions, avoid the pain.
– Isaac
2 days ago
Cannot reproduce. In the second example, the $(...) command substitution sees only the indexes (a list of numbers separated by newlines), because of the cut -d' ' -f1 after the sort. This could be easily demonstrated by a tee /dev/tty at the end of the $(...).
– mosvy
yesterday
Sorry, my bad, I missed the cut.
– Stéphane Chazelas
yesterday
3 votes
In case switching to zsh is an option, a hackish way there (for arrays containing any sequence of bytes):
array=('' blah $'x\ny\nz' $'x\0y' '1 2 3')
sorted_array=( /(e'{reply=("$array[@]")}'nOe'{REPLY=$#REPLY}') )
zsh allows defining sort orders for its glob expansion via glob qualifiers. So here, we're tricking it into doing that for arbitrary arrays by globbing on /, but replacing / with the elements of the array (e'{reply=("$array[@]")}') and then numerically (n) ordering (o, in reverse with uppercase O) the elements based on their length (Oe'{REPLY=$#REPLY}').
Note that it's based on the length in number of characters. For number of bytes, set the locale to C (LC_ALL=C).
Another bash 4.4+ approach (assuming not too big an array):
readarray -td '' sorted_array < <(
perl -l0 -e 'print for sort {length $b <=> length $a} @ARGV
' -- "${array[@]}")
(that's length in bytes).
With older versions of bash, you could always do:
eval "sorted_array=($(
    perl -l0 -e 'for (sort {length $b <=> length $a} @ARGV) {
      '"s/'/'\\\\''/g"'; printf " '\''%s'\''", $_}' -- "${array[@]}"
))"
(which would also work with ksh93, zsh, yash, mksh).
Accepted answer: answered Nov 17 at 20:21 by choroba, edited Nov 17 at 20:29
Second answer (8 votes): answered Nov 17 at 20:36 by Kusalananda, edited 2 days ago
3
Note that-d ''
is in fact-d ''
asbash
can't pass NUL characters to commands, even its builtins. But it does understand-d ''
as meaning delimit on NUL. Note that you need bash 4.4+ for that.
– Stéphane Chazelas
2 days ago
dat edit seems familiar, doesn't it?
– Isaac
2 days ago
@StéphaneChazelas No, it is not''
, it is$''
. And yes, it converts (almost exactly) to''
. But that is a way to comunicate to other readers the actual intent of using a NUL delimiter.
– Isaac
2 days ago
@Isaac Sorry, which edit?
– Kusalananda
2 days ago
1
I should have said, and I diddn't, sorry: No, not at all, just that two .... etc. Having said that to ease my mind: I am still playing with it a little bit, hoping to expand solutions a little more forward. Maybe I will edit my answer (again), and thanks for confirming that the solution is useful as it stands (it looks similar to yours 😛 ).
– Isaac
2 days ago
|
show 4 more comments
3
Note that-d ''
is in fact-d ''
asbash
can't pass NUL characters to commands, even its builtins. But it does understand-d ''
as meaning delimit on NUL. Note that you need bash 4.4+ for that.
– Stéphane Chazelas
2 days ago
dat edit seems familiar, doesn't it?
– Isaac
2 days ago
@StéphaneChazelas No, it is not''
, it is$''
. And yes, it converts (almost exactly) to''
. But that is a way to comunicate to other readers the actual intent of using a NUL delimiter.
– Isaac
2 days ago
@Isaac Sorry, which edit?
– Kusalananda
2 days ago
1
I should have said, and I diddn't, sorry: No, not at all, just that two .... etc. Having said that to ease my mind: I am still playing with it a little bit, hoping to expand solutions a little more forward. Maybe I will edit my answer (again), and thanks for confirming that the solution is useful as it stands (it looks similar to yours 😛 ).
– Isaac
2 days ago
3
3
Note that
-d ''
is in fact -d ''
as bash
can't pass NUL characters to commands, even its builtins. But it does understand -d ''
as meaning delimit on NUL. Note that you need bash 4.4+ for that.– Stéphane Chazelas
2 days ago
Note that
-d ''
is in fact -d ''
as bash
can't pass NUL characters to commands, even its builtins. But it does understand -d ''
as meaning delimit on NUL. Note that you need bash 4.4+ for that.– Stéphane Chazelas
2 days ago
dat edit seems familiar, doesn't it?
– Isaac
2 days ago
dat edit seems familiar, doesn't it?
– Isaac
2 days ago
@StéphaneChazelas No, it is not
''
, it is $''
. And yes, it converts (almost exactly) to ''
. But that is a way to comunicate to other readers the actual intent of using a NUL delimiter.– Isaac
2 days ago
@StéphaneChazelas No, it is not
''
, it is $''
. And yes, it converts (almost exactly) to ''
. But that is a way to comunicate to other readers the actual intent of using a NUL delimiter.– Isaac
2 days ago
@Isaac Sorry, which edit?
– Kusalananda
2 days ago
@Isaac Sorry, which edit?
– Kusalananda
2 days ago
1
1
I should have said, and I diddn't, sorry: No, not at all, just that two .... etc. Having said that to ease my mind: I am still playing with it a little bit, hoping to expand solutions a little more forward. Maybe I will edit my answer (again), and thanks for confirming that the solution is useful as it stands (it looks similar to yours 😛 ).
– Isaac
2 days ago
I should have said, and I diddn't, sorry: No, not at all, just that two .... etc. Having said that to ease my mind: I am still playing with it a little bit, hoping to expand solutions a little more forward. Maybe I will edit my answer (again), and thanks for confirming that the solution is useful as it stands (it looks similar to yours 😛 ).
– Isaac
2 days ago
|
show 4 more comments
up vote
4
down vote
I won't completely repeat what I've already said about sorting in bash, just you can sort within bash, but maybe you shouldn't. Below is a bash-only implementation of an insertion sort, which is O(n2), and so is only tolerable for small arrays. It sorts the array elements in-place by their length, in decreasing order. It does not do a secondary alphabetical sort.
array=(
"tiny string"
"the longest string in the list"
"middle string"
"medium string"
"also a medium string"
"short string"
)
function sort_inplace {
local i j tmp
for ((i=0; i <= ${#array[@]} - 2; i++))
do
for ((j=i + 1; j <= ${#array[@]} - 1; j++))
do
local ivalue jvalue
ivalue=${#array[i]}
jvalue=${#array[j]}
if [[ $ivalue < $jvalue ]]
then
tmp=${array[i]}
array[i]=${array[j]}
array[j]=$tmp
fi
done
done
}
echo Initial:
declare -p array
sort_inplace
echo Sorted:
declare -p array
As evidence that this is a specialized solution, consider the timings of the existing three answers on various size arrays:
# 6 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.018s ## already 4 times slower!
# 1000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.021s ## up to 5 times slower, now!
5000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.019s
# 10000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.006s
Jeff: 0m0.020s
# 99000 elements
Choroba: 0m0.015s
Kusalananda: 0m0.012s
Jeff: 0m0.119s
Choroba and Kusalananda have the right idea: compute the lengths once and use dedicated utilities for sorting and text processing.
edited Nov 18 at 0:34
answered Nov 17 at 23:47
Jeff Schaller
up vote
4
down vote
A hackish (complex) but fast one-line way to sort the array by length (safe for newlines and sparse arrays):
#!/bin/bash
in=(
"tiny string"
"the longest
string also containing
newlines"
"middle string"
"medium string"
"also a medium string"
"short string"
"test * string"
"*"
"?"
"[abc]"
)
readarray -td $'\0' sorted < <(
    for i in "${in[@]}"
    do printf '%s %s\0' "${#i}" "$i";
    done |
    sort -bz -k1,1rn -k2 |
    cut -zd " " -f2-
)
printf '%s\n' "${sorted[@]}"
On one line:
readarray -td $'\0' sorted < <(for i in "${in[@]}";do printf '%s %s\0' "${#i}" "$i"; done | sort -bz -k1,1rn -k2 | cut -zd " " -f2-)
On execution
$ ./script
the longest
string also containing
newlines
also a medium string
medium string
middle string
test * string
short string
tiny string
[abc]
?
*
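To see why this tolerates embedded newlines, it can help to look at the intermediate records. The following is only a sketch: it assumes GNU sort and cut (for -z) and bash 4.4+ (for readarray -d), and the array demo and the helper decorate are made-up names, not part of the answer above.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: the second element contains two newlines.
demo=( "bb" $'a\nb\nc' "z" )

# Each record is "<length> <element>" terminated by a NUL byte, so a newline
# inside an element is ordinary data rather than a record separator.
decorate() {
    local s
    for s in "${demo[@]}"; do
        printf '%s %s\0' "${#s}" "$s"
    done
}

# Show the sorted records with each NUL made visible as "|".
decorate | sort -bz -k1,1rn -k2 | tr '\0' '|'
echo

# The full round trip back into an array.
readarray -td '' sorted < <(decorate | sort -bz -k1,1rn -k2 | cut -zd ' ' -f2-)
declare -p sorted
```

Because bash variables cannot hold a NUL byte, NUL is the one delimiter that can never collide with an element's content.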
edited 2 days ago
answered Nov 18 at 2:39
Isaac
up vote
4
down vote
This also handles array elements with newlines in them; it works by passing through sort only the length and the index of each element. It should work with bash and ksh.
in=(
"tiny string"
"the longest
string also containing
newlines"
"middle string"
"medium string"
"also a medium string"
"short string"
)
out=()
unset IFS
for a in $(for i in ${!in[@]}; do echo ${#in[i]}/$i; done | sort -rn); do
    out+=("${in[${a#*/}]}")
done
for a in "${out[@]}"; do printf '"%s"\n' "$a"; done
If the elements of the same length also have to be sorted lexicographically, the loop could be changed like this:
IFS='
'
for a in $(for i in ${!in[@]}; do printf '%s\n' "$i ${#in[i]} ${in[i]//$IFS/ }"; done | sort -k 2,2nr -k 3 | cut -d' ' -f1); do
    out+=("${in[$a]}")
done
This will also pass the strings to sort (with newlines changed to spaces), but they would still be copied from the source to the destination array by their indexes. In both examples, the $(...) will see only lines containing numbers (and the / character in the first example), so it won't be tripped by globbing characters or spaces in the strings.
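To make that last point concrete, here is a small sketch that prints exactly what crosses the pipeline in the first variant. It is a variation, not the code above: it reads the keys with a while/read loop instead of an unquoted for loop, and the sample array and the names keys, _len and idx are made up for the illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample data, including a newline and a glob character.
in=( "tiny" $'two\nlines' "has * glob" )

# Only "length/index" pairs cross the pipeline; element text never does.
keys=$(for i in "${!in[@]}"; do echo "${#in[i]}/$i"; done | sort -rn)
printf '%s\n' "$keys"

# Rebuild the output array by looking each index up in the source array.
out=()
while IFS=/ read -r _len idx; do
    out+=("${in[idx]}")
done <<< "$keys"
declare -p out
```

Since the stream contains nothing but digits and slashes, the element contents cannot affect word splitting or globbing, whichever way the keys are read back.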
edited yesterday
answered Nov 18 at 1:34
mosvy
@Isaac there's no need for quoting ${!in[@]}, because unset IFS resets it to space, tab, newline, and quoting those variables wouldn't suffice anyway, because the $(...) command substitution is itself split with IFS.
– mosvy
Nov 18 at 4:20
No, the strings are still copied from the in to the out array by their index. The ${//} substituted ones are built just for the sake of sort.
– mosvy
2 days ago
Cleaned comments. Now it breaks if in contains something like "testing * here" and shopt -s nullglob (and/or some others) get set at the script before the for loop. I'll insist: quote your expansions, avoid the pain.
– Isaac
2 days ago
Cannot reproduce. In the second example, the $(...) command substitution sees only the indexes (a list of numbers separated by newlines), because of the cut -d' ' -f1 after the sort. This could be easily demonstrated by a tee /dev/tty at the end of the $(...).
– mosvy
yesterday
Sorry, my bad, I missed the cut.
– Stéphane Chazelas
yesterday
up vote
3
down vote
In case switching to zsh is an option, a hackish way there (for arrays containing any sequence of bytes):
array=('' blah $'x\ny\nz' $'x\0y' '1 2 3')
sorted_array=( /(e'{reply=("$array[@]")}'nOe'{REPLY=$#REPLY}') )
zsh allows defining sort orders for its glob expansion via glob qualifiers. So here, we're tricking it into doing that for arbitrary arrays by globbing on /, but replacing / with the elements of the array (e'{reply=("$array[@]")}') and then numerically (n) ordering, in reverse with uppercase O, the elements based on their length (Oe'{REPLY=$#REPLY}').
Note that it's based on the length in number of characters. For number of bytes, set the locale to C (LC_ALL=C).
Another bash 4.4+ approach (assuming not too big an array):
readarray -td '' sorted_array < <(
perl -l0 -e 'print for sort {length $b <=> length $a} @ARGV
' -- "${array[@]}")
(that's length in bytes).
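The bytes-versus-characters distinction also applies to bash's own ${#var}, which counts characters according to the current locale. A small sketch of how one might get a byte count in pure bash; the helper name byte_length and the sample value are made up for the example:

```shell
#!/usr/bin/env bash
# "é" written as its two UTF-8 bytes (hypothetical sample value).
s=$'\xc3\xa9'

# Length in bytes: force the C locale for the duration of the function.
byte_length() {
    local LC_ALL=C
    printf '%s\n' "${#1}"
}

# This count depends on the locale the script runs in.
echo "characters (locale-dependent): ${#s}"
# This count does not.
echo "bytes: $(byte_length "$s")"
```

Which of the two is the "right" length for sorting depends on what the strings are for, so it is worth deciding explicitly when the data may contain multi-byte characters.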
With older versions of bash, you could always do:
eval "sorted_array=($(
    perl -l0 -e 'for (sort {length $b <=> length $a} @ARGV) {
      '"s/'/'\\\\''/g"'; printf " '\''%s'\''", $_}' -- "${array[@]}"
))"
(which would also work with ksh93, zsh, yash, mksh).
edited yesterday
answered 2 days ago
Stéphane Chazelas
PJ Singh is a new contributor. Be nice, and check out our Code of Conduct.