[GH-ISSUE #544] Add a column that ranks the output entries by size #238

Closed
opened 2026-06-08 11:26:17 +03:00 by zhus · 3 comments

Originally created by @infinity0 on GitHub (Dec 9, 2025).
Original GitHub issue: https://github.com/bootandy/dust/issues/544

The output of dust is structured as the entries appear in the filesystem.

It would be good to have an additional column that ranks the sizes of the output entries, so you can clearly see which are the 1st/2nd/3rd largest entries. Note, I am talking about the default view where both files and directories are displayed, not the file-only view (-F) which is sorted already.

Ranking could be done over (1) all entries, (2) other entries of the same type, i.e. other files or other directories, or (3) other entries at the same depth, e.g. depth 3. I'm not sure which is the best / most useful option, but perhaps implementing all of these could be good.
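
To make the three ranking scopes concrete, here is a minimal Python sketch. It is not dust code: the sample entries, sizes, and the `rank_by` helper are all made up for illustration, and depth is simply approximated by counting `/` in the path.

```python
from collections import defaultdict

# Hypothetical sample entries: (path, size in bytes, is_dir). Not real dust output.
ENTRIES = [
    ("dir1", 900, True),
    ("dir1/biggest", 600, False),
    ("dir1/big", 300, False),
    ("dir2", 700, True),
    ("dir2/biggest", 500, False),
    ("dir2/big", 200, False),
]


def rank_by(entries, key):
    """Rank entries by size (1 = largest) within each group given by key(entry)."""
    groups = defaultdict(list)
    for entry in entries:
        groups[key(entry)].append(entry)
    ranks = {}
    for members in groups.values():
        ordered = sorted(members, key=lambda e: e[1], reverse=True)
        for position, (path, _size, _is_dir) in enumerate(ordered, start=1):
            ranks[path] = position
    return ranks


overall = rank_by(ENTRIES, key=lambda e: None)               # (1) all entries
by_type = rank_by(ENTRIES, key=lambda e: e[2])               # (2) files vs directories
by_depth = rank_by(ENTRIES, key=lambda e: e[0].count("/"))   # (3) same depth

for path, size, _is_dir in ENTRIES:
    print(f"{path:14} {size:5} all={overall[path]} "
          f"type={by_type[path]} depth={by_depth[path]}")
```

In this toy tree scopes (2) and (3) happen to coincide because every file sits at depth 1; on a real filesystem they would generally differ.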

zhus closed this issue 2026-06-08 11:26:17 +03:00

@bootandy commented on GitHub (Jan 7, 2026):

I feel that it is already ranked. Each sub directory is ranked 'at its level'. If you run dust on a pair of folders which both contain large files you'll see:

dir2/big
dir2/biggest
dir2/
dir1/big
dir1/biggest
dir1/

(Assuming largest on bottom).
(Sidenote: whether the largest should be at the top or the bottom has driven me crazy, and it flipped twice in the early versions.)

Here we conclude dir1 is bigger than dir2 (because it is listed nearer the bottom) and that biggest is larger than big. There is however no ordering of dir1/biggest vs dir2/biggest.

On balance I don't think I want to try and add such a column, I think it will be confusing. I feel this case is semi covered by dust -F (files only) and dust -z 10M (only show files over 10M).

Thanks for the thoughts though.


@infinity0 commented on GitHub (Jan 14, 2026):

> There is however no ordering of dir1/biggest vs dir2/biggest.

Yes, this is precisely what I mean. This (option (1) in the OP) should be easy to implement: the kth-ranked item is whatever is added between the output of -n $((k-1)) and -n $k.
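
A rough Python sketch of that definition, under the assumption that -n simply keeps the n largest entries. The `top_n` and `kth_ranked` helpers and the sample sizes are hypothetical and do not call dust:

```python
# Hypothetical sample entries: (path, size in bytes). Not real dust output.
ENTRIES = [
    ("dir1", 900),
    ("dir1/biggest", 600),
    ("dir1/big", 300),
    ("dir2", 700),
    ("dir2/biggest", 500),
    ("dir2/big", 200),
]


def top_n(entries, n):
    """Return the paths of the n largest entries (an assumed model of `-n`)."""
    ordered = sorted(entries, key=lambda e: e[1], reverse=True)
    return [path for path, _size in ordered[:n]]


def kth_ranked(entries, k):
    """Return the entry in the top-k listing that is absent from the top-(k-1) one."""
    added = set(top_n(entries, k)) - set(top_n(entries, k - 1))
    return added.pop() if added else None


for k in range(1, len(ENTRIES) + 1):
    print(k, kth_ranked(ENTRIES, k))
```

This is the same set difference that diffing two consecutive -n listings would show, just computed in one pass per k.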

> On balance I don't think I want to try and add such a column, I think it will be confusing. I feel this case is semi covered by dust -F (files only) and dust -z 10M (only show files over 10M).

Well, the user would have to specifically request the ranking via CLI, so it's on them to deal with any potential confusion. But in fact I think this ranking would help people process and understand the output of -n and why it's superior to other tools.

I think -F and -z are insufficient for exactly the same reasons I described over in #543. As a sysadmin you often care about the 20 largest files/directories (i.e. the output of -n). Perhaps the largest file/directory is important and you want to keep it, but you then want to remove the next largest entry (across all depths, that is). When I visually browse the output of -n it's hard to tell what "the next largest entry" is after a particular entry (across all depths); I basically have to run something like diff <(dust -n 7) <(dust -n 8). Repeatedly doing this for several k gets a bit tedious. It would be easier to pipe dust -n 20 --show-ranks to a pager and then search for rank 7 etc.


@infinity0 commented on GitHub (Jan 14, 2026):

You could achieve a similar effect right now with dust --no-colors -n20 -p -b | sort -h, ignoring the ASCII tree because sorting garbles it. To make this nicer, dust itself could do the sorting via a --sort flag that implies -p -b and hides the tree structure. This could be a less confusing alternative to an explicit ranking column.

edit: Workaround for now to hide the tree:

dust --no-colors -n20 -p -b | sort -h | sed -r 's,^( *\S+) +([^.]+) +([\./].*)$,\1 \3,g'
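
For anyone who prefers a script to the sed expression, the same idea (drop the tree, sort by human-readable size) can be sketched in Python. The line format assumed here is a guess: a size token such as `4.0K`, then tree-drawing characters, then a path starting with `.` or `/`. The sample lines are invented, and paths containing spaces before the first `.` or `/` are not handled.

```python
import re

UNITS = "BKMGTP"
# Assumed line shape: "<size> <tree characters> <path starting with . or />"
LINE = re.compile(r"^\s*(\S+)\s+[^./]*([./].*)$")
SIZE = re.compile(r"^([\d.]+)([BKMGTP]?)$")


def parse_size(text):
    """Convert a token such as '4.0K' or '1.5M' to a number of bytes."""
    match = SIZE.match(text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return float(number) * 1024 ** UNITS.index(unit or "B")


def flatten_and_sort(lines):
    """Strip the tree column and sort ascending by size, like `sort -h`."""
    rows = []
    for line in lines:
        match = LINE.match(line)
        if match:
            size_text, path = match.groups()
            rows.append((parse_size(size_text), size_text, path))
    rows.sort(key=lambda row: row[0])
    return [f"{size_text} {path}" for _bytes, size_text, path in rows]


# Invented sample lines in the assumed format; not captured from a real run.
SAMPLE = [
    "1.5M   ┌── ./dir2/biggest",
    "4.0K   ├── ./dir2/big",
    "1.6M ┌─┴ ./dir2",
    "2.0G ├── ./dir1",
]

for row in flatten_and_sort(SAMPLE):
    print(row)
```

Numbering the rows from the bottom of that output would give the "rank across all entries" discussed earlier in the thread.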

Reference: bootandy/archived-dust#238