[GH-ISSUE #492] Feature request: "offline" tree browsing #215

Closed
opened 2026-06-08 11:26:10 +03:00 by zhus · 4 comments

Originally created by @billsanders-bc on GitHub (May 12, 2025).
Original GitHub issue: https://github.com/bootandy/dust/issues/492

At my day job, we have a build that produces a multi-gigabyte squashfs that, for various reasons, has to stay under a certain size. Occasionally a commit slips through (or a dependency changes) that pushes the size beyond our threshold, very often by only a few megabytes. When this happens, I usually have to download the image along with a "known good" one and compare them. Dust has been fantastic for this. I open two terminals and can see what's different fairly easily, drilling down as needed to find the culprits.

This led me to wonder whether dust (when pointed at a directory) could produce an artifact containing just the file tree with sizes, but without any data, sort of like a tarball of "sparse files"[0]. This artifact could be downloaded quickly and then imported/read by dust on a local system, allowing the comparison I mentioned above. I guess I'm sort of looking for a ".dust" file format. I understand this is an unusual request and possibly out of scope for the project, but I thought it was worth asking in case other people use the tool in a similar way.

[0] - https://en.wikipedia.org/wiki/Sparse_file

zhus closed this issue 2026-06-08 11:26:10 +03:00

@bootandy commented on GitHub (May 16, 2025):

I think I understand what you are thinking: you'd like more performance, and if the underlying data doesn't change we shouldn't have to scan again.

Despite this, I'm not very keen on creating a dust 'cache file'. I think this use case is very niche.

Could you just run:
dust >> top.txt
and then you have a cached version of what the dir should look like in top.txt


@billsanders-bc commented on GitHub (May 16, 2025):

Thanks for taking the time to respond. I've thought about it a bunch more and once I noticed the --output-json argument I realized my proposal above was far more exotic than necessary.

If dust could read from a JSON file which it had produced and display its bar graph (without access to the system the JSON was created on), that would be exactly what I'm looking for.

So some system in CI can just run:

dust -j -o mb -n 1000 ./build/filesystem/path/ > dust-from-buildXYZ.json

So that later I could fetch just that json file and visualize on a different system:

dust --new-json-input-flag dust-from-buildXYZ.json

and see the graph (and compare it to known good yadda yadda).
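To make the idea concrete, here is a minimal Python sketch of what "rendering from a saved snapshot" could look like outside of dust. It is not dust's implementation, and `--new-json-input-flag` above is only a placeholder name. The node layout used here (`name`, a numeric `size`, and a `children` list) is an assumption made for illustration; the JSON that dust actually emits may be shaped differently (for example, sizes may be formatted strings), so a real reader would need to adapt the parsing.

```python
import json


def render(node, total=None, depth=0, width=20, lines=None):
    """Render a size tree as text bars, largest children first.

    Assumed (hypothetical) node layout:
    {"name": str, "size": number, "children": [node, ...]}
    """
    if lines is None:
        lines = []
    if total is None:
        # Bars are scaled relative to the root; avoid dividing by zero.
        total = node["size"] or 1
    filled = round(width * node["size"] / total)
    bar = "#" * filled + "." * (width - filled)
    indent = "  " * depth
    lines.append(f"{node['size']:>8} {bar} {indent}{node['name']}")
    children = sorted(node.get("children", []), key=lambda c: c["size"], reverse=True)
    for child in children:
        render(child, total, depth + 1, width, lines)
    return lines


# Stand-in for a snapshot file fetched from CI (made-up data).
snapshot = json.loads("""
{"name": "rootfs", "size": 100, "children": [
    {"name": "etc", "size": 25, "children": []},
    {"name": "usr", "size": 75, "children": []}
]}
""")

lines = render(snapshot)
print("\n".join(lines))
```

Because the whole tree is in the file, drilling down is just a matter of calling `render` on a subtree, with no access to the original filesystem.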

A single dust visualization isn't sufficient - the filesystem I'm interrogating is basically a Linux / - very jagged in depth and very uneven in distribution. A single call will tell me that eg. /usr/lib is bigger, but not necessarily where that extra space is being consumed, which might require additional calls to drill down.

If this was Python I'd take a stab at it, but I haven't had the opportunity to learn Rust yet :/. Maybe in a few weeks if things slow down a bit I'll see if I can learn enough to pull this feature off. All that said, even in this less exotic form, I do agree it's a somewhat niche request and I'd understand if you didn't want to add the complexity.


@bootandy commented on GitHub (May 17, 2025):

What about hiding the colors and the bars, then you could run a regular diff against the 2 files?

dust -o mb -c -b -n 100 >> master.txt

dust -o mb -c -b -n 100 >> ci_build.txt

diff master.txt ci_build.txt


@billsanders-bc commented on GitHub (Jun 3, 2025):

Oh I missed your follow-up! Thank you for the suggestion, I adapted it a bit and something like this would indeed work for me:

diff <(dust -o mb -c -b -p -n 1000 /mnt/foo/) <(dust -o mb -c -b -p -n 1000 /mnt/toobig/) | sed -n 's/^> //p'

The sed bit basically filters out the "left-hand side" of the diff, so I see only files which are new/bigger (and -p to get full paths to them).
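For readers who want the same filtering outside a shell, the following is a rough Python sketch of what the pipeline does: diff two listings and keep only the lines attributed to the right-hand side. The function name `right_only_lines` and the sample listings are made up for illustration, and `difflib` may align lines slightly differently from `diff` in ambiguous cases, so treat it as an approximation rather than an exact equivalent.

```python
from difflib import SequenceMatcher


def right_only_lines(left, right):
    """Return lines of `right` that a line-based diff marks as added or changed.

    Roughly mirrors `diff left right | sed -n 's/^> //p'`.
    """
    matcher = SequenceMatcher(a=left, b=right, autojunk=False)
    kept = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" are the opcodes that contribute right-hand lines.
        if tag in ("replace", "insert"):
            kept.extend(right[j1:j2])
    return kept


# Made-up listings standing in for the two dust outputs.
known_good = ["10M /usr/lib", "5M /usr/bin"]
too_big = ["12M /usr/lib", "5M /usr/bin", "3M /opt/new"]

for line in right_only_lines(known_good, too_big):
    print(line)
```

Note that, like the shell version, this reports any right-hand line that differs, so an entry whose size shrank would also show up.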

Thanks again for making such a neat little tool. Best wishes.

Reference: bootandy/archived-dust#215