[GH-ISSUE #17] Performance #9

Closed
opened 2026-06-08 11:25:16 +03:00 by zhus · 8 comments

Originally created by @rubdos on GitHub (May 1, 2018).
Original GitHub issue: https://github.com/bootandy/dust/issues/17

Dust is currently about 4 times slower than du.

It would be nice to know why and how it's slower. Did you add benchmarks of some sort?

In a second phase, it's probably easy to optimize it?

zhus closed this issue 2026-06-08 11:25:16 +03:00

@bootandy commented on GitHub (May 1, 2018):

Two things:

  1. Porting this code to use WalkDir slowed it down somewhat. However, using WalkDir was recommended to me because otherwise I risk blowing the stack with recursive calls when walking a filesystem.

  2. Running the Rust port of du (https://github.com/uutils/coreutils/tree/master/src/du) showed that it was several times slower than the original C du.
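To illustrate the stack concern in point 1, here is a minimal sketch of a non-recursive traversal that keeps pending directories in a heap-allocated `Vec` instead of on the call stack. It uses only the standard library and is not how dust or the walkdir crate is implemented; the function name `total_size` is hypothetical, and it sums apparent file sizes rather than disk blocks.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Sum the apparent sizes of all regular files under `root` without recursion.
/// `root` is assumed to be a directory; symlinks are neither followed nor counted.
pub fn total_size(root: &Path) -> io::Result<u64> {
    // Explicit work list: directory depth costs heap memory, not stack frames.
    let mut pending: Vec<PathBuf> = vec![root.to_path_buf()];
    let mut total = 0u64;

    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // `DirEntry::file_type` does not follow symlinks.
            let file_type = entry.file_type()?;
            if file_type.is_dir() {
                pending.push(entry.path());
            } else if file_type.is_file() {
                total += entry.metadata()?.len();
            }
        }
    }
    Ok(total)
}
```

Calling `total_size(Path::new("."))` would return the byte total for the current directory, or the first I/O error encountered (for example, a permission error on a subdirectory).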

I do not know why the Rust port of du is slower than the original du. I do not know enough about how Rust's disk access performance compares with C's, but when I tried to measure it I found Rust du to be notably slower than C du (I ran each several times to try to compensate for disk caches).

I haven't added benchmarks. I would like to but I don't know how to without checking in a huge test directory to github (which would be ugly). Do you have any ideas how to create a good benchmark for a disk utility tool?


@rubdos commented on GitHub (May 4, 2018):

> I haven't added benchmarks. I would like to but I don't know how to without checking in a huge test directory to github (which would be ugly). Do you have any ideas how to create a good benchmark for a disk utility tool?

Ha, I was wondering about that too. I suppose you could create a bunch of files on the fly and fill them with random data. That way, you don't have to check them into git (which would be ugly and wasteful, especially for future contributors).
You could put them straight into /tmp, so as to test only the code itself and not be affected by disk-specific properties... but I'm not sure whether that'd make sense.
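The idea above could be sketched roughly as follows: build a throwaway tree of files filled with pseudo-random bytes, then time whatever is being measured. This is only an assumption about how such a benchmark might look, not something that exists in dust. The names `build_fixture`, `time_it`, and `XorShift` are hypothetical, and the small xorshift generator stands in for a real random source so that the sketch needs no external crate.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, Instant};

/// Tiny xorshift64 generator (illustrative only, not cryptographically secure).
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Create `dirs` directories under `root`, each holding `files_per_dir` files
/// of `file_size` pseudo-random bytes. Returns the total number of bytes written.
pub fn build_fixture(
    root: &Path,
    dirs: usize,
    files_per_dir: usize,
    file_size: usize,
) -> io::Result<u64> {
    let mut rng = XorShift(0x9E37_79B9_7F4A_7C15); // fixed non-zero seed: reproducible tree
    let mut written = 0u64;
    for d in 0..dirs {
        let dir = root.join(format!("dir{}", d));
        fs::create_dir_all(&dir)?;
        for f in 0..files_per_dir {
            let data: Vec<u8> = (0..file_size).map(|_| rng.next_u64() as u8).collect();
            fs::write(dir.join(format!("file{}.bin", f)), &data)?;
            written += data.len() as u64;
        }
    }
    Ok(written)
}

/// Run `work` once and return its result together with the elapsed wall-clock time.
pub fn time_it<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = work();
    (out, start.elapsed())
}
```

A benchmark could build the fixture under `std::env::temp_dir()`, wrap the tool invocation in `time_it`, and delete the tree afterwards. Whether the temporary directory is memory-backed depends on the system, so timings taken there may or may not include real disk behaviour.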


@polyzen commented on GitHub (Oct 9, 2018):

Resolved?


@bootandy commented on GitHub (Oct 10, 2018):

> The rust port of du is slower than the original du.

I think fully resolving this would require a deep dive into the differences between C and Rust. I do not know enough about these languages to be comfortable doing this.

I am content to find that the Rust port of du is slower than the C version of du in a similar way to how dust is slower. Hence I conclude I am not doing something really stupid.


@tamerh commented on GitHub (Jan 27, 2020):

@bootandy thanks for this library. How did you measure the performance you mentioned in the README? I mean, which commands did you run and how many files did you use?


@bootandy commented on GitHub (Feb 3, 2020):

I actually didn't write the bit on performance. That was updated when we added rayon and parallelization.

I can't reproduce and / or prove it.

I think we should remove the section on performance.


@Arcterus commented on GitHub (Jun 12, 2020):

You’ve probably figured this out in the past couple of years, but FYI the Rust version of du is slower than the C version simply because it is rather poorly optimized. I have a bit-rotted WIP PR that is much faster (but, being WIP, it still needs work to function totally correctly). It was even faster than C du in many cases IIRC.


@bootandy commented on GitHub (Jun 15, 2020):

https://github.com/uutils/coreutils/pull/1379
Was it this PR?

Reference: bootandy/archived-dust#9