mirror of
https://github.com/bootandy/dust.git
synced 2026-06-08 11:29:05 +03:00
[PR #575] perf(walk): improvements to the walker hot path #560
📋 Pull Request Information
Original PR: https://github.com/bootandy/dust/pull/575
Author: @arcuru
Created: 5/12/2026
Status: 🔄 Open
Base: master ← Head: walker-perf

📝 Commits (6)
aa433ae perf(walk): flatten recursion via rayon::scope, drop -S stack-size knob
ecdc8ef perf(walk): cache per-dir stat to eliminate duplicate syscall
24aff08 perf(walk): cache entry path and file_type in process_entry
242f722 perf(walk): trim filter-path overhead and skip ignore_file when off
44c4c3f perf(walk): swap FxHash for inode dedup HashSet
81cf4eb perf(walk): skip per-file statx in --filecount mode

📊 Changes
15 files changed (+826 additions, -478 deletions)
📝 Cargo.lock (+16 -141)
📝 Cargo.toml (+1 -1)
📝 README.md (+1 -1)
📝 completions/_dust (+2 -2)
📝 completions/_dust.ps1 (+2 -2)
📝 completions/dust.elv (+2 -2)
📝 completions/dust.fish (+1 -1)
📝 man-page/dust.1 (+1 -1)
📝 src/cli.rs (+2 -3)
📝 src/config.rs (+0 -9)
📝 src/dir_walker.rs (+584 -140)
📝 src/main.rs (+27 -52)
📝 src/node.rs (+62 -43)
📝 src/platform.rs (+98 -80)
📝 tests/tests_symlinks.rs (+27 -0)

📄 Description
More changes as a stack on top of my previous #574. Ideally this would be stacked commits but right now this will show the prev commit inside this PR...
Happy to split/rebase as requested, though I am also fine if you pull/modify yourself.
Five small reductions on the per-entry walker hot path. None of them change observable behavior; they just remove redundant work. They compound for noticeable wins on tree-heavy and wide-directory walks.
Per-dir stat cache. `walk_dir()` was statting each directory for its `is_dir()` check, then `build_node()` statted it again to populate the Node. Cache the parsed metadata tuple in `PendingDir` via `OnceLock`. One `statx` saved per directory.

Cache `entry.path()` / `entry.file_type()`. Both were called 2-3× per entry, and `entry.path()` allocates a fresh `PathBuf` each call. Compute once in `process_entry` and thread through. Also gate `is_ignored_path` on a cheap empty-check so walks without `--ignore-directory` skip the HashSet probe entirely.

Skip `ignore_file` on default walks. Precompute `has_any_filter` when constructing `WalkData`. In the hot loop, three-way branch: full `ignore_file` when filters are active, inlined dot-file check when only `--ignore-hidden` is on, otherwise skip. When filters are active, also coalesce the duplicate `get_metadata` calls inside `ignore_file`, replace `path.is_file()` stats with the cached `file_type`, hoist `fs::canonicalize` out of the `ignore_directories` loop, and thread the prefetched tuple into `build_node` so each entry is statted at most once on filter-active walks.

FxHash for inode dedup. `clean_inodes` hashes 25M `(inode, dev)` pairs on my test `/nix/store`. SipHash's DoS resistance is pointless for primitive keys from our own syscalls. Adds `rustc-hash = "2"`, swaps `HashSet` for `FxHashSet`. This can be inlined to avoid the dep by defining a simple hasher.

Skip per-file `statx` in `--filecount` mode. Under `-f`, every file's size is overwritten with 1, so the per-file `statx` is removable. The only field the dedup still needs is `(inode, dev)`: inode comes from `DirEntry::ino()` (filled by `getdents64`), dev from the parent directory's cached tuple. Gated to `-f` without `-L` and without metadata-needing filters, so output is byte-identical to the previous stat path. Unix-only; Windows `DirEntry` has no cheap `d_ino` analogue.

Benchmarks

Cumulative for this batch of changes. "Before" is inclusive of #574.

host: 24 CPU; hyperfine `--warmup 2 --runs 10` for synthetics, `--warmup 1 --runs 3-5` for large targets

Benchmarked workloads:
- `-e "\.rs$"` regex filter on dust src
- `-x` dust src
- `-f` balanced
- `-f` wide_flat
- `-f` /nix/store

Default-walk synthetics (`balanced`, `wide_flat`, `~`) are tighter on this run. Those workloads are already close to the syscall floor; the per-entry CPU savings show up in user time more than wall time. I also sampled the system times for these runs, and they are more dramatic where they apply: `-e` -75% sys, `-x` -41% sys, `-f /nix/store` -61% sys. strace confirms `-f wide_flat` goes from ~100k `statx` calls to ~10.

Further Work

One more potential perf change stacked on top of this:

`AT_STATX_DONT_SYNC` on Linux. Biggest win on network filesystems (~3.8× on an 11 TB NFS mount in my testing); helps cold-cache local in smaller amounts. dust's default `std::fs::Metadata` uses `AT_STATX_SYNC_AS_STAT`, which for a read-only walker is stricter coherence than we need.

That is a small behavior change though, which may or may not be acceptable.
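For readers who want a concrete picture of the per-dir stat cache from the description, here is a minimal sketch using only the standard library. It is not the code in this PR: the `PendingDir` fields, the `StatTuple` shape, and the `stat_calls` counter are all assumptions made for illustration. The counter exists only so the single-stat behavior can be observed.

```rust
use std::fs;
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical reduced metadata tuple: (size in bytes, is_dir).
type StatTuple = (u64, bool);

/// Hypothetical stand-in for a pending directory in the walker queue.
struct PendingDir {
    path: PathBuf,
    /// Filled on first use, then shared by every later caller.
    stat: OnceLock<Option<StatTuple>>,
    /// Illustration only: counts how often the filesystem is actually hit.
    stat_calls: AtomicUsize,
}

impl PendingDir {
    fn new(path: impl Into<PathBuf>) -> Self {
        Self {
            path: path.into(),
            stat: OnceLock::new(),
            stat_calls: AtomicUsize::new(0),
        }
    }

    /// Stats the path on the first call; later calls return the cached tuple.
    fn stat(&self) -> Option<StatTuple> {
        *self.stat.get_or_init(|| {
            self.stat_calls.fetch_add(1, Ordering::Relaxed);
            fs::symlink_metadata(&self.path)
                .ok()
                .map(|m| (m.len(), m.is_dir()))
        })
    }
}

fn main() {
    let dir = PendingDir::new(".");
    // First consumer: an is_dir()-style check.
    let is_dir = dir.stat().map(|(_, d)| d).unwrap_or(false);
    // Second consumer: node construction reuses the cached tuple.
    let size = dir.stat().map(|(s, _)| s).unwrap_or(0);
    println!(
        "is_dir={} size={} stat_calls={}",
        is_dir,
        size,
        dir.stat_calls.load(Ordering::Relaxed)
    );
}
```

`OnceLock` keeps the cache safe to share across rayon worker threads without a mutex on the read path, which is presumably why it suits a parallel walker.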
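The three-way branch for `ignore_file` can be sketched as a mode that is computed once and then matched in the hot loop. Everything here (`FilterMode`, `WalkFilters`, `excluded_names`, `should_skip`) is a hypothetical simplification; the real filters in dust also include regexes and directory lists.

```rust
/// Hypothetical filter mode, decided once when the walk is set up.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FilterMode {
    /// No filters at all: the hot loop does no filter work.
    None,
    /// Only hidden-file filtering: a cheap inlined dot check.
    HiddenOnly,
    /// Any other filter is active: run the full check.
    Full,
}

/// Hypothetical, heavily reduced filter configuration.
struct WalkFilters {
    ignore_hidden: bool,
    excluded_names: Vec<String>,
}

impl WalkFilters {
    /// Precompute the mode so the per-entry path only does a match.
    fn mode(&self) -> FilterMode {
        if !self.excluded_names.is_empty() {
            FilterMode::Full
        } else if self.ignore_hidden {
            FilterMode::HiddenOnly
        } else {
            FilterMode::None
        }
    }

    /// The expensive path, only reached in `FilterMode::Full`.
    fn full_check(&self, name: &str) -> bool {
        (self.ignore_hidden && name.starts_with('.'))
            || self.excluded_names.iter().any(|n| n == name)
    }

    /// Per-entry decision: true means the entry is skipped.
    fn should_skip(&self, mode: FilterMode, name: &str) -> bool {
        match mode {
            FilterMode::None => false,
            FilterMode::HiddenOnly => name.starts_with('.'),
            FilterMode::Full => self.full_check(name),
        }
    }
}

fn main() {
    let filters = WalkFilters {
        ignore_hidden: true,
        excluded_names: Vec::new(),
    };
    let mode = filters.mode(); // computed once, outside the loop
    for name in [".git", "src", "Cargo.toml"] {
        println!("{} skipped={}", name, filters.should_skip(mode, name));
    }
}
```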
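On the note that the `rustc-hash` dependency could be avoided by defining a simple hasher: one possible shape is below. This is an Fx-style multiply-rotate hasher written for illustration, not the `rustc-hash` implementation and not code from this PR. The names `InodeHasher`, `InodeSet`, and `count_unique`, and the multiplier constant, are assumptions; any odd constant with well-mixed bits should serve.

```rust
use std::collections::HashSet;
use std::hash::{BuildHasherDefault, Hasher};

/// Assumed odd 64-bit multiplier, chosen for illustration.
const K: u64 = 0x517c_c1b7_2722_0a95;

/// Hypothetical non-cryptographic hasher for small integer keys.
#[derive(Default)]
struct InodeHasher(u64);

impl Hasher for InodeHasher {
    /// Generic fallback: fold in one byte at a time.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.write_u64(u64::from(b));
        }
    }

    /// Fast path used when hashing `u64` fields of the key tuple.
    fn write_u64(&mut self, v: u64) {
        self.0 = (self.0.rotate_left(5) ^ v).wrapping_mul(K);
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Set of (inode, dev) pairs using the hasher above instead of SipHash.
type InodeSet = HashSet<(u64, u64), BuildHasherDefault<InodeHasher>>;

/// Counts distinct (inode, dev) pairs, as a dedup pass over hard links would.
fn count_unique(pairs: &[(u64, u64)]) -> usize {
    let mut seen = InodeSet::with_capacity_and_hasher(pairs.len(), Default::default());
    pairs.iter().filter(|p| seen.insert(**p)).count()
}

fn main() {
    // Two entries share inode 10 on device 1, as hard links would.
    let pairs = [(10, 1), (11, 1), (10, 1), (10, 2)];
    println!("unique = {}", count_unique(&pairs));
}
```

The trade-off is the one the description already states: no resistance to crafted collisions, which only matters when keys come from an untrusted source.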
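The `--filecount` shortcut can also be illustrated with the standard library alone. The sketch below is Unix-only and is not the PR's implementation: `file_ids` is a hypothetical helper, and it assumes that every non-directory entry lives on the same device as its parent directory, which is the same assumption the description makes when it takes dev from the parent's cached tuple.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::{DirEntryExt, MetadataExt};
use std::path::Path;

/// Returns (inode, dev) for each non-directory entry in `dir`.
/// The directory is statted once; the files are not statted by this code.
fn file_ids(dir: &Path) -> io::Result<Vec<(u64, u64)>> {
    // One stat for the parent; its device id is reused for every child.
    let dev = fs::metadata(dir)?.dev();
    let mut ids = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // On Linux the file type usually comes from the directory entry
        // itself; some filesystems may force a fallback stat here.
        if entry.file_type()?.is_dir() {
            continue;
        }
        // The inode number is taken from the directory entry, not from stat.
        ids.push((entry.ino(), dev));
    }
    Ok(ids)
}

fn main() -> io::Result<()> {
    let ids = file_ids(Path::new("."))?;
    println!("{} non-directory entries", ids.len());
    Ok(())
}
```

Each returned pair can feed the same inode dedup set as before, with every file counted as size 1.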
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.