[PR #487] [MERGED] Fix miscalculation of NTFS mount file sizes inside WSL #510

Closed
opened 2026-06-08 11:28:57 +03:00 by zhus · 0 comments

📋 Pull Request Information

Original PR: https://github.com/bootandy/dust/pull/487
Author: @frendsick
Created: 4/19/2025
Status: Merged
Merged: 4/20/2025
Merged by: @bootandy

Base: master ← Head: fix/wsl-ntfs-calculate-allocated-blocks


📝 Commits (6)

  • 7102f12 fix: Limit file size based on the file system I/O block size
  • 1b618c0 fix: Take possible file pre-allocation into account
  • de7eba1 refactor: Reduce indenting with early return
  • 07546e6 refactor: Fix clippy::manual_div_ceil
  • 1b92853 fix: Use target_size instead of max_size
  • 33ffc2b fix: Take possible pre-allocation for a file into account

📊 Changes

1 file changed (+26 additions, -9 deletions)

📝 src/platform.rs (+26 -9)

📄 Description

On an NTFS mount inside WSL, the std::fs::Metadata struct sometimes reports an incorrect number of blocks allocated to a file. When that happens, the block count can be absurdly large and unrelated to the file's real size.
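To see the mismatch on a given file, it can help to print the raw values the metadata reports. The sketch below is only an illustration, not code from this PR: `raw_sizes` is a hypothetical helper name, and it assumes a Unix-like target (such as WSL) where `std::os::unix::fs::MetadataExt` is available.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical helper (not part of dust): returns the logical length in
/// bytes, the reported block count, and the preferred I/O block size.
fn raw_sizes(path: &Path) -> io::Result<(u64, u64, u64)> {
    let md = fs::metadata(path)?;
    // `len()` is the logical file size; `blocks()` and `blksize()` come from
    // the Unix-specific `MetadataExt` trait.
    Ok((md.len(), md.blocks(), md.blksize()))
}
```

On an affected file, the block count would be far out of proportion to the length.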

The proposed fix uses md.blksize() to determine how much space could plausibly be allocated to a file. As far as I know, a file should have at most one extra operating system block or page allocated for it, typically an extra 4 KB. By default, the fixed code uses the same calculation as before (reported_size = md.blocks() * get_block_size()), but it uses max_size instead whenever max_size is smaller than reported_size. That should happen only when the metadata reports too many blocks, or when the file system pre-allocates more space for a file than the OS page size would require.
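The clamping idea described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code merged in src/platform.rs: `clamp_allocated_size` is a hypothetical name, the 512-byte unit stands in for whatever get_block_size() returns, and the upper bound is taken to be the file length rounded up to the next multiple of the I/O block size.

```rust
/// Hypothetical helper (not the PR's actual code): cap the block-derived size
/// at the file's logical length rounded up to the I/O block size.
fn clamp_allocated_size(len: u64, blocks: u64, blksize: u64) -> u64 {
    // Assumption: the block count is expressed in 512-byte units.
    const BLOCK_UNIT: u64 = 512;

    // Size as previously calculated; saturate instead of overflowing when the
    // reported block count is nonsensically large.
    let reported_size = blocks.saturating_mul(BLOCK_UNIT);

    // Without a usable block size there is nothing to clamp against.
    if blksize == 0 {
        return reported_size;
    }

    // Largest plausible allocation: the length rounded up to a whole block.
    let max_size = len.div_ceil(blksize).saturating_mul(blksize);

    // Prefer the reported size unless it exceeds the plausible maximum.
    reported_size.min(max_size)
}
```

Taking the minimum keeps sparse files accurate, because their reported size is already below the bound, while a bogus block count is reduced to at most one extra block beyond the file's length.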

For example, the current version of dust calculates that my C:\Users folder contains 9057 PB of data, whereas the proposed fix reports 113 GB, which is correct or at least very close to it.

$ dust /mnt/c/Users/ -b
...
9057P ┌─┴ Users
$ cargo run --release -- /mnt/c/Users/ -b
...
113G ┌─┴ Users

fixes #295


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

zhus added the pull-request label 2026-06-08 11:28:57 +03:00
zhus closed this issue 2026-06-08 11:28:58 +03:00
Reference: bootandy/archived-dust#510