[GH-ISSUE #387] PROPFIND method support infinity depth #202

Closed
opened 2026-04-08 16:51:08 +03:00 by zhus · 6 comments
Owner

Originally created by @xiaozhuai on GitHub (May 16, 2024).
Original GitHub issue: https://github.com/sigoden/dufs/issues/387

Specific Demand

The PROPFIND method accepts a `Depth` request header with the values `0`, `1`, or `infinity`:

  • 0: retrieve properties of the directory itself
  • 1: as 0, plus properties of all files directly inside the directory
  • infinity: as 1, plus properties of all files in the directory's sub-directories, recursively

Please support infinity depth.
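For context, a PROPFIND response is a 207 Multistatus XML body in the `DAV:` namespace. A minimal sketch of parsing one in Python follows; the response body, paths, and property values are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative 207 Multistatus body, roughly what a
# Depth: 1 PROPFIND on /cache/ might return (entries are made up).
MULTISTATUS = """<?xml version="1.0" encoding="utf-8"?>
<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/cache/</D:href>
    <D:propstat>
      <D:prop><D:resourcetype><D:collection/></D:resourcetype></D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
  <D:response>
    <D:href>/cache/a.bin</D:href>
    <D:propstat>
      <D:prop><D:getcontentlength>1024</D:getcontentlength></D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>"""

def list_hrefs(body: str) -> list[str]:
    """Extract the href of every <D:response> element."""
    root = ET.fromstring(body)
    ns = {"D": "DAV:"}
    return [r.findtext("D:href", namespaces=ns)
            for r in root.findall("D:response", ns)]

print(list_hrefs(MULTISTATUS))  # ['/cache/', '/cache/a.bin']
```

With `Depth: infinity`, the same multistatus format would simply contain one `<D:response>` per resource in the whole subtree.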

zhus closed this issue 2026-04-08 16:51:08 +03:00

@sigoden commented on GitHub (May 16, 2024):

Are there any common usage scenarios for this feature?
dufs is a lightweight web server and does not promise to support the full WebDAV specification (for example, it does not support LOCK/UNLOCK).
This feature can have a significant impact on server performance.
We are not prepared to support it unless it is necessary. I hope you can understand.


@xiaozhuai commented on GitHub (May 16, 2024):

> Are there any common usage scenarios for this feature?

I use dufs as a cache server for vcpkg. But as time goes by, more and more files are cached, taking up more and more space. So I need to scan all cached files regularly and remove the expired ones.
This requires being able to list all files in a directory recursively.

Of course, I could PROPFIND one directory at a time and recurse until I have all the files.
But the problem is that this produces a large number of requests between the client and the server.
If infinity depth were supported, the same result could be obtained in a single request.
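To put the round-trip cost in concrete terms, here is a small sketch over an invented directory tree: crawling with `Depth: 1` costs one PROPFIND per directory, whereas a single `Depth: infinity` request would cover the whole tree.

```python
# Hypothetical directory tree: dict values are sub-directories,
# None values are files. Names are made up for illustration.
TREE = {
    "cache": {
        "abc": {"a.bin": None, "b.bin": None},
        "def": {"c.bin": None, "ghi": {"d.bin": None}},
    }
}

def count_depth1_requests(node: dict) -> int:
    """One Depth: 1 PROPFIND per directory, recursing into sub-directories."""
    requests = 1  # the PROPFIND for this directory itself
    for child in node.values():
        if isinstance(child, dict):  # sub-directory -> extra round trips
            requests += count_depth1_requests(child)
    return requests

print(count_depth1_requests(TREE["cache"]))  # 4 directories -> 4 requests
# With Depth: infinity, the same listing would cost exactly 1 request.
```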

> This feature can have a significant impact on server performance.
> We are not prepared to support if it is not necessary.

I understand. I wonder if we could put this behind an optional allow-propfind-infinity flag?
This feature is genuinely important for those who need to list files recursively.


@sigoden commented on GitHub (May 16, 2024):

Why I do not support this feature

  1. PROPFIND does not support streaming output. All results are buffered in memory before being sent. Won't memory usage explode?
  2. Dufs is stateless, unlike services backed by a database. To generate a list of all files, dufs has to traverse all the files recursively. What about the performance?
  3. A configuration item has a learning cost, which increases the burden on users who do not need it at all.
  4. The usage scenario is niche.

Suggestion

You can write a Python script that generates a list of the files in all subdirectories of a directory and exports it as manifest.json. Set up an appropriate crontab task according to your needs. Create a special directory in dufs to serve the manifest.json, and all problems will be solved.

Ask GPT to write the script:

import os
import json

def generate_file_list(directory):
  """
  Generates a list of files in all subdirectories of a given directory
  and exports it in JSON format.

  Args:
    directory: The path to the directory to list files in.

  Returns:
    A JSON string containing the file list.
  """
  file_list = {}
  for root, dirs, files in os.walk(directory):
    for file in files:
      file_path = os.path.join(root, file)
      relative_path = os.path.relpath(file_path, directory)
      file_list[relative_path] = {
        "size": os.path.getsize(file_path),
        "modified": os.path.getmtime(file_path)
      }
  return json.dumps(file_list, indent=2)

if __name__ == "__main__":
  directory_path = "/path/to/your/directory"  # Replace this with your directory path
  json_output = generate_file_list(directory_path)
  print(json_output)
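A sketch of the consumer side of this workaround, assuming the manifest format produced by generate_file_list above; the sample entries, timestamps, and the 500 000-second cutoff are invented for illustration:

```python
import json

def expired_files(manifest_json: str, max_age_seconds: float, now: float) -> list[str]:
    """Return paths whose 'modified' timestamp is older than max_age_seconds.

    Expects the manifest format {relative_path: {"size": ..., "modified": ...}}.
    """
    manifest = json.loads(manifest_json)
    return [path for path, meta in manifest.items()
            if now - meta["modified"] > max_age_seconds]

# Illustrative manifest with made-up entries and epoch-style timestamps.
sample = json.dumps({
    "old/a.bin": {"size": 10, "modified": 0},
    "new/b.bin": {"size": 20, "modified": 1_000_000},
})
print(expired_files(sample, max_age_seconds=500_000, now=1_000_000))  # ['old/a.bin']
```

A cron job could fetch manifest.json from dufs, run a check like this, and issue a DELETE for each expired path.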

@xiaozhuai


@xiaozhuai commented on GitHub (May 16, 2024):

I switched to nginx; it also doesn't support PROPFIND infinity.
And I've updated my script; it now lists files in parallel.
The total file count is 1435.
Here is a benchmark result with the parallelism set to 1, 4, 8, and 32.

Under dufs, increasing the parallelism does not give good gains, but under nginx it did.

And when the parallelism reaches a certain value (>= 8), dufs's performance drops sharply.

For nginx, although there is no significant benefit once the parallelism exceeds 8, there is no performance degradation either.

nginx

[benchmark screenshot](https://github.com/sigoden/dufs/assets/4773701/a07b685d-d996-407d-a23d-6ad74982c0ee)

dufs

[benchmark screenshot](https://github.com/sigoden/dufs/assets/4773701/032d686d-71d8-4629-a95a-7c7976a09955)

@xiaozhuai commented on GitHub (May 16, 2024):

Here is the code snippet I use to list files recursively (this.client is a WebDAV client; Throttle.all caps the number of in-flight requests):

    async readdir(dir) {
        const files = await this.client.getDirectoryContents(dir); // Perform PROPFIND request with depth = 1
        const subFiles = await Throttle.all(files
            .filter(file => file.type === 'directory')
            .map(file => async () => {
                return await this.readdir(file.filename);
            }), {maxInProgress: 8}); // parallel num
        return files.concat(...subFiles);
    }
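The same throttled recursion can be sketched in Python with an asyncio semaphore; the fake in-memory client below stands in for a real WebDAV client, and the tree is invented for illustration:

```python
import asyncio

# Stand-in for a WebDAV client: each lookup represents one
# Depth: 1 PROPFIND. Paths and file names are made up.
FAKE_FS = {
    "/": [("a", "directory"), ("x.bin", "file")],
    "/a": [("y.bin", "file"), ("b", "directory")],
    "/a/b": [("z.bin", "file")],
}

async def readdir(path: str, sem: asyncio.Semaphore) -> list[str]:
    async with sem:  # cap concurrent "requests", like maxInProgress above
        entries = FAKE_FS[path]
    files = [f"{path.rstrip('/')}/{name}" for name, _kind in entries]
    subdirs = [f"{path.rstrip('/')}/{name}"
               for name, kind in entries if kind == "directory"]
    nested = await asyncio.gather(*(readdir(d, sem) for d in subdirs))
    for sub in nested:
        files.extend(sub)
    return files

result = asyncio.run(readdir("/", asyncio.Semaphore(8)))
print(sorted(result))
```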

@sigoden commented on GitHub (May 16, 2024):

I will research the performance issues later.

We do not currently plan to support infinite depth, so we are closing this issue.


Reference: sigoden/dufs#202