[Feature Request]: Virtual file system #1184

Open
1 task done
JinHai-CN opened this issue May 7, 2024 · 2 comments
Labels
feature request New feature or request

Comments

@JinHai-CN
Contributor

Is there an existing issue for the same feature request?

  • I have checked the existing issues.

Is your feature request related to a problem?

No response

Describe the feature you'd like

Infinity's internal data consists of segments and blocks, where each block is made up of a number of block columns. In the current implementation, each block column is persisted as a separate file on disk, regardless of its size. This can result in a large number of files for a single table. This feature request aims to solve that problem: Infinity would be served by a virtual file system in which several block column files actually live in a single physical file. This avoids creating a large number of files and reduces the risk of a 'too many open files' error.

Describe implementation you've considered

No response

Documentation, adoption, use case

No response

Additional information

No response

@JinHai-CN JinHai-CN added the feature request New feature or request label May 7, 2024
@JinHai-CN JinHai-CN mentioned this issue May 7, 2024
33 tasks
@JinHai-CN
Contributor Author

The goal of the virtual file system (VFS) is to provide a virtual layer through which each generated block column, index file, delete file, etc. is stored. Through this layer, Infinity can be connected to the local file system as well as to storage systems such as S3.

Therefore, the virtual file system needs to provide the following interfaces:
Open/Read/Write/Seek/Truncate/Close.
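
As a rough illustration of the interface listed above, here is a minimal Python sketch. The class names (`VirtualFile`, `VirtualFileSystem`, `MemoryFile`, `MemoryFileSystem`) and the method signatures are assumptions made for this example, not Infinity's actual API; the in-memory backend only exists to show that a caller does not need to know where the bytes live.

```python
from abc import ABC, abstractmethod


class VirtualFile(ABC):
    """Hypothetical file handle exposing Read/Write/Seek/Truncate/Close."""

    @abstractmethod
    def read(self, size: int) -> bytes: ...

    @abstractmethod
    def write(self, data: bytes) -> int: ...

    @abstractmethod
    def seek(self, offset: int) -> int: ...

    @abstractmethod
    def truncate(self, size: int) -> None: ...

    @abstractmethod
    def close(self) -> None: ...


class VirtualFileSystem(ABC):
    """Hypothetical entry point exposing Open."""

    @abstractmethod
    def open(self, path: str) -> VirtualFile: ...


class MemoryFile(VirtualFile):
    """Toy backend: the file content is a bytearray shared with the file system."""

    def __init__(self, buf: bytearray):
        self._buf = buf
        self._pos = 0
        self.closed = False

    def read(self, size: int) -> bytes:
        data = bytes(self._buf[self._pos:self._pos + size])
        self._pos += len(data)
        return data

    def write(self, data: bytes) -> int:
        end = self._pos + len(data)
        if end > len(self._buf):
            # Grow the buffer with zero bytes so the slice assignment fits.
            self._buf.extend(b"\0" * (end - len(self._buf)))
        self._buf[self._pos:end] = data
        self._pos = end
        return len(data)

    def seek(self, offset: int) -> int:
        # Absolute offsets only, to keep the sketch short.
        self._pos = offset
        return self._pos

    def truncate(self, size: int) -> None:
        if size < len(self._buf):
            del self._buf[size:]
        else:
            self._buf.extend(b"\0" * (size - len(self._buf)))

    def close(self) -> None:
        self.closed = True


class MemoryFileSystem(VirtualFileSystem):
    def __init__(self):
        self._files = {}  # virtual path -> bytearray

    def open(self, path: str) -> VirtualFile:
        # Creates the file on first open.
        return MemoryFile(self._files.setdefault(path, bytearray()))
```

A local-disk or S3-backed implementation would subclass the same two abstract classes, so code that writes block columns would only depend on `VirtualFileSystem`.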

For the concrete implementation, the VFS needs a metadata store that provides the mapping between physical files and virtual file blocks, and also records which virtual file blocks hold the data of each virtual file. As far as we can see now, metadata is mainly read and written by key, so the metadata store can be a KV store.
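
To make the two mappings concrete, here is a small Python sketch that keeps both in a plain dictionary standing in for the KV store. The key layout (`block/<id>` and `vfile/<path>`), the JSON encoding, and the `VfsMetadata` class are assumptions chosen for illustration; the 64 KB block size is the example value used in this issue.

```python
import json

BLOCK_SIZE = 64 * 1024  # example block size from this issue


class VfsMetadata:
    """Hypothetical metadata layer on top of any key-value store (a dict here)."""

    def __init__(self, kv=None):
        self.kv = {} if kv is None else kv

    def put_block(self, block_id: int, physical_file: str, offset: int) -> None:
        # Mapping 1: virtual file block -> location inside a physical file.
        self.kv[f"block/{block_id}"] = json.dumps(
            {"file": physical_file, "offset": offset}
        )

    def set_file_blocks(self, vpath: str, block_ids: list, size: int) -> None:
        # Mapping 2: virtual file -> ordered list of the blocks holding its data.
        self.kv[f"vfile/{vpath}"] = json.dumps({"blocks": block_ids, "size": size})

    def locate(self, vpath: str, byte_offset: int) -> tuple:
        """Translate an offset in a virtual file to (physical file, physical offset)."""
        meta = json.loads(self.kv[f"vfile/{vpath}"])
        if not 0 <= byte_offset < meta["size"]:
            raise ValueError("offset outside of virtual file")
        block_id = meta["blocks"][byte_offset // BLOCK_SIZE]
        loc = json.loads(self.kv[f"block/{block_id}"])
        return loc["file"], loc["offset"] + byte_offset % BLOCK_SIZE
```

A read at a virtual offset then needs two point lookups, which is the access pattern a KV store handles well.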

Each file block should have a fixed size, for example 64 KB. A physical file does not have a fixed size, but its size should stay within a bounded range, for example between 16 MB and 24 MB.
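
With the example numbers above, the arithmetic works out as in the following Python sketch. The rollover policy in `can_accept` and `is_full` (a physical file counts as full once it reaches the lower bound, and never grows past the upper bound) is only one possible reading of the size range and is an assumption of this example.

```python
BLOCK_SIZE = 64 * 1024               # example block size from this issue
MIN_PHYSICAL = 16 * 1024 * 1024      # example lower bound for a physical file
MAX_PHYSICAL = 24 * 1024 * 1024      # example upper bound for a physical file


def blocks_needed(nbytes: int) -> int:
    """Number of fixed-size blocks required to hold nbytes (ceiling division)."""
    return -(-nbytes // BLOCK_SIZE)


def can_accept(current_blocks: int, incoming_blocks: int) -> bool:
    """Assumed policy: a write fits if the file stays within the upper bound."""
    return (current_blocks + incoming_blocks) * BLOCK_SIZE <= MAX_PHYSICAL


def is_full(current_blocks: int) -> bool:
    """Assumed policy: stop appending once the lower bound is reached."""
    return current_blocks * BLOCK_SIZE >= MIN_PHYSICAL


# A physical file holds between 256 and 384 blocks with these numbers.
print(MIN_PHYSICAL // BLOCK_SIZE, MAX_PHYSICAL // BLOCK_SIZE)
```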

As files are constantly created and deleted, a large number of fragments will accumulate in the physical files and need to be cleaned up. Considering that S3 will be used as the actual storage, the virtual file system should use physical storage in an append-only manner. Fragment merging and cleanup operations should be logged in the database WAL, just like the create/delete/update/write operations of the VFS.
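
The following Python sketch shows one way append-only storage and fragment cleanup could fit together. Everything in it is hypothetical: the `AppendOnlyStore` class, the in-memory `bytearray` physical files, the record shapes in `wal`, and the choice to compact by copying live data into the active file. It is meant to illustrate the idea, not to describe a planned implementation.

```python
class AppendOnlyStore:
    """Toy model: physical files are only appended to or dropped as a whole."""

    def __init__(self):
        self.files = {0: bytearray()}  # physical file id -> content
        self.active = 0                # file currently receiving appends
        self.live = {}                 # block id -> (file id, offset, length)
        self.wal = []                  # stand-in for the database WAL
        self._next_block = 0

    def append(self, data: bytes) -> int:
        buf = self.files[self.active]
        block_id = self._next_block
        self._next_block += 1
        self.live[block_id] = (self.active, len(buf), len(data))
        buf.extend(data)
        self.wal.append(("write", block_id, self.active))
        return block_id

    def delete(self, block_id: int) -> None:
        # Only metadata changes; the bytes stay behind as a fragment.
        del self.live[block_id]
        self.wal.append(("delete", block_id))

    def read(self, block_id: int) -> bytes:
        fid, off, n = self.live[block_id]
        return bytes(self.files[fid][off:off + n])

    def compact(self, file_id: int) -> None:
        """Move live data out of file_id, then drop the whole physical file."""
        if file_id == self.active:
            # Never rewrite in place: roll over to a fresh active file first.
            self.active = max(self.files) + 1
            self.files[self.active] = bytearray()
            self.wal.append(("create_file", self.active))
        for block_id, (fid, off, n) in list(self.live.items()):
            if fid != file_id:
                continue
            dst = self.files[self.active]
            self.live[block_id] = (self.active, len(dst), n)
            dst.extend(self.files[fid][off:off + n])
            self.wal.append(("move", block_id, file_id, self.active))
        del self.files[file_id]
        self.wal.append(("drop_file", file_id))
```

Because every step is recorded in `wal`, a replay after a crash could in principle redo or discard a half-finished compaction in the same way as an ordinary write.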

@JinHai-CN
Contributor Author

Considering the complexity of this feature and the main goal of 0.6.0, we have decided to move this feature out of the 0.6.0 scope for now.
