Add settings local_filesystem_write_method #64055

Open
tomershafir opened this issue May 17, 2024 · 2 comments
Labels
feature, st-discussion (The story requires discussion / research / expert help / design & decomposition before it will be taken)

Comments

@tomershafir
Contributor

tomershafir commented May 17, 2024

Similar to the setting local_filesystem_read_method, add a setting local_filesystem_write_method that lets users choose the write method and trade off latency against throughput.

Describe the solution you'd like

Start with three methods: write, pwrite, and io_uring. Create an interface AsynchronousWriter and provide an implementation for each method. Compose file-based write buffers with a writer instance via a factory method. Add the core setting local_filesystem_write_method.
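
To make the proposal concrete, here is a minimal sketch of how the interface and factory could look. Only the name AsynchronousWriter and the method names write and pwrite come from the proposal. The writeAt signature, the two class names, and createWriter are hypothetical, and this is not ClickHouse's actual code. Both implementations are blocking for simplicity, and io_uring is left out because it needs liburing.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <fcntl.h>   // open(), used by callers of the writers
#include <unistd.h>  // write(), pwrite()

// Hypothetical interface. The name is from the proposal; the signature is an assumption.
class AsynchronousWriter
{
public:
    virtual ~AsynchronousWriter() = default;
    // Writes `size` bytes from `data` at `offset`; returns the number of bytes written.
    virtual size_t writeAt(int fd, const char * data, size_t size, off_t offset) = 0;
};

// write(2) uses the descriptor's current file position, so `offset` is ignored.
class WriteSyscallWriter : public AsynchronousWriter
{
public:
    size_t writeAt(int fd, const char * data, size_t size, off_t /*offset*/) override
    {
        assert(data != nullptr || size == 0);
        size_t done = 0;
        while (done < size)
        {
            ssize_t res = ::write(fd, data + done, size - done);
            if (res < 0)
            {
                if (errno == EINTR)
                    continue;  // interrupted by a signal, retry
                throw std::runtime_error("write failed");
            }
            done += static_cast<size_t>(res);
        }
        return done;
    }
};

// pwrite(2) takes an explicit offset and does not move the file position.
class PWriteSyscallWriter : public AsynchronousWriter
{
public:
    size_t writeAt(int fd, const char * data, size_t size, off_t offset) override
    {
        assert(data != nullptr || size == 0);
        size_t done = 0;
        while (done < size)
        {
            ssize_t res = ::pwrite(fd, data + done, size - done, offset + static_cast<off_t>(done));
            if (res < 0)
            {
                if (errno == EINTR)
                    continue;  // interrupted by a signal, retry
                throw std::runtime_error("pwrite failed");
            }
            done += static_cast<size_t>(res);
        }
        return done;
    }
};

// Hypothetical factory keyed by the value of local_filesystem_write_method.
std::unique_ptr<AsynchronousWriter> createWriter(const std::string & method)
{
    if (method == "write")
        return std::make_unique<WriteSyscallWriter>();
    if (method == "pwrite")
        return std::make_unique<PWriteSyscallWriter>();
    // "io_uring" is intentionally not implemented in this sketch.
    throw std::invalid_argument("Unknown local_filesystem_write_method: " + method);
}
```

A file-based write buffer would hold the returned writer and call writeAt whenever its in-memory buffer is flushed, so the buffer itself would not need to know which method is in use.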

Let me know whether this makes sense and whether I should work on it.

@alexey-milovidov
Member

I doubt it makes sense. Writes happen in much larger and sequential pieces.

alexey-milovidov added the st-discussion label May 20, 2024
@tomershafir
Contributor Author

tomershafir commented May 21, 2024

I think it's more complex.

Let's take a basic candidate for large sequential writes: a single fact table with day- or hour-based partitioning, where clients batch their inserts. Even there, several factors can reduce sequentiality:

  • Concurrent random reads. Writes are not in isolation.
  • Multi column writes.
  • Multi file writes (bin, mrk, idx).
  • Concurrent merges and TTL.
  • Interrupts.
  • Small batches, multiple 1 MiB buffered write() calls, and long-running inserts can increase interleaving and decrease sequentiality.
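
The multi-file and multi-column points above can be illustrated with a toy model. This is a rough sketch under strong assumptions, not a measurement: it treats each buffered write as one request, interleaves the streams round-robin, and ignores the page cache, writeback, and the filesystem allocator, all of which can restore sequentiality in practice. The names Request, interleave, and sequentialFraction are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One write request in the order it is issued (toy model).
struct Request
{
    int file;
    size_t offset;
    size_t size;
};

// Interleaves `files` streams round-robin; each stream issues `chunks`
// sequential writes of `chunk_size` bytes to its own file.
std::vector<Request> interleave(int files, size_t chunks, size_t chunk_size)
{
    assert(files > 0 && chunk_size > 0);
    std::vector<Request> out;
    for (size_t c = 0; c < chunks; ++c)
        for (int f = 0; f < files; ++f)
            out.push_back({f, c * chunk_size, chunk_size});
    return out;
}

// Fraction of requests that directly continue the previous request:
// same file, and the offset starts where the previous request ended.
double sequentialFraction(const std::vector<Request> & reqs)
{
    if (reqs.size() < 2)
        return 1.0;
    size_t sequential = 0;
    for (size_t i = 1; i < reqs.size(); ++i)
    {
        const Request & prev = reqs[i - 1];
        const Request & cur = reqs[i];
        if (cur.file == prev.file && cur.offset == prev.offset + prev.size)
            ++sequential;
    }
    return static_cast<double>(sequential) / static_cast<double>(reqs.size() - 1);
}
```

In this model a single stream of 1 MiB writes is fully sequential, while three streams interleaved round-robin (for example bin, mrk, and idx) never issue two adjacent requests to the same file. How much of that survives to the device depends on the kernel and the hardware.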

Further considerations:

  • More complex schemas with dimension tables and materialized views (MVs).
  • Concurrent multi partition inserts (e.g. when a special partition key is used).
  • async_insert concurrent writes.
  • Large scale cloud native replica set of writers can result in higher RPS.
  • Other apps on the machine.

It depends on the Linux kernel and the storage devices, of course.

Given the above, for io_uring I think it can make sense to unify reads and writes.
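
As a rough illustration of what unifying reads and writes could mean, the sketch below puts both kinds of request into one submission queue. Everything here is an assumption for illustration: IOOp, IORequest, and UnifiedIOQueue are hypothetical names, and the queue is drained synchronously with pread and pwrite as a stand-in backend. A real io_uring backend would presumably turn each request into a submission queue entry, for example with liburing's io_uring_prep_read and io_uring_prep_write, and collect results from the completion queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <fcntl.h>   // open(), used by callers
#include <unistd.h>  // pread(), pwrite()

enum class IOOp { Read, Write };

// One request type for both directions (hypothetical).
struct IORequest
{
    IOOp op;
    int fd;
    char * buf;
    size_t size;
    off_t offset;
    ssize_t result;  // bytes transferred, or -1 on error; set by drain()
};

// Single queue shared by readers and writers (hypothetical).
class UnifiedIOQueue
{
public:
    void submit(IORequest * req)
    {
        assert(req != nullptr);
        pending.push_back(req);
    }

    // Executes all pending requests in submission order and
    // returns how many were completed.
    size_t drain()
    {
        size_t completed = 0;
        while (!pending.empty())
        {
            IORequest * req = pending.front();
            pending.pop_front();
            req->result = (req->op == IOOp::Read)
                ? ::pread(req->fd, req->buf, req->size, req->offset)
                : ::pwrite(req->fd, req->buf, req->size, req->offset);
            ++completed;
        }
        return completed;
    }

private:
    std::deque<IORequest *> pending;
};
```

With one queue, reads and writes are ordered and batched in the same place, which is what would let a backend submit them to the kernel together.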


2 participants