Memory usage exceeds expectations! #20506

Closed
biuboombiuboom opened this issue May 16, 2024 · 0 comments

Comments

@biuboombiuboom (Contributor) commented May 16, 2024

I am using Vector to consume messages from Kafka and write them to ClickHouse.

Each sink has a memory buffer of 400k events and a batch size of 800k events (see the config below).

800k Kafka messages total approximately 700MB, and the converted ClickHouse request body is around 1GB.
However, machine monitoring shows a peak memory usage of 30GB, which is about double the expected usage.
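
For context, here is a rough back-of-envelope lower bound on how much data this topology can hold in memory at once, computed only from the numbers above (three sinks, 400k-event memory buffers, 800k-event batches, ~1GB request bodies, and an initial request concurrency of 4). It is a sketch under those assumptions, not a statement about Vector's internals; per-event overhead in Vector's in-memory representation and allocator behavior are not accounted for here.

```python
# Rough lower-bound estimate of bytes held in memory at peak, using only
# the figures reported in this issue. Assumes in-memory events are at
# least as large as their serialized Kafka form (in practice they are larger).

KAFKA_BYTES = 700 * 1024**2            # ~700 MB for 800k messages
EVENT_BYTES = KAFKA_BYTES / 800_000    # ~0.9 KB per event

SINKS = 3
BUFFER_EVENTS = 400_000                # buffer.max_events per sink
BATCH_EVENTS = 800_000                 # batch.max_events per sink
REQUEST_BODY_BYTES = 1 * 1024**3       # ~1 GB serialized ClickHouse request
CONCURRENCY = 4                        # initial_concurrency per sink

buffered  = SINKS * BUFFER_EVENTS * EVENT_BYTES
batching  = SINKS * BATCH_EVENTS * EVENT_BYTES
in_flight = SINKS * CONCURRENCY * REQUEST_BODY_BYTES   # worst case: 4 full requests per sink

GB = 1024**3
print(f"buffers:             {buffered / GB:.1f} GB")
print(f"batches being built: {batching / GB:.1f} GB")
print(f"in-flight requests:  {in_flight / GB:.1f} GB")
print(f"lower-bound total:   {(buffered + batching + in_flight) / GB:.1f} GB")
```

Under these assumptions the floor already comes out around 15GB, so the observed 30GB peak is roughly 2x that lower bound.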

config

[sources.in]
session_timeout_ms = 120000
type = "kafka"
bootstrap_servers = "#"
group_id = "flume-clickhouse-event"
topics = ["#"]
librdkafka_options = { "auto.offset.reset" = "latest", "fetch.min.bytes" = "10240" }

[transforms.balance]
type = "remap"
inputs = ["in"]
drop_on_error = true
source = '.route_id=random_int(0,3)'

[transforms.route]
type = "route"
inputs = ["balance"]
reroute_unmatched = false
route.first = '.route_id==0'
route.second = '.route_id==1'
route.third = '.route_id==2'


[sinks.out-clickhouse-1]
type = "clickhouse"
inputs = ["route.first"]
compression = "none"
endpoint = "http://localhost:8125"
database = "mytable"
batch.max_events = 800000
batch.max_bytes = 16106127360
batch.timeout_secs = 120
request.timeout_secs = 120
request.adaptive_concurrency.initial_concurrency = 4
# request.adaptive_concurrency.max_concurrency_limit = 4
buffer.type = "memory"
buffer.max_events = 400000
buffer.when_full = "block"

[sinks.out-clickhouse-2]
type = "clickhouse"
inputs = ["route.second"]
compression = "none"
endpoint = "http://localhost:8125"
database = "mytable"

batch.max_events = 800000
batch.max_bytes = 16106127360
batch.timeout_secs = 120

request.timeout_secs = 120
request.adaptive_concurrency.initial_concurrency = 4
# request.adaptive_concurrency.max_concurrency_limit = 4

buffer.type = "memory"
buffer.max_events = 400000
buffer.when_full = "block"

[sinks.out-clickhouse-3]
type = "clickhouse"
inputs = ["route.third"]
compression = "none"
endpoint = "http://localhost:8125"
database = "mytable"
batch.max_events = 800000
batch.max_bytes = 16106127360
batch.timeout_secs = 120
request.timeout_secs = 120
request.adaptive_concurrency.initial_concurrency = 4
buffer.type = "memory"
buffer.max_events = 400000
buffer.when_full = "block"

Memory monitor:

[image: machine memory monitoring graph]
@vectordotdev locked and limited conversation to collaborators May 20, 2024
@jszwedko converted this issue into discussion #20533 May 20, 2024
