Performance Degradation During Load Testing with Resource-Constrained Tasks (Playwright & Fargate) #2755

joshuscurtis · 2024-05-15T15:45:58Z

We are conducting a stepped load test using Artillery to simulate approximately 4,000 concurrent users. The test configuration involves ramping up from 1 to 10 virtual users (VUs) in 10-minute steps, using the largest ECS task configuration.

However, we've noticed that as the number of VUs scales beyond about 4, we start to see performance degradation in the metrics being monitored. We suspect that this degradation is due to resource constraints, as each task has limited resources available.

One potential solution we've considered is to create 4,000 individual tasks, each with 1 vCPU allocated. However, this approach becomes expensive and resource-intensive from an infrastructure perspective. It would also mean we would be unable to ramp up the load. @hassy, is there a workaround for this, or are there any relevant features or capabilities within Artillery that could help address this problem? We would appreciate it if you could highlight them.

We would appreciate any guidance or recommendations from the Artillery team and community on how to best handle this issue. Specifically, we're looking for strategies or techniques to effectively simulate many resource-constrained users without compromising the accuracy of the load test or incurring significant infrastructure costs.

Thank you in advance for your assistance! :)

hassy · 2024-05-15T22:11:19Z

Maintainer

Resource contention inside the same Fargate task can definitely start happening when the number of concurrent VUs running in that task crosses a threshold. The exact number depends on the app you're testing and what the tests themselves are doing.

Are you running your tests on tasks with 16 vCPUs right now? How much memory are you allocating to those tasks? I'd normally expect more than 4 concurrent VUs to run per task, unless the app is unusually memory-heavy or CPU-intensive.

From the cost perspective, if you were to allocate 4k vCPUs and 4k GB of RAM, you'd be looking at: 4000 * ($0.012144 + $0.0013335) = $53.91/hour (numbers from https://aws.amazon.com/fargate/pricing/)
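The estimate above can be reproduced in a few lines. This is a minimal sketch using the two per-hour rates quoted in the reply (a later reply notes they are Fargate Spot rates); the function name is illustrative, and the rates vary by region and change over time, so check the pricing page before relying on them.

```python
def fargate_hourly_cost(vcpus, memory_gb, vcpu_rate, gb_rate):
    """Hourly cost in USD for a fleet, given per-vCPU-hour and per-GB-hour rates."""
    return vcpus * vcpu_rate + memory_gb * gb_rate


# Rates as quoted in the reply above (assumed to be USD per hour).
VCPU_RATE = 0.012144
GB_RATE = 0.0013335

# 4,000 tasks with 1 vCPU and 1 GB of RAM each.
cost = fargate_hourly_cost(4000, 4000, VCPU_RATE, GB_RATE)
print(f"${cost:.2f}/hour")
```

Changing `memory_gb` to 8000 gives the figure for tasks with 2 GB of RAM per vCPU.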

One way to reduce the amount of resources each VU takes is to not set useSeparateBrowserPerVU - in case you're setting it. https://www.artillery.io/docs/reference/engines/playwright#configuration

7 replies

hassy May 16, 2024
Maintainer

Sorry, I was using Fargate Spot numbers in my calculations. Artillery supports running on Fargate Spot; you just need to add the --spot flag. But it's not a great option for longer tests, so your calculations will be more representative of cost. And indeed, a 16 vCPU task requires at least 32 GB of RAM so the total figure is slightly higher. (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#task_size)

Can you describe the ramp you want to achieve? Are you looking to increase the number of concurrent VUs by 10 every 10 minutes until you reach 4000?

joshuscurtis May 17, 2024
Author

No worries!
How long of a test do you think is viable using Fargate Spot?

So what I'd like to be able to do is this:
Task config: 1vCPU & 2GB RAM (with a single VU in each)

Phases:

666 tasks for 10 minutes
1333 tasks for 10 minutes
1999 tasks for 10 minutes
2666 tasks for 10 minutes
3333 tasks for 10 minutes
4000 tasks for 1 minute

But still be able to have a variety of scenarios with different weights etc.

joshuscurtis May 17, 2024
Author

I have an idea that we could use parallel clusters to generate load in this manner:

It's all theory, but I'd have thought a small wrapper around Artillery would allow this to be coordinated. Do you think this would be feasible?

hassy May 17, 2024
Maintainer

Deciding whether to run on Fargate Spot vs regular Fargate mostly comes down to how much unpredictability you can tolerate, for 2 main reasons:

Sometimes there's just not enough spot capacity in a region, so your tests won't start. So sometimes you'll have to wait and try again, or fall back to using regular Fargate. The larger your test is (in terms of requested vCPUs / memory), the more likely you are to run into this.
A spot task can be pre-empted and terminated at any time with a 30s notice. The longer your test is, the more likely that is. Your test would still keep running but with reduced load, depending on how many tasks got terminated.

I think the next step is to figure out how many concurrent VUs a single task can sustain. I'd start with a single 16 vCPU / 32 GB RAM task, and play around with different combinations of arrivalRate and maxVusers to find a value at which monitored metrics are stable. That would give you the total number of tasks you'll need to run. For example if a single task can run 20 concurrent VUs, you'll need 200 of them for 4,000 concurrent VUs at the end of the test.

You can then work backwards to create your ramp definitions. Something like:

666 VUs in the first phase: 666/200 = 3.33, so either 3 or 4 maxVusers (depending on whether 600 or 800 total is closer to what you want)
1333 concurrent VUs in the second phase: 1333/200 = 6.67, so 6 or 7 maxVusers

and all the way up to the last phase where you'll specify 20 maxVusers.
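The working-backwards step can be sketched as a small calculation. This assumes, as in the example above, that 200 tasks are running and that maxVusers applies per task; the helper name is hypothetical and is not part of Artillery.

```python
import math


def max_vusers_options(target_vus, task_count):
    """Return the per-task maxVusers candidates (rounded down and up)
    paired with the total concurrent VUs each candidate yields."""
    exact = target_vus / task_count
    candidates = sorted({math.floor(exact), math.ceil(exact)})
    return [(n, n * task_count) for n in candidates]


TASKS = 200  # assumes one task can sustain 20 concurrent VUs
targets = [666, 1333, 1999, 2666, 3333, 4000]  # concurrency wanted per phase

for target in targets:
    print(target, max_vusers_options(target, TASKS))
```

For the first phase this yields 3 (600 VUs in total) or 4 (800 VUs in total); the final phase divides evenly into 20 per task.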

It's not as precise as your numbers are, but hopefully an approximation like that works. The way Artillery works right now is that all tasks have to be started at the beginning of the test. It does not support adjusting the pool of tasks dynamically. It's something we want to add but no ETA yet.

hassy May 17, 2024
Maintainer

Parallel clusters is an interesting idea and writing a small wrapper should be easy enough. A big drawback though is that you'd get multiple reports back rather than a single merged one.
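As a rough illustration of what such a wrapper might plan, here is a sketch under one possible reading of the idea: each phase launches only the additional tasks needed, as a separate Artillery run that lasts until the end of the test. The function name, the dictionary keys, and the `wave.yml` script name are hypothetical; the sketch only prints the planned commands and does not launch anything. Each wave's test duration would still need to be set in its own script.

```python
def plan_waves(phases):
    """phases: list of (cumulative_task_target, minutes_at_that_level).
    Returns one wave per phase, each adding tasks on top of earlier waves."""
    total_minutes = sum(minutes for _, minutes in phases)
    waves, offset, running = [], 0, 0
    for target, minutes in phases:
        waves.append({
            "start_after_min": offset,              # when to launch this wave
            "task_count": target - running,         # extra tasks this wave adds
            "run_for_min": total_minutes - offset,  # keep running until the end
        })
        running = target
        offset += minutes
    return waves


# The ramp described earlier in the thread.
phases = [(666, 10), (1333, 10), (1999, 10), (2666, 10), (3333, 10), (4000, 1)]
waves = plan_waves(phases)

for wave in waves:
    # wave.yml is a placeholder for a per-wave test script.
    command = f"artillery run-fargate --count {wave['task_count']} wave.yml"
    print(f"t+{wave['start_after_min']}min for {wave['run_for_min']}min: {command}")
```

As noted above, each wave would produce its own report, so the results would need to be merged separately.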
