Number and weight of chrome instances can grow large #54

Open
nathanl opened this issue Apr 28, 2022 · 6 comments
Labels
help wanted Extra attention is needed

Comments

@nathanl
Contributor

nathanl commented Apr 28, 2022

We have had chroxy running continuously on a machine for a long time - weeks or months.
We noticed that the machine was getting low on memory.

Running ps -o pid,user,%mem,rss,lim,command --sort %mem ax | less (on Linux; the equivalent on macOS is ps -o pid,user,%mem,rss,lim,command -me) showed that the processes consuming the most memory were chrome instances.

Restarting chroxy dramatically reduced both the number and weight of the chrome processes.
Before restart, there were 52 chrome processes using 4.4GB of memory.
After restart, there were 11 chrome processes using 666MB of memory.
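For anyone who wants to track these numbers over time, here is a minimal sketch that totals the chrome lines in ps output. The module name ChromeUsage and the choice of columns (rss followed by command) are assumptions for illustration, not part of Chroxy. RSS is treated as KiB, which is how ps reports it on Linux and macOS.

```elixir
defmodule ChromeUsage do
  @moduledoc """
  Hypothetical helper (not part of Chroxy): summarizes the text printed by
  `ps -o rss,command ax`.
  """

  # Returns the number of lines whose command mentions "chrome" and the sum
  # of their RSS values in KiB.
  def summarize(ps_output) do
    ps_output
    |> String.split("\n", trim: true)
    |> Enum.reduce(%{count: 0, rss_kib: 0}, fn line, acc ->
      with [rss, command] <- line |> String.trim() |> String.split(~r/\s+/, parts: 2),
           {kib, ""} <- Integer.parse(rss),
           true <- String.contains?(command, "chrome") do
        %{count: acc.count + 1, rss_kib: acc.rss_kib + kib}
      else
        # Header rows, malformed lines, and non-chrome processes are skipped.
        _ -> acc
      end
    end)
  end
end
```

One way to feed it would be {out, 0} = System.cmd("ps", ["-o", "rss,command", "ax"]) followed by ChromeUsage.summarize(out). Matching on the substring "chrome" is deliberately crude and may also count unrelated processes whose command line contains that word.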

I don't know a way to reproduce this situation and, in our case, it's reasonable to restart chroxy periodically during known periods of low activity. But I thought it was worth reporting.

@nathanl
Contributor Author

nathanl commented May 2, 2022

4 days later, there are 27 chrome processes using 2.03GB of memory.

@nathanl
Contributor Author

nathanl commented May 9, 2022

I'm experimenting with having the application which uses Chroxy (we only have one) do the following on a daily timer:

  • Pause sending work to chroxy
  • Restart the chroxy processes which manage the headless chrome OS processes
  • Resume sending work to chroxy

This seems to work well in local testing, but it hasn't shipped to production yet. The "pause and resume" strategy means that nothing is dropped, even if there is activity in the system at that time.

To support restarting, I modified our private fork of Chroxy as follows:

# in Chroxy.ChromeServer.Supervisor

# Ongoing requests could be dropped; the only way to prevent that is to ensure,
# from outside chroxy, that there are no ongoing requests when the reset
# happens.
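def restart_all do
  @sup
  |> DynamicSupervisor.which_children()
  |> Enum.each(fn
    # Exit each live child with a non-normal reason.
    {_, pid, _, _} when is_pid(pid) -> Process.exit(pid, :memory_management)
    # A child that is already restarting has no pid to signal.
    {_, :restarting, _, _} -> :ok
  end)
end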

and

# in Chroxy.Endpoint

# Note: there is no authentication
post "/restart" do
  Chroxy.ChromeServer.Supervisor.restart_all()
  send_resp(conn, 200, "restarting")
end

In the consuming application, I periodically run something like this:

:ok = ChroxyWorker.pause()

case HTTPoison.request(:post, "http://#{chroxy_host}:#{chroxy_port}/restart") do
  {:ok, %HTTPoison.Response{status_code: 200}} ->
    Logger.debug("Chroxy responded successfully to restart request")

    # Wait for its chrome processes to come back up. Amount of time is imprecise
    # but based on observation.
    Process.sleep(5_000)

  other ->
    # Covers transport errors and non-200 responses alike.
    Logger.warning("Chroxy did not respond successfully to restart request: #{inspect(other)}")
end

:ok = ChroxyWorker.resume()
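The thread does not show how the daily timer itself is implemented. As a rough sketch of one way to drive a pause/restart/resume sequence like the one above, a small GenServer can re-arm itself with Process.send_after. The module name PeriodicRestart and its :interval_ms and :job options are hypothetical, chosen only for this example.

```elixir
defmodule PeriodicRestart do
  @moduledoc """
  Hypothetical scheduler (not part of Chroxy): runs a zero-arity function
  on a fixed interval.
  """
  use GenServer

  # opts: :interval_ms (delay between runs) and :job (zero-arity function).
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    state = %{
      interval_ms: Keyword.fetch!(opts, :interval_ms),
      job: Keyword.fetch!(opts, :job)
    }

    schedule(state.interval_ms)
    {:ok, state}
  end

  @impl true
  def handle_info(:run, state) do
    # Run the job (for example pause, restart, resume), then arm the next tick.
    state.job.()
    schedule(state.interval_ms)
    {:noreply, state}
  end

  defp schedule(interval_ms) do
    Process.send_after(self(), :run, interval_ms)
  end
end
```

A daily run would use interval_ms: :timer.hours(24). Because the next tick is scheduled only after the job returns, runs cannot overlap. If the job raises, the process exits, so in practice it would sit under a supervisor.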

@nathanl
Contributor Author

nathanl commented May 17, 2022

This strategy appears to be working well in our deployed environment to keep the number of processes and the memory usage from growing out of control.

@nathanl
Contributor Author

nathanl commented Jun 13, 2022

Update: even though the "restart by request" succeeded last night, the number and weight of chrome processes were large again today. Either this strategy isn't working after all, or restarting once a day isn't often enough for us.

@nathanl
Contributor Author

nathanl commented Aug 2, 2022

Update: we haven't seen issues lately here, so perhaps the restart is good enough as-is.

holsee added the "help wanted" label Jan 19, 2023
@holsee
Owner

holsee commented Jan 19, 2023

Chrome is an interesting beast when it comes to memory, it's a thirsty boy.

Automatic recycling of chrome instances internally could be a solution: have the way connections and requests are made prefer, say, newer chrome processes in the pool, and kill the oldest on a configurable cycle.

Another option could be to monitor the memory use of each chrome process and, once it hits a threshold, mark it for recycling.
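To make the two ideas above concrete, here is a minimal sketch of a selection policy that combines them. Everything in it is an assumption for illustration: the module name RecyclePolicy, the :id, :age_s and :rss_kib fields, and the :max_age_s and :max_rss_kib options are hypothetical, and Chroxy does not necessarily track this data today.

```elixir
defmodule RecyclePolicy do
  @moduledoc """
  Hypothetical policy (not part of Chroxy): picks pool members to recycle
  by age or by memory use.
  """

  # `instances` is a list of maps with :id, :age_s (seconds since start) and
  # :rss_kib (resident memory in KiB). Returns the ids to recycle.
  def to_recycle(instances, opts) do
    max_age_s = Keyword.fetch!(opts, :max_age_s)
    max_rss_kib = Keyword.fetch!(opts, :max_rss_kib)

    instances
    |> Enum.filter(fn i -> i.age_s > max_age_s or i.rss_kib > max_rss_kib end)
    # Oldest first, so the longest-lived instances are replaced first.
    |> Enum.sort_by(& &1.age_s, :desc)
    |> Enum.map(& &1.id)
  end
end
```

Keeping the policy a pure function separates the decision from the act of killing a process, so the thresholds can be tuned and tested without starting chrome.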

If you're interested in something like this, I could take a look when time allows. Otherwise, upstream PRs are always greatly appreciated!
