I have a private WPT instance in AWS (m4.large) on which I enqueue ~800 tests, twice a day.
Over the past two days (since ~26 Feb) it has started failing after the first ~300 tests are enqueued, with nginx logging an error saying “Too many files open”; this also means the agents cannot submit the test results for those 300 to the server.
I’m investigating nginx config changes, but the server hasn’t been touched (other than one restart this morning to see if that would help), so it’s confusing that this has only just started.
The limits for all nginx processes are:
Max open files
I went through this process to whack the limits up to something ridiculous (soft 250000, hard 300000), and the nginx error changed to “upstream prematurely closed connection while reading response header from upstream”. However, it seems I’m now able to enqueue over 1000 tests.
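For anyone wanting to check whether the raised limits actually reached the running nginx processes, something like the sketch below might help. It assumes a Linux host where `/proc/<pid>/limits` and `/proc/<pid>/fd` are readable (usually that means running as root); the function names are my own, not part of nginx or WPT.

```python
import os
import sys
from typing import Optional, Tuple


def parse_max_open_files(limits_text: str) -> Optional[Tuple[str, str]]:
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits text.

    Values are kept as strings because a limit may read 'unlimited'.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            fields = line[len(label):].split()
            if len(fields) >= 2:
                return fields[0], fields[1]
    return None


def read_limits(pid: int) -> Optional[Tuple[str, str]]:
    """Read the open-file limits of a live process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_max_open_files(fh.read())


def count_open_fds(pid: int) -> Optional[int]:
    """Count descriptors currently open; None if we lack permission."""
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except PermissionError:
        return None


if __name__ == "__main__":
    # PIDs are passed on the command line.
    for arg in sys.argv[1:]:
        pid = int(arg)
        print(pid, "limits:", read_limits(pid), "open fds:", count_open_fds(pid))
```

Saved as, say, `check_limits.py` (a made-up name), it could be run as `sudo python3 check_limits.py $(pgrep nginx)` while the tests are being enqueued, to see how close each worker gets to its soft limit.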
Has anyone else had this issue and found a better fix than my hack?