Inconsistent Timeout Error

I sometimes get this error:
The testing completed but failed.
Timed out waiting for the browser to start.

I cannot reproduce it consistently; it happens pretty rarely.

Would these be good ways to help prevent the error?

  1. Increasing the timeout. How can I do that via the REST API?
  2. Increasing max retries. Is the REST API parameter just “retry”?
  • I see this is in version 2.9. Does the EC2 server AMI use this version?

Or is there a better way to fix/prevent this error?

Thanks.

2.9 or 2.19? 2.9 is pretty ancient.

That usually happens when the browser is newer than what wptdriver supports, and updating to a more recent wptdriver should fix it: https://sites.google.com/a/webpagetest.org/docs/private-instances#TOC-Updating-Test-Agents

Oh, I just stumbled across this link that mentioned retries: https://sites.google.com/a/webpagetest.org/docs/private-instances/releases/webpagetest-2-9. So it is definitely supported then.

I am using the server AMI; it updates the nodes automatically, correct?

Sorry, I thought you had installed 2.9 manually. I don’t think the retry logic is hooked up anymore. The EC2 AMIs track trunk (newer than 2.19) and auto-update the agents (and server).

You can check the agent version by going to http:///getTesters.php (or looking at the PTST/ in the user agent string in any of the requests in the waterfall). The current build is 276.
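For anyone who wants to script the user agent check, here is a minimal Python sketch. It assumes the agent adds a token of the form PTST/ followed by a build number to the user agent string; the sample string and the function name `ptst_build` are made up for illustration and are not part of WebPageTest.

```python
import re
from typing import Optional

# Assumption: the agent version appears in the user agent string as
# "PTST/" followed by a dotted or plain number (for example a build number).
_PTST_PATTERN = re.compile(r"PTST/(\d+(?:\.\d+)*)")


def ptst_build(user_agent: str) -> Optional[str]:
    """Return the version text that follows "PTST/", or None if it is absent."""
    match = _PTST_PATTERN.search(user_agent)
    return match.group(1) if match else None


# Hypothetical user agent string, for illustration only.
sample = (
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
    "Chrome/50.0 Safari/537.36 PTST/276"
)
print(ptst_build(sample))
```

Comparing the returned value against the current build mentioned above would show whether a given agent is out of date.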

If for some reason your server got stuck and hasn’t pulled down the latest agent binary, you can force it by going to http:///cron/hourly.php
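If you would rather hit these two pages from a script than from a browser, something like the following Python sketch might work. The host `wpt.example.com` is a placeholder for your own server, and `server_url` and `fetch` are hypothetical helper names; only the paths getTesters.php and cron/hourly.php come from this thread.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def server_url(base: str, path: str) -> str:
    """Join the server base URL and a page path with exactly one slash."""
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))


def fetch(base: str, path: str, timeout: float = 30.0) -> str:
    """Request a page from the server and return the response body as text."""
    with urlopen(server_url(base, path), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Placeholder host; replace it with the address of your own server.
BASE = "http://wpt.example.com"

# Example calls (commented out because they need a reachable server):
# print(fetch(BASE, "getTesters.php"))   # list the connected agents
# print(fetch(BASE, "cron/hourly.php"))  # trigger the hourly update job
```

The response format of either page is not assumed here; the sketch only returns the raw text so you can inspect it.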

Thank you for the info, all agents look correct: 2.19.0.276
As it is rare, could I increase the test timeout?

EDIT:
I do not know if it is a pattern, but today the error happened on the first test, so the agent had to boot up. Does the test consider startup time as part of the test time?

No, the startup time shouldn’t be included as part of the test time so increasing the timeout wouldn’t help. What size instances are you using? It’s a little scary that it is taking the browser more than 60 seconds to start up the first time. I could update the agent but it would be better to get to the source of the issue: https://github.com/WPO-Foundation/webpagetest/blob/master/agent/wptdriver/web_browser.cc#L229

Server - t2.small
Agents - Used to be m3.medium as default. I changed that to t2.medium this afternoon. I’ve only run a few tests since changing it, so I haven’t seen any timeouts yet.

Note: I switched to t2.medium because m3.medium was consistently doubling the speed index. The m3.medium seemed too weak. Both instance types give consistent results, but my pre-server-AMI private server used t2.medium agents, so I wanted to keep the same instance type so that the old results can still be compared.