Crash detection

How are people detecting if a test agent is up and still running?

I’ve thought about having a script that queues up a test periodically and raises an alert if the test isn’t completed within a specified period. But this strikes me as impractical, since there could be many tests already in the queue ahead of it.

I ask because in the last few days I’ve had the webpagetest driver and url blaster crash for unknown reasons and I’d like some sort of alert when this happens.

You can take a look at the url:
http://www.webpagetest.org/getTesters.php

This shows when each agent last visited the ‘mother’.

Rob

Yeah, what he said :)

There is also a checktesters.php script in the work directory that you can call from cron. It will send an email for any location that hasn’t connected to the server in the last hour (meaning all agents for that location have died).
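For anyone who wants to roll their own check along the same lines, here is a rough sketch of the "no agent for this location has checked in within the last hour" rule. It is not the actual checktesters.php code. The function name `stale_locations` and the input shape (a dict of location name to a list of last check-in times, one per agent) are assumptions made up for illustration, and you would have to fill that dict from whatever your own server reports.

```python
from datetime import datetime, timedelta


def stale_locations(last_checkins, now, max_age=timedelta(hours=1)):
    """Return the sorted names of locations with no recent agent check-in.

    last_checkins: hypothetical dict mapping a location name to a list of
    datetimes, one per agent, giving that agent's last contact with the server.
    A location is stale only if *every* agent is older than max_age
    (or if it has no agents at all).
    """
    stale = []
    for location, checkins in last_checkins.items():
        # The most recent check-in decides: one live agent keeps a location healthy.
        if not checkins or now - max(checkins) > max_age:
            stale.append(location)
    return sorted(stale)


if __name__ == "__main__":
    now = datetime(2012, 1, 1, 12, 0)
    example = {
        "Dulles": [now - timedelta(minutes=5), now - timedelta(hours=3)],
        "London": [now - timedelta(hours=2)],
    }
    for name in stale_locations(example, now):
        print("ALERT: no agent has checked in recently for", name)
```

In the example, Dulles stays healthy because one of its two agents checked in five minutes ago, while London gets flagged. Swap the print for an email or whatever alerting you already use.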

It may also be worthwhile to grab the latest agent binaries. There have been a couple of crash fixes (usually in the optimization checks), and I also added a watchdog process that automatically restarts the agents if they stop running without exiting on purpose.

The binaries are available here:
http://www.webpagetest.org/work/update/update.zip
http://www.webpagetest.org/work/update/wptupdate.zip

If you extract the ini file from each zip and then drop all 4 files (the two zips and the two ini files) into your work/update directory, the agents will update themselves.