[Private Instance AMI] - Run gets assigned but never happens

Hi All,

I’ve been setting up a WebPageTest private AMI instance to provide some performance benchmarking for our website.

I’ve got it running, but the tests sometimes don’t run. When they don’t, the following happens:

Timeline:

[list]
[*]I make an API call
[*]The controlling machine boots up a test instance
[*]I continue to poll the API every 20 seconds for completion and keep receiving a 100 status
[*]At some point in this process the test machine terminates (I checked 49 minutes later); the test is still showing a 100 status
[*]At some point WPT times out the test and forces it to complete.
[/list]
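For reference, the polling step in the timeline above can be sketched roughly as follows. This is a minimal sketch, not the exact script I use: the function names, the 30-minute client-side timeout, and the assumption that `testStatus.php?f=json` returns a top-level `statusCode` field (100 while the test is running, 200 or higher once it has finished) are illustrative, so check them against your own server's responses.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional


def wpt_status(server: str, test_id: str) -> int:
    """Fetch the status code for one test (assumed endpoint and JSON shape)."""
    query = urllib.parse.urlencode({"f": "json", "test": test_id})
    url = f"{server}/testStatus.php?{query}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return int(json.load(resp)["statusCode"])


def poll_until_done(
    get_status: Callable[[], int],
    interval_s: float = 20.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[int]:
    """Poll until the status is >= 200; return None if the timeout expires first."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status >= 200:
            return status
        # Give up rather than sleeping past the deadline.
        if clock() + interval_s > deadline:
            return None
        sleep(interval_s)


# Hypothetical usage (server URL is a placeholder):
# result = poll_until_done(lambda: wpt_status("http://wpt.example.internal", "180305_TP_4"))
# print("timed out" if result is None else f"finished with status {result}")
```

Having a client-side timeout means a stuck run like the one described here surfaces after a bounded wait instead of relying on WPT forcing the test to finish.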

Logs

Individual test log:

2018/03/05 19:16:27 - Test Created
2018/03/05 19:22:06 - Starting test (initiated by tester i-0e3931ee461b73553)
2018/03/05 19:22:06 - Run 1 assigned to i-0e3931ee461b73553
2018/03/05 20:35:22 - Test run 1 has been running for 73 minutes on i-0e3931ee461b73553, forcing done.
2018/03/05 20:35:22 - Test has been running for 73 minutes and it has been 100 since the last update, forcing the full test to finish.

www/logs/YYYYMMDD.log:

This shows the test run, but I think it looks relatively innocuous.

2018-03-05 19:16:27     {ip_address}     0       0       180305_TP_4     https://example.com  US East (N. Virginia) - <b>Chrome</b> - <b>Cable</b>    0                       1               {unique_key1}        {unique_key2}        1

My set-up

[list]
[*]webpagetest-server-2014-11-25 (ami-9978f6ee)
[*]controller instance on t2.micro - EU Ireland
[*]I haven’t touched the test instance settings, so they run on the m3.medium instances that the AMI scales up by default
[*]I have slow_test_time=200 in the settings, but I’m unsure where that logs to.
[/list]

How can I debug this issue? I’m struggling with where to go next.

OK, so I’ve managed to debug this, although I’m not sure why it happens.

I wanted my WebPageTest instance to be available only over HTTPS, so I was redirecting all HTTP traffic to HTTPS with a 301.

However, this seems to leave the test instances unable to talk back to the master instance; they hang indefinitely instead. I’m unsure why, but opening up port 80 and allowing traffic on both HTTP and HTTPS made this work (although that’s obviously not ideal).
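If anyone wants to check whether their own server is in the same state, a rough sketch like the one below requests a URL over plain HTTP without following redirects and reports whether the answer is a redirect to HTTPS. The helper names and the host in the usage comment are hypothetical, and this only shows what the server returns; it does not confirm how the test agents themselves handle a redirect.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def is_https_redirect(status: int, location: Optional[str]) -> bool:
    """True if the response is a redirect whose target uses the https scheme."""
    return (
        status in REDIRECT_CODES
        and location is not None
        and location.lower().startswith("https://")
    )


def fetch_without_following(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Issue one GET and return (status, Location header); http.client never follows redirects."""
    parts = urlsplit(url)
    conn_cls = (
        http.client.HTTPSConnection
        if parts.scheme == "https"
        else http.client.HTTPConnection
    )
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Hypothetical usage (host is a placeholder for the controller instance):
# status, location = fetch_without_following("http://wpt.example.internal/")
# if is_https_redirect(status, location):
#     print(f"HTTP is redirected to {location}")
# else:
#     print(f"HTTP answers directly with status {status}")
```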