I’ve written a script that kicks off a test, polls until the results are ready, then loads them into a pandas DataFrame. At least that’s what it’s supposed to do.
My code is pretty messy, so I apologise in advance. Please feel free to suggest improvements if you have time, but my priority is to get the script working.
When I send a GET to /testStatus.php, I expect a 1XX response as per the documentation, but it seems to return a 200 once the test has reached the front of the queue and started. I’m now not sure how to wait for the test to complete.
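One thing worth checking is whether the 1XX value lives in the JSON body rather than in the HTTP status line. Below is a minimal sketch of that check. It assumes the body carries a top-level `statusCode` field where 1XX means queued or running and 200 means complete; that layout is an assumption, not something confirmed in this post. `classify_status` is a hypothetical helper and the sample payloads are made up for illustration.

```python
import json


def classify_status(body_text):
    """Return 'pending', 'complete', or 'error' for a testStatus.php JSON body.

    Assumes (unconfirmed) that the body has a top-level "statusCode" field.
    """
    body = json.loads(body_text)
    code = int(body.get("statusCode", 0))
    if 100 <= code < 200:
        return "pending"   # queued or running
    if code == 200:
        return "complete"  # results should be available
    return "error"         # anything else, including a missing field


# Made-up payloads, only to exercise the helper without touching the network
sample_running = '{"statusCode": 100, "statusText": "Test Started"}'
sample_done = '{"statusCode": 200, "statusText": "Test Complete"}'

print(classify_status(sample_running))
print(classify_status(sample_done))
```

If the field does exist, the polling loop could test this value instead of `status_code` on the response object.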
As a side note, I’m also having trouble knowing how long to poll for, since at one point this week I was behind 78 tests. Any good strategies to deal with this are welcome!
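One possible strategy is to scale the wait between polls by the queue position and cap the total time with a deadline. The sketch below assumes the status response can tell you how many tests are ahead (passed in here as `behind_count`, a hypothetical name) and guesses 30 seconds per queued test, which is not a measured figure. `fetch_status` is a hypothetical callable you would write around the real request.

```python
import time


def next_delay(behind_count, per_test_seconds=30, min_delay=5, max_delay=300):
    """Wait longer when more tests are ahead in the queue, within fixed bounds."""
    estimate = behind_count * per_test_seconds  # per_test_seconds is a guess
    return max(min_delay, min(max_delay, estimate))


def poll_until_complete(fetch_status, timeout_seconds=3600,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports completion or the deadline passes.

    fetch_status is a hypothetical callable returning (status_code, behind_count).
    Returns True when the status code is 200, False on timeout.
    """
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        status_code, behind_count = fetch_status()
        if status_code == 200:
            return True
        if status_code >= 400:
            raise RuntimeError("test failed with status " + str(status_code))
        sleep(next_delay(behind_count))
    return False
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; in normal use the defaults apply and only `fetch_status` needs to be supplied.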
[code]import requests
import pandas as pd
import json
import time
verbose_debug = None
parameters = {
    'k': "",  # API Key
    'f': "json",
    'location': "ec2-eu-west-1:Chrome.3GFast",
    'mobileDevice': "Nexus 5",
    'url': "www..co.uk",
    'runs': 1,
    'lighthouse': 1,
    'timeline': 1,
    'video': 1,
    # 'notify': ""
}

# Check a test's status; returns a tuple of (statusText, HTTP status code)
def getTestStatus(testId):
    testStatus_Response = requests.get('http://www.webpagetest.org/testStatus.php?f=json&test=' + testId)
    resp_dict_status = json.loads(testStatus_Response.text)
    statusText = resp_dict_status['data']['statusText']
    statusResponseCode = testStatus_Response.status_code
    print("**** " + statusText + " " + str(statusResponseCode) + " ****")
    return statusText, statusResponseCode

# Start the test
startTheTest_Response = requests.get('http://www.webpagetest.org/runtest.php', params=parameters)
# Did the test work? If so, get the test ID
if startTheTest_Response.status_code != 200:
    print("*** Something went wrong. Couldn't start the test. ***")
    print("Start Test URL: " + startTheTest_Response.url)
    print("Start Test text: " + startTheTest_Response.text)
    raise SystemExit(1)  # stop here; nothing below can work without a test ID
print("*** Test started successfully. Response: " + str(startTheTest_Response.status_code) + " ***")
if verbose_debug:
    print("Start Test text: " + startTheTest_Response.text)  # print the full response for debugging
# Get the testId & resultsUrl
resp_dict = json.loads(startTheTest_Response.text)
testId = resp_dict['data']['testId']
resultsUrl = resp_dict['data']['summaryCSV']
print("resultsUrl = " + resultsUrl)
print("testId = " + testId)
# TODO: Get the number of tests from the queue and make the loop wait longer
count = 0
statusText, statusResponseCode = getTestStatus(testId)  # assign the return values from getTestStatus to variables
while statusResponseCode == 404:
    print(str(count) + " results not ready. Status: " + statusText)
    count += 1
    statusText, statusResponseCode = getTestStatus(testId)
    if count > 50:
        break  # give up after 50 polls
    if requests.get(resultsUrl).status_code != 404:
        break  # the results CSV exists, so stop polling
print("exited loop")
# Grab only the columns I want
testResults_Response_dataFrame = pd.read_csv(resultsUrl)
df1 = testResults_Response_dataFrame[['SpeedIndex','FirstInteractive','loadTime','lighthouse.Accessibility','lighthouse.SEO','browser_version','date','URL']]
# Print all columns