Hosted WebPagetest: Test id generation bug: private tests are out of sequence + other

Hello,
Thank you for making hosted webpagetest free and open for all!
Found two bugs in the id generation routines, which may prevent submitted tests from running in sequential order of their submission time:

  1. Private tests Ids contain no sequential part, and will run by submission order:
    on runtest.php:

             if( $test['private'] )
    
                 $id = md5(uniqid(rand(), true));
    
             else
    
                 $id = uniqueId();
    

Since the entire id is hashed, it is not part of the order of other tests. I suggest changing it to be part hash, and part of the sequentially generated id, something like:
$id = uniqueId() . “_” .md5(uniqid(rand(), true));

(Have to mention the the use of PHP’s uniqid() function, and WebPageTest’s uniqueId() confused me a bit).

  1. The ID generated by uniqueId() in unique.inc is not alphabetically ordered, when the number of encoded characters generated by it changes. For example, you might generate jobs with those ids and filenames:
    110213_ZY.p2
    110213_ZZ.p2
    110213_001.p2
    110213_002.p2

in getwork.php, the use of scandir() function, will return 110213_001.p2 and 110213_002.p2 first, and only then will get back to ZY and ZZ.

This could be easily resolved by adding a padding to the encoded ID, so it will be of fixed length, i.e 00000000000ZZ, and then alphabetical ordering would work fine.

Thanks!

Amitay Dobo
Acceloweb - http://www.acceloweb.com

Thanks for the additional eyes on the code…

We don’t actually need or care if the ID’s are in order, just that they don’t collide (and in the private case that they aren’t easily guessable). Collisions are actually a bigger concern for the private ID’s but there are now 2 random parts to the ID and a check is made for an existing test with the same ID so the likelihood is significantly reduced.

Hmm, sorry about that, I started out the project not knowing much about php and that was one of the first pieces of code - I’ll go back and change the name to make it less confusing.

Instead of depending on the file names to keep sequence we changed the work queue to look at the file times and sort by the create time of the job files. There have been quite a few improvements to the code since the last release of the private agent (this included) and I’ll be packaging up a new release today.

Thanks,

-Pat

Thank you for the prompt reply!
Admittedly I was looking at the last release code, and not the trunk.
If in the next release the jobs queue is determined by file dates and not names, it renders the two problems as non-bugs.
My concern was only that tests will run in the order they are submitted, which the “old” filename-order based mechanism had problems with private tests and uniqId() tests.

And a final question, should I not “trust” the jobs order as retrieved by a client to be in the exact order of submission? I’m not sure what is the modification date resolution in different operation system, but probably collisions can happen there too.

So thanks for taking care of that (even before I reported the issue :))

Amitay Dobo
Acceloweb

Yeah, if you need a sub-second level of guarantee you’ll probably have to wait until we implement a real queue (adding support for integrating with SQS is on the to-do list). Most of the filesystems that I know of have at least 1-second granularity on the file times.

I’m packaging up the new release right now so it should be ready in an hour or so.

Thanks,

-Pat