I have a big bunch of benchmarks running on a private EC2 instance. For 18 sites, I have tests running on 4 pages each with each test sampling 7 times.
These tests are split into groups (otherwise the Benchmark graphs are impossible to interpret - it runs out of colours!).
Now it appears that the tests are run correctly. However, not all results appear in the graphs. For instance, for site ‘A’ the results appear in the graphs for page 1, but not pages 2-4.
How are the entries keyed for inclusion in the graphs?
Each .php file shows up as a separately labeled section (you can have multiple). If the 4 pages are similar across sites, like “front page”, “article”, etc., then I recommend splitting them out into separate benchmarks, where each benchmark compares the same page across all 18 sites.
Otherwise 18 separate benchmarks with 4 pages each also works if you want to compare each site against itself.
Within a benchmark file, each $configuration entry is a different line on the graph and each “location” is a different set of graphs (which can be useful for looking at different speeds or browsers).
There is no issue with the benchmarks being run - they all run OK. The issue is that the results often don’t appear in the graphs.
My setup has the following benchmark PHP files:
- Page type A - sites 1 to 8
- Page type A - sites 9 to 16
- Page type A - sites 17 to 18
- Page type B - sites 1 to 8
- Page type B - sites 9 to 16
- Page type B - sites 17 to 18
- Page type C - sites 1 to 8
- Page type C - sites 9 to 16
- Page type C - sites 17 to 18
For the past month, I’ve had all of these benchmarks kicking off at the same time - 01:00 GMT. In this case, many of the results don’t appear in the benchmark graphs (though the tests ARE being run). The pattern of lost points in the graphs is mostly consistent - for instance, the point for site 1 will appear for page type A, but not for B or C. Similarly, site 2 shows for type A, but not B or C.
It looks like appearing in the graphs is perhaps keyed by a short string, or domain, and that there is de-duping on that key for the same test initiation time?
I confirmed this last night by moving all Type B benchmarks to 02:00, and all Type C to 03:00. Now all tests appear in the benchmark graphs…
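To make the hypothesis concrete, here is a toy model in Python. It is not WebPageTest code, and the key used (label plus start hour) is only a guess at what the graphs might key on; the function names and the run dictionaries are invented for illustration. It does reproduce the observed pattern: with everything starting at 01:00 only the Type A points survive, and staggering the start times brings all of them back.

```python
def plotted_points(runs):
    """Keep the first run seen for each (label, start_hour) key.

    ASSUMPTION: this keying is a guess, not WebPageTest's actual logic.
    """
    seen = {}
    for run in runs:
        key = (run["label"], run["start_hour"])
        seen.setdefault(key, run)  # later runs with the same key are dropped
    return list(seen.values())


def make_runs(start_hours):
    """Build one run per site and page type; the label names the site only."""
    return [
        {"label": f"site{site}", "page_type": page_type,
         "start_hour": start_hours[page_type]}
        for page_type in ("A", "B", "C")
        for site in range(1, 19)
    ]


same_time = make_runs({"A": 1, "B": 1, "C": 1})
staggered = make_runs({"A": 1, "B": 2, "C": 3})

print(len(same_time), len(plotted_points(same_time)))  # 54 18
print(len(staggered), len(plotted_points(staggered)))  # 54 54
```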
Page labels in the urls file might help. For example, on the sample benchmark where I compare news sites against each other: http://www.webpagetest.org/benchmarks/view.php?benchmark=news_desktop&aggregate=median
If you look at the top of the page there are links to the urls files for the different companies and it is in the format <label><tab><url>. From the sounds of it you will actually end up with a separate URL file for every URL you want to test and each config pointing to that one file.
So benchmark config #1 would have 8 different configs, one for each site and each one pointing to a separate urls file. Each url file would look like:
Type A<tab>http://site1.com/A
(with the <tab> being an actual tab character).
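Since a literal tab is easy to lose when editing by hand, one option is to generate the URL files with a small script. This is a minimal Python sketch: the helper names, the file names and the site2.com host are made up for illustration, and only the line layout (label, tab, URL) comes from the description above.

```python
import tempfile
from pathlib import Path


def url_file_line(label: str, url: str) -> str:
    """Return one URL-file line: label, a real tab character, then the URL."""
    if "\t" in label:
        raise ValueError("label must not contain a tab")
    return f"{label}\t{url}"


def write_url_file(path: Path, label: str, url: str) -> None:
    """Write a single-entry URL file (one file per URL, as suggested above)."""
    path.write_text(url_file_line(label, url) + "\n", encoding="utf-8")


# Example: one file per site for page type A (hypothetical hosts and names).
out_dir = Path(tempfile.mkdtemp())
for site in ("site1.com", "site2.com"):
    write_url_file(out_dir / f"type_a_{site}.txt", "Type A", f"http://{site}/A")

print(repr((out_dir / "type_a_site1.com.txt").read_text(encoding="utf-8")))
# 'Type A\thttp://site1.com/A\n'
```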
I already have labels in the URL files, but they are not unique (i.e. they just name the site, not the site/page type combination).
Are you saying points appearing in the graphs are keyed by label and thus have to be unique for tests scheduled for the same time?
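Whether labels really are the key is still the open question here. Either way, it is cheap to check which labels are shared across URL files. A rough Python sketch, assuming each line is label, tab, URL; the file names, the site2.com host and the /B path are invented for the example:

```python
from collections import Counter


def duplicate_labels(url_files):
    """Return labels that occur more than once across all URL files.

    url_files maps a file name to that file's text content.
    """
    counts = Counter()
    for text in url_files.values():
        for line in text.splitlines():
            if "\t" in line:  # skip blank or malformed lines
                label, _url = line.split("\t", 1)
                counts[label.strip()] += 1
    return sorted(label for label, n in counts.items() if n > 1)


files = {
    "type_a_site1.txt": "Site 1\thttp://site1.com/A\n",
    "type_b_site1.txt": "Site 1\thttp://site1.com/B\n",
    "type_a_site2.txt": "Site 2\thttp://site2.com/A\n",
}
print(duplicate_labels(files))  # ['Site 1']
```

Labels that combine site and page type, such as “Site 1 - Type A”, would come back clean from this check.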