Different traffic shaping behaviour between version 2.15 and 2.18 agents

Hi Patrick,

I have a problem with traffic shaping on our private desktop agents and hope you can help with some advice.

As you may know, we use a patched WebPageTest version for multistep measurements (see our fork/PR).
We used to measure with a multistep version based on your version 2.15 (2ddcdae2a0e1707a9a3dd2ce08f19057847c01c7 from January 2015). Some weeks ago we merged your version 2.18 (9c09213a2697a639c4bfbed643b172c114afd487 from September 2015).
We adapted our multistep functionality to the new version. The PHP code took quite a bit of work, but now everything is fine and it works.
In the agent code we didn’t have to change much to get it working. The old multistep agent based on WPT version 2.15 also works with the new server code.

The problem: Version 2.18 agents measure slower load times with the same connectivity configured :frowning:
Here you can see graphs of the measurements; you can click on the data points to open the individual WebPageTest results.
The slower load times come from slower connectivity → TTFB and download times of many resources are slower with the 2.18 agent.

If I measure with native connectivity I get the same, very fast load times on both agents (about 300 to 400 ms instead of about 1,000/1,800 ms with traffic shaping). The measurements on the 2.15 agent run at a lower frequency.

So the problem seems to be related to dummynet / traffic shaping. I compared dummynet on both agents (2.15 and 2.18). Both agents run on root servers (no virtualization, 64-bit, Windows 7, current browser versions) and use the same 64-bit dummynet version. I re-installed dummynet on both agents - no effect.
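For completeness, the active rules and pipes can also be dumped on both machines and diffed, along these lines (assuming the ipfw.exe bundled with the Windows dummynet port supports the usual list/show commands, which I have not double-checked):

  rem Run in an elevated command prompt on each agent, then diff the files.
  rem Dumps the current firewall rules and the dummynet pipe configuration.
  ipfw list      > ipfw_rules.txt
  ipfw pipe show > ipfw_pipes.txt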

I know there have been maaaany commits between January and September this year and maybe this question is not very fair, but:

If you have any idea which changes to the agent / traffic shaping code between versions 2.15 and 2.18 could cause the changed load behaviour, I would appreciate it very much.

Regards,
Nils Kuhn

There are only 2 changes that I’m aware of and neither should have caused the kind of behavior you are seeing.

Here is the change history for the code that talks to ipfw: History for agent/wptdriver/ipfw.cc - WPO-Foundation/webpagetest · GitHub

1 - Switched to always use the ipfw command line instead of talking directly to the driver (32-bit only)
2 - Fixed support for packet loss (shouldn’t matter as none of the default profiles enable packet loss)

If you are using custom profiles with packet loss then #2 could certainly cause it.

The actual dummynet configuration rules haven’t changed either, but you can verify by looking at your ipfw.cmd in the dummynet folder to make sure they match.
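Roughly, the baseline setup in ipfw.cmd boils down to two pass-through pipes, one for inbound and one for outbound traffic, something like the sketch below (from memory; the exact rule numbers and options in your two copies are what you would want to compare):

  rem Sketch of the baseline dummynet setup: clear everything, then route
  rem all traffic through pipe 1 (inbound) and pipe 2 (outbound), both left
  rem unconstrained until the agent configures them for a test.
  ipfw -q flush
  ipfw -q pipe flush
  ipfw add pipe 1 ip from any to any in
  ipfw add pipe 2 ip from any to any out
  ipfw pipe 1 config delay 0ms
  ipfw pipe 2 config delay 0ms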

The only other thing that comes to mind is the fix to dummynet initialization: Improved the dummynet initialization so we don't get test machines ru… · WPO-Foundation/webpagetest@d065d95 · GitHub, but that was a few years ago.

We always use custom profiles when triggering measurements from OpenSpeedMonitor, but in the measurements in question we set plr=0.
So by “If you are using custom profiles with packet loss” you mean custom profiles with plr>0, I assume?

Correct. As long as plr = 0 there should be no change.
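Just to make the distinction concrete (this is a generic ipfw example, not lifted from the agent code): packet loss only shows up in the dummynet configuration as a plr argument on the pipe, which ipfw expresses as a probability between 0 and 1.

  rem Hypothetical lossy custom profile: 6 Mbit/s down, 25 ms delay, 1% loss.
  rem The packet-loss fix mentioned above only matters when a non-zero plr
  rem like this is configured.
  ipfw pipe 1 config bw 6000Kbit/s delay 25ms plr 0.01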

Hi Patrick,

we did some tests with your official 2.18 WPT server, an official 2.15 WPT desktop agent and an official 2.18 WPT desktop agent. They are all running your official releases from GitHub, not our multistep version.

Both agents are running Windows 7 64-bit and have the same hardware specs.
The agents get the same jobs (same preferences, same script).
The agents use traffic shaping and shape to DSL6000.

The attached screenshot shows that the 2.18 WPT agent (green line) is almost always about 1 s slower than the 2.15 WPT agent (blue line).

Here are some example results from 2.15-Agent:
http://dev.server01.wpt.iteratec.de/result/160224_62_C/
http://dev.server01.wpt.iteratec.de/result/160224_42_A/
http://dev.server01.wpt.iteratec.de/result/160224_YE_8/
http://dev.server01.wpt.iteratec.de/result/160224_HT_6/
http://dev.server01.wpt.iteratec.de/result/160224_87_2/

… and some example results from 2.18-Agent:
http://dev.server01.wpt.iteratec.de/result/160224_GE_B/
http://dev.server01.wpt.iteratec.de/result/160224_FT_9/
http://dev.server01.wpt.iteratec.de/result/160224_GP_7/
http://dev.server01.wpt.iteratec.de/result/160224_5Z_5/
http://dev.server01.wpt.iteratec.de/result/160224_Q5_1/

Do you have any idea about that?

Greetings from Hamburg,
Birger

Sorry, that’s like 2 years of changes. What browser? wptdriver or urlblast?

If you have URLs that show the difference, looking at the waterfalls might help identify what is going on. There were certainly changes around clearing the Windows certificate store and the predictor cache, along with a host of other changes.

Hi,

sorry, but I have to bring this subject up again.
We have updated one of the agents, so we have now measured with the following setup:

  • WPT server (singlestep) from GitHub, release 2.18
  • 1 WPT agent (singlestep) from GitHub, release 2.15 (Windows 7, 64-bit)
  • 1 WPT agent (singlestep) from GitHub, release 2.19 (Windows 7, 64-bit)

We measured with Chrome and wptdriver. Here you can see the results.

In the first half of the diagram we measured without traffic shaping (native connectivity); in the second half we used the following traffic shaping on both agents:

bw-down=6000 kbit/s
bw-up=512 kbit/s
latency=25 ms
plr=0

Without traffic shaping both agents show the same load times. With the same shaping configured, the load times differ a lot :frowning:
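For reference, if I read the agent's shaping code correctly, these parameters should end up as dummynet pipe settings roughly like the following; the pipe numbering and the even split of the latency across the inbound and outbound pipes are my assumption, not something I verified on the wire:

  rem Sketch of the per-test pipe configuration for the profile above
  rem (bw-down=6000 kbit/s, bw-up=512 kbit/s, latency=25 ms, plr=0).
  rem Assumes pipe 1 = inbound, pipe 2 = outbound and the 25 ms latency
  rem split roughly in half across the two directions.
  ipfw pipe 1 config bw 6000Kbit/s delay 13ms
  ipfw pipe 2 config bw 512Kbit/s delay 12ms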

If we look into some of the waterfalls we can see that there are a lot more resource requests and bytes in version 2.19 than in version 2.15, but all these requests happen after onload. So until doc ready there is the same amount of bytes and requests for versions 2.15 and 2.19, with and without traffic shaping.

I really can’t make sense of the different shaping behaviour here, but I would like to know which one is the more realistic one and what the difference is…

Here are two of the faster results with version 2.15 and shaping:
http://dev.server01.wpt.iteratec.de/result/160301_BD_3T/1/details/
http://dev.server01.wpt.iteratec.de/result/160301_YF_31/1/details/

Two of the slower results with version 2.19 and shaping:
http://dev.server01.wpt.iteratec.de/result/160301_4T_2Z/1/details/
http://dev.server01.wpt.iteratec.de/result/160301_18_34/1/details/

Regards, Nils

I don’t think it’s a difference in the actual traffic shaping (the DNS and socket connect times look roughly equivalent) but rather changes in processing the dev tools request data from Chrome that improved the accuracy of the reporting (particularly the bytes) in 2.19.

That said, if you grab the latest agent from the public server (or trunk): I finally got code working that accesses the decrypted byte streams directly for Chrome, so the timings no longer require decoding and trusting dev tools, and it can report the timings directly from the network like it does for HTTP (non-S) and for all of the other browsers.