Using WebPageTest for Continuous Integration Activities

Hi WebPageTest Community,

I’m looking to see if anyone in the community has put thought or effort into using WebPageTest within a Continuous Integration framework. I’m not specifically talking about integrating WebPageTest with tools like Jenkins, Hudson or Bamboo, though it would be awesome if anyone has made progress on that front.

What I’m really getting at is whether there has been much thought around the different use cases for WebPageTest as part of a CI framework?

Also, from an implementation perspective, is Marcel Duran’s webpagetest-api work, which allows CLI invocation, the best example of automated CI?

I’ve also seen Andy Davies’ slides from Velocity 2012, but they don’t really tackle my question…

Curious what others are doing in the space of Performance and CI?

Regards,
Steve

I’d say Marcel’s work is leading the charge in that space (and it does have Jenkins integration as of last week): Test specs · WebPageTest/webpagetest-api Wiki · GitHub
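For anyone who hasn’t looked at the test specs page yet: the idea is a JSON file of assertions against WPT metrics, so a CI build can fail on a regression. A rough illustration of what one might look like (field names are from my reading of the wiki, so double-check them there before relying on this):

```javascript
// Hypothetical example of a webpagetest-api "test spec" -- assertions that
// run against WPT results so a CI build can fail on a regression.
// The shape below follows my reading of the wiki; verify it there.
const spec = {
  median: {
    firstView: {
      loadTime: 3000,   // fail if median first-view load time exceeds 3000 ms
      SpeedIndex: 1000  // fail if median Speed Index exceeds 1000
    }
  }
};

// In CI you would save this as spec.json and run something along the lines of:
//   webpagetest test http://example.com --server <wpt-server> --specs spec.json
console.log(JSON.stringify(spec));
```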

Marcel gave a talk at Velocity this year about how Twitter has it integrated into their CI process and that is the driving force behind a lot of his work on the Node wrapper.

Hi Pat,

Thanks for the timely response. I definitely went to Marcel’s presentation. It was fantastic. I even blogged about it here: Even Twitter Does Performance CI | Seven Seconds

I guess the feedback he got at Velocity prompted him to do the Jenkins integration with WebPageTest. I believe at Velocity he did a demo of PhantomJS with Jenkins, so that’s pretty cool.

I’m still curious about the use cases others may be focused on for using WebPageTest in an automated and CI fashion. The CI tool integration is definitely a required step.

I personally want to do regression comparisons between commits, as well as browser comparisons, and I’d also like to do automated rule analysis. To make that happen, I think I need to use WebPageTest to create a HAR, then extract the contents of the JSON object and evaluate it programmatically. Those are some of my goals…curious what goals other folks have.
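To make the rule-analysis piece concrete, here’s a minimal sketch of the kind of programmatic check I mean, in plain Node. The inline two-entry HAR is a made-up stand-in for one exported from WebPageTest, and the budget numbers are arbitrary:

```javascript
// Sketch: evaluate simple performance-budget rules against a HAR's JSON.
// The inline HAR below is a stand-in for real WebPageTest output.
const har = {
  log: {
    entries: [
      { request: { url: "http://example.com/" },       response: { bodySize: 14000, status: 200 } },
      { request: { url: "http://example.com/app.js" }, response: { bodySize: 90000, status: 200 } }
    ]
  }
};

function checkBudget(har, { maxRequests, maxBytes }) {
  const entries = har.log.entries;
  const totalBytes = entries.reduce((sum, e) => sum + e.response.bodySize, 0);
  const failures = [];
  if (entries.length > maxRequests) failures.push(`too many requests: ${entries.length}`);
  if (totalBytes > maxBytes) failures.push(`too many bytes: ${totalBytes}`);
  return failures; // an empty array means the commit passes the budget
}

console.log(checkBudget(har, { maxRequests: 50, maxBytes: 100000 }));
// → [ 'too many bytes: 104000' ]
```

In a real pipeline the same function would run against the HAR from each commit, and a non-empty result would fail the build.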

Thoughts?

Regards,

Steve

I think Marcel’s implementation is the best I’ve seen so far.

I’ve got a client who’s implemented a WebPageTest Jenkins setup using JMeter and the Jenkins Performance plugin, but it’s a bit on the clunky side.

Writing a decent Jenkins plugin is on the long list of things I’d like to do.

Thanks Andy…much appreciated, as was Pat’s earlier response. I’ve listened to you at Velocity and seen your presentations. Maybe you could share some of your reasons for using WebPageTest over, say, PhantomJS, or even Selenium with BrowserMob to generate HARs?

I’m trying to find some practical use cases that other developers will share with me. I’m a little torn on what direction I want to go. I have a huge investment in Selenium. I won’t abandon it, but I don’t necessarily want to duplicate it.

Regards,

Steve

What would your perfect solution look like? We’ve thrown around the idea of having the WebPagetest agents be able to passively record performance data as part of normal CI testing for exactly this case, but I assume at a minimum there would have to be some modifications to the existing Selenium scripts:

  • some way to start the capture (either command-line or something exposed on the DOM like window.webperf.record("blah"); )
  • some way to stop the capture
  • maybe an atomic way to mark the end of one capture and the start of another

Given that most selenium tests are of the multi-step variety there would have to be some way to name the individual steps.
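To make the atomic start/stop semantics concrete, here’s a purely illustrative sketch of the bookkeeping such a capture controller would need for multi-step tests. None of these names are a real WPT API (window.webperf above is also only hypothetical):

```javascript
// Illustrative only: a tiny capture controller with named steps and an
// atomic "end one step, begin the next" operation, as discussed above.
class CaptureController {
  constructor() { this.current = null; this.steps = []; }
  record(name) {                       // start capturing a named step
    if (this.current) throw new Error("already capturing: " + this.current);
    this.current = name;
  }
  stop() {                             // finish the current step
    if (!this.current) throw new Error("nothing being captured");
    this.steps.push(this.current);
    this.current = null;
  }
  next(name) {                         // atomically stop one step and start the next
    this.stop();
    this.record(name);
  }
}

const cap = new CaptureController();
cap.record("login");
cap.next("dashboard");   // marks the end of "login" and the start of "dashboard"
cap.stop();
console.log(cap.steps);  // → [ 'login', 'dashboard' ]
```

The atomic next() matters because with separate stop/start calls, a navigation kicked off between the two could fall outside both captures.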

The output would also have to be decided upon. Uploading to WebPagetest and creating “tests” for each step would be the easiest and allow for richer data (video, timeline, tcpdump, etc.), but it might be possible to do a local HAR dump as well.

No promises, it’s just an area we wanted to move towards as the core technology is pretty stable now (at least on Windows).

Here are some of the reasons WPT was chosen in this case:

  1. We also wanted an instance of the HTTP Archive to track performance trends over time - in this instance of the HTTP Archive you can click on a chart to dive into the test results.

  2. When test results are out of the normal range, WPT gives an easy way to examine the results, repeat tests, etc.

  3. We’re starting to use some metrics, e.g. SpeedIndex, that are only available in WPT.

  4. It allows testing performance across browsers, including iPhone and iPad (and where third-party content is involved it’s not unusual to find it varies by browser).

There have been discussions in the past about how proxies like BrowserMob may alter connection behaviour (not sure how much that still applies, but there are likely to be some effects, due to the reduced latency to the proxy and the internal behaviour of the proxy).

In other situations I might be tempted to go with the PhantomJS approach - Wesley Hales & Ryan Bridges covered this a bit: http://cdn.oreillystatic.com/en/assets/1/event/94/A%20Baseline%20for%20Web%20Performance%20with%20PhantomJS%20Presentation.pdf

My only reluctance about an approach based on PhantomJS is the limited information we can get out of it, and the current limitations of the HAR format, i.e. it’s missing entries we commonly use in performance testing.

I’m really not sure what the perfect solution looks like. Recording data passively, similar to a RUM system, would be really interesting. I think what you are suggesting is more ideal than, let’s say, a BrowserMob integration with Selenium, which is the direction most folks go when they want HAR analysis from their Selenium scripts.

Of course there would have to be compromises, but I doubt the compromises would be substantial…
Andy,

Sounds like we are all coming to the same conclusion: WPT gives us the most complete picture right now. There seems to be more fidelity of data with WPT that just gets missed with Selenium/BrowserMob and PhantomJS.

Thanks for the replies…

Steve

I’ve only just seen this, so I’m a bit late to the party - we have had WPT running in CI for at least 12 months using ThoughtWorks Go.

Sure, we had some pain, but it was mostly to do with standing up Windows desktops rather than anything really thorny. That, and running out of disk space for the MySQL back end - it’s amazing how much data WPT generates.

@hsiboy

Rather than focusing on CI, we’re working to scale WPT. Individual WPT runs (even in batches of 10) are not entirely useful, since a bad network connection or momentary web server issues lead to bad performance numbers.

I wrote a node wrapper that uses the webpagetest-api node module to make scheduling large numbers of tests easier, and it stores the data in MongoDB. I can say, run 1000 tests on Chrome, and it will load balance across all WPT instances with Chrome installed, in batches of 10. We’re still working on the sprint-release testing practices, but currently I run 60 tests on 10-15 page types in all supported browsers (including iPhone and iPad, and hopefully soon Android 4.4 when I can get it working).
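The load balancing doesn’t need anything fancy - here’s a simplified sketch of how tests might be spread round-robin across instances in fixed-size batches (instance hostnames are made up, and a real wrapper like the one described above would also track batch completion and persist results to MongoDB):

```javascript
// Sketch: spread N tests across WPT instances in fixed-size batches.
// Instance hostnames are hypothetical; real scheduling would also wait
// for a batch to finish before submitting the next one to that instance.
function schedule(testCount, instances, batchSize) {
  const batches = [];
  let instance = 0;
  for (let start = 0; start < testCount; start += batchSize) {
    batches.push({
      instance: instances[instance],
      tests: Math.min(batchSize, testCount - start)
    });
    instance = (instance + 1) % instances.length; // round-robin across instances
  }
  return batches;
}

const plan = schedule(25, ["wpt-chrome-1", "wpt-chrome-2"], 10);
console.log(plan);
// three batches: 10 on wpt-chrome-1, 10 on wpt-chrome-2, 5 on wpt-chrome-1
```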