All help greatly appreciated in interpreting

chrisn · May 30, 2012, 7:31am

Hi there,

We recently upgraded to vBulletin 4 and have been fighting a hard battle to get speed back to where it used to be.

We’ve done a lot of server-side optimisation, which has made a big difference, but we still have a long way to go. We now feel that front-end optimisation is where we can make the biggest impact for users.

I’d be really grateful for your expert views on these results and on where we might best focus our efforts.

http://www.webpagetest.org/result/120528_KH_BAP/1/details/

Many thanks

andydavies · May 30, 2012, 11:10am

(Any ideas why start render is missing from the waterfall?)

You need to prioritise the order in which requests load, e.g. CSS first, JS last, and lazy load as much as you can.

Is there no way you can merge the CSS files, i.e. serve one CSS file only?

The number of DOM elements is going to be on the large side (but that’s always an issue with discussion boards).

There are opportunities for sprites too…

I am UK based - will drop you a PM

pmeenan · May 30, 2012, 7:36pm

Marvin has done some amazing work with his vBulletin install: http://www.webpagetest.org/forums/showthread.php?tid=3326 - it wouldn’t hurt to ping him and compare notes.

It looks like you may need to do a little work on back-end scaling as well. How is it hosted, if you don’t mind me asking (cloud, VPS, shared, dedicated boxes)? Particularly the database. Throwing SSDs at the database can have a huge impact for something like a discussion forum, if it’s an option available to you.

Otherwise there’s basically a lot of block-and-tackle work combining the CSS requests and getting the JavaScript out of the way. I also recommend testing on IE8 or newer, because IE7 and below are pretty pathological and are an extreme worst case.

chrisn · May 31, 2012, 5:17pm

[quote=“andydavies, post:2, topic:7540”]
(any ideas why start render is missing from the waterfall)[/quote]

No idea. Strange, that. Here’s another one and it appears fine:

Thanks. Will draw up a list of all those obvious gotchas and start prioritising I guess.

Hoping that someone might spot “the big one” in those results

Is the CPU maxing out an issue? We have seen user activity affected particularly in IE and at lower screen resolutions, and we’re wondering whether that suggests lower-end machines are struggling more than faster ones?

[hr]

[quote=“pmeenan”]
Marvin has done some amazing work with his vBulletin install: http://www.webpagetest.org/forums/showthread.php?tid=3326 - it wouldn’t hurt to ping him and compare notes.[/quote]

Thanks. Sounds great. We’ll give him a shout.

[quote=“pmeenan”]
It looks like you may need to do a little work on back-end scaling as well. How is it hosted, if you don’t mind me asking (cloud, VPS, shared, dedicated boxes)? Particularly the database. Throwing SSDs at the database can have a huge impact for something like a discussion forum, if it’s an option available to you.[/quote]

We’ve got what I think is a pretty mega back-end.

We’re running a virtual set-up on a private cloud, with a physical capacity of 48 cores and 384 GB RAM, and 4 x 128 GB SSDs in RAID 10 for the DB. The virtual layout includes 8 web servers, a couple of Varnish servers, a couple of memcache servers, plus one master and two slave DBs, and we devolve search to Solr. We’ve had the guys from Percona fine-tune the DBs and queries.

We’re doing a bit more work on Memcache and Varnish optimisation, but feel we’re likely to gain more from front-end optimisation now.

[quote]Otherwise there’s basically a lot of block-and-tackle work combining the CSS requests and getting the JavaScript out of the way. I also recommend testing on IE8 or newer, because IE7 and below are pretty pathological and are an extreme worst case.[/quote]

Yeah. The slog lies ahead I guess. Any clues as to what might be the biggest wins in this front-end work would be very gratefully received. Keen to get the biggest gains first.

Thanks for all your help guys.

chrisn · May 31, 2012, 9:58pm

I saw your advice on another thread about using Dynatrace Ajax Edition, Patrick - it looks super handy… now to try and get my head round it.

If anyone has a moment to share anything they can learn from looking at Dynatrace results on Mathematics Applicants 2012 - The Student Room, I’d be very grateful.

I’ve uploaded the js screen - is there something up with that?

pmeenan · May 31, 2012, 11:27pm

FYI, there are “Dynatrace” configurations available in the Dulles location (IE 7 and IE 8) which capture a Dynatrace run and let you download the session (this also lets you share it so everyone else can work off the same one).

I’m running a test right now, but you generally want to look at the hot spots.

Yes, IE 6 and 7 have HORRIBLE JavaScript performance, particularly around inefficient selectors. If there is JavaScript on your site that does something like $(".someclass")… then the older versions of IE will need to traverse the entire DOM (slowly) every time it is called. That will cause CPU pegging and gaps in the waterfall, and will certainly be worse for older computers with older browsers.

Here is the test I ran (still running at the time I posted this): http://www.webpagetest.org/result/120531_SD_1E8M/

I’ll take a look when it is done to see if anything jumps out.
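For anyone following along, here is a minimal sketch of one common way to limit the repeated-traversal cost described above: evaluate each selector once and reuse the result. The helper name makeCachedQuery is hypothetical (it is not part of jQuery or of the site's code), and caching is only safe while the matched part of the DOM does not change, so treat it as an illustration and not a drop-in fix.

```javascript
// Wrap any selector function (e.g. jQuery) so that each distinct selector
// string is only evaluated once. Written in ES3/ES5 style so it would also
// run in the older IE versions discussed in this thread.
// Assumption: the DOM matched by the selector does not change afterwards.
function makeCachedQuery(queryFn) {
  var cache = {};
  return function (selector) {
    // Prefix the key so selectors can never clash with built-in property names.
    var key = "sel:" + selector;
    if (!Object.prototype.hasOwnProperty.call(cache, key)) {
      cache[key] = queryFn(selector); // the expensive DOM traversal happens here
    }
    return cache[key];
  };
}
```

In a page that already loads jQuery, usage might look like var $cached = makeCachedQuery(jQuery); followed by $cached(".someclass") wherever the same lookup is repeated. Narrowing the search with an ID-based context is another option that avoids scanning the whole document.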

pmeenan · May 31, 2012, 11:54pm

Looks like this chunk of code is in an inline script in the HTML:

    $("p:last-child").addClass("last-child");
    $(".fieldset:last-child").addClass("last-child");
    $("table:last-child").addClass("last-child");
    $("tr:last-child").addClass("last-child");
    $("td:last-child").addClass("last-child");
    $("tbody:last-child").addClass("last-child");
    $("thead:last-child").addClass("last-child");

It took close to 4 seconds of CPU time to execute in the test that I ran.

If you open up the “Pure Paths” UI and sort by CPU time (descending) it will sort the code execution from most expensive to least.

Looks like the Facebook javascript was the 2nd most expensive with 2 seconds of execution time. I haven’t looked at it closely but if you have an option to tell the facebook code the ID of widgets it needs to populate that would reduce the overhead a lot (assuming it is scanning for a specific class).

3k+ DOM elements is a pretty huge DOM (which is why it is so expensive). Not sure if there’s anything you can do about that but it would be worth looking into to see if there are easy options with the layout that may be inflating it.
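As a rough illustration of how the seven selector calls above could be collapsed into one walk over the document, here is a sketch that uses only basic DOM properties (firstChild, nextSibling, nodeType, tagName, className). The names markLastChildren and LAST_CHILD_TAGS are made up for this example, it assumes an HTML document (where tagName is upper case), and whether it is actually faster on a 3k+ element page in old IE would need to be measured.

```javascript
// Tags that the original inline script marks when they are the last child.
var LAST_CHILD_TAGS = { P: true, TABLE: true, TR: true, TD: true, TBODY: true, THEAD: true };

function hasClass(el, name) {
  return (" " + (el.className || "") + " ").indexOf(" " + name + " ") !== -1;
}

function addClass(el, name) {
  if (!hasClass(el, name)) {
    el.className = el.className ? el.className + " " + name : name;
  }
}

// Walk the tree once. For every parent, find its last *element* child and
// add "last-child" if it is one of the listed tags or has class "fieldset".
// Returns the number of elements that qualified.
function markLastChildren(root) {
  var marked = 0;
  var stack = [root];
  while (stack.length) {
    var node = stack.pop();
    var last = null;
    for (var child = node.firstChild; child; child = child.nextSibling) {
      if (child.nodeType === 1) { // 1 = element node; skips text and comments
        last = child;
        stack.push(child);
      }
    }
    if (last && (LAST_CHILD_TAGS[last.tagName] === true || hasClass(last, "fieldset"))) {
      addClass(last, "last-child");
      marked++;
    }
  }
  return marked;
}
```

In the browser this would be called once after the DOM is ready, for example markLastChildren(document.documentElement), in place of the seven separate jQuery calls.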

chrisn · June 1, 2012, 3:45pm

[quote=“pmeenan”]
Looks like this chunk of code is in an inline script in the HTML:

    $("p:last-child").addClass("last-child");
    $(".fieldset:last-child").addClass("last-child");
    $("table:last-child").addClass("last-child");
    $("tr:last-child").addClass("last-child");
    $("td:last-child").addClass("last-child");
    $("tbody:last-child").addClass("last-child");
    $("thead:last-child").addClass("last-child");

It took close to 4 seconds of CPU time to execute in the test that I ran.

If you open up the “Pure Paths” UI and sort by CPU time (descending) it will sort the code execution from most expensive to least.

Looks like the Facebook javascript was the 2nd most expensive with 2 seconds of execution time. I haven’t looked at it closely but if you have an option to tell the facebook code the ID of widgets it needs to populate that would reduce the overhead a lot (assuming it is scanning for a specific class).

3k+ DOM elements is a pretty huge DOM (which is why it is so expensive). Not sure if there’s anything you can do about that but it would be worth looking into to see if there are easy options with the layout that may be inflating it.
[/quote]

Ah great. That’s brilliant having it on the Dulles install.

Thanks so much for your help. It’s really shining a light on things

I’ve been having a good dig and am a bit confused about this…using your first test as an example…

When I sort by CPU Time within Pure Paths, I commonly see that last-child problem at 3400+ ms and a readystatechange event at 2500+ ms (both of which therefore seem to be a problem).

What is throwing me is that their start times are roughly 7s and 11.5s. So I looked back at the waterfall to see if that is where we are getting the gaps/spread-out items and CPU maxing, and the problem seems to begin much earlier, i.e. around 3s.

I looked to see what JS is going on at around 3s, and it actually says the start time for the HTML is 3.38s, which contradicts the waterfall, which shows it running from 0-1.5s.

The first significantly slow JS seems to start at 6.3s with a CPU time of 765ms. Again, too late according to the waterfall.

Should the timings from Dynatrace and the Waterfall synchronise?

How can I see which JS is causing problems at the point in time where we are getting the big gaps and CPU max in the waterfall?

Am I missing something, and maybe it isn’t JS that is holding up the beginning of these waterfalls?

Probably worth adding that I am more interested in the cached views at the moment as the problem is impacting logged in users most.

I did another run if that helps: WebPageTest Test - Running web page performance and optimization tests... (Edit: Changed to one with dynatrace too)

Thanks so much for your help

pmeenan · June 1, 2012, 3:48pm

Sorry, we start dynatrace before we start up the browser so the times won’t line up exactly. If you go to the Timeline view you will be able to see the javascript execution and the matching network waterfalls. If you double-click on the javascript (particularly any long-running ones) it will take you to the call tree.

chrisn · June 1, 2012, 4:10pm

Thanks. Off to dig!

It’s tricksy stuff this… Any insights that anyone is able to provide will be rewarded with lashings of positive karma (swappable for beer in all good local ale houses) and masses of respect and smilies.

pmeenan · June 1, 2012, 4:15pm

Have you tried running NewRelic or Dynatrace on your backend? At your scale it’s probably something you want to leave running all the time (rather than just for a quick trial) but it will tell you where your hotspots are for the backend (specific database calls, etc).

It does sound like a well thought-out back-end (a little surprised the times are not better).

I’ll take another look at the front-end (particularly repeat view) in a minute.

pmeenan · June 1, 2012, 4:52pm

Never mind, just noticed the newrelic beacons on the page :-).

Looking at the repeat view of the test you ran, 1.5 seconds is extremely long for the base page - does your server-side instrumentation show that calls are taking that long? That base-page time is going to drive a lot of user experience/feedback in a forum situation where they will be clicking around a lot.

Looking at the filmstrip view for repeat view there are a couple of interesting things to look at: http://www.webpagetest.org/video/compare.php?tests=120601_B9_FVY-r%3A1-c%3A1&thumbSize=200&ival=100&end=all

If you scroll the filmstrip, a red line will track in the waterfall with the left edge of the filmstrip window (time-wise) so you can see what is blocking the UI and when certain events happen.

- The top part of the page loads pretty quickly after the HTML comes back.
- The actual posts are blocked by the ads (at least one of them) and show up at 3.1 seconds. If you can get your ads to load asynchronously then they won’t block the display of the content (AdSense and DoubleClick will both load async, but I didn’t dig too deep to see what you’re doing yet). Looking at the code, it looks like it’s probably the ADVERTPRO ad blocks that use inline document.write calls to write script tags.
- The ad loading triggers the widgets UI to show, which reflows the page. If you set up the main part of the page to have the correct initial width, that could be avoided (not sure how easy it is to do with what you have).

Some other notes:

- I don’t see evidence of it causing a problem in the repeat view waterfall, but it looks like you have some IE conditional comments in the HTML after some external resources are loaded. This can cause blocking behavior in IE, and it can be worked around by putting an empty conditional comment before any resources are loaded (like at the top of the head): Conditional comments block downloads / Stoyan's phpied.com
- You have some inline JavaScript sandwiched in between other resources. This can also trigger blocking behavior inside of IE. It looks like they are all tags and independent of the external JS, so if you can, you should either move them before all of the external resources or move them lower: Performance Matters: Avoid the "inline javascript sandwich"
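To make the "load asynchronously" suggestion above a little more concrete, here is a sketch of the usual script-injection pattern. The function name loadScriptAsync and the example URL are placeholders, not anything taken from the site or from ADVERTPRO. One important caveat: this only helps for scripts that do not themselves call document.write, so an ad tag that depends on document.write would need an async-capable version from the ad provider first.

```javascript
// Inject a <script> element so the browser downloads it without blocking
// the parser. `doc` is passed in (normally `document`) to keep the function
// easy to test. Assumes the page already contains at least one <script> tag,
// which is true for any page running this code from a script element.
function loadScriptAsync(doc, src) {
  var script = doc.createElement("script");
  script.src = src;
  script.async = true; // hint for browsers that support the async attribute
  var firstScript = doc.getElementsByTagName("script")[0];
  firstScript.parentNode.insertBefore(script, firstScript);
  return script;
}
```

A call might look like loadScriptAsync(document, "https://ads.example.com/ad.js"), placed wherever the blocking ad tag currently sits, with the ad rendering into a container that already has its final width and height so the page does not reflow.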

chrisn · June 2, 2012, 7:38am

Thanks Patrick. That’s really great. I’m going to be offline for a few days now for the Queen’s Jubilee antics but will properly digest and come back to you next week. Just wanted to say thanks and let you know I’ve seen your response for now. Much appreciated

Marvin · June 2, 2012, 12:29pm

Hi Chris.

The way I attacked vB performance was to first remove all unnecessary forum features (but this is not something everyone may be open to doing). Looking at your site, I’d suggest considering whether you need the right-column widgets (Discussions, Article Updates), the “What’s Going On” module at the bottom of the home page, and the “Display Options” and “Moderators” modules at the bottom of forums. Is it necessary to show how many “Views” each thread has had, or gender, reputation and country flag? I would even remove the “Previous Thread” and “Next Thread” links from the bottom of threads, as I think people don’t really use them.

I’d look into reducing the number of requests. The test link you shared showed 136 requests for the page. Here’s a list of the images found on that page:
http://www.webpagetest.org/pageimages.php?test=120528_KH_BAP&run=1&cached=0
(The flags_sprite.png alone is 106kb)

On my forum I removed ALL vBulletin images. The orange reply/post/quote/tools/go buttons could be replaced with html. Every little bit helps.

My philosophy is that people come to forums for content, and all the pretty stuff like images and unneeded features are just a distraction.

Just my 2c

P.S. My vBulletin forum can be viewed at laptopgpsworld [dot] com. I recently changed my pagination settings so that forums contain 200(!) instead of the default 25 threads per page, and threads may go up to 100 posts before the page splits, but it still seems to have OK load speed. Members see a slightly different version than guests.

chrisn · June 5, 2012, 5:24pm

It’s a great view this, but I can’t work out how to reach it from my own webpagetest results…how do you navigate to it from test results please?

tallenge · June 6, 2012, 7:04am


chrisn · June 6, 2012, 2:25pm

Found it… You need to tick the right setting before submitting the test (Advanced Settings > Video > Capture Video) and then it’s obvious in the results (for anyone else who can’t see it).

Personally, I’m finding this the best view to start from.

[hr]

[quote=“Marvin”]
Hi Chris.

The way I attacked vB performance was to first remove all unnecessary forum features (but this is not something everyone may be open to doing). Looking at your site, I’d suggest considering whether you need the right-column widgets (Discussions, Article Updates), the “What’s Going On” module at the bottom of the home page, and the “Display Options” and “Moderators” modules at the bottom of forums. Is it necessary to show how many “Views” each thread has had, or gender, reputation and country flag? I would even remove the “Previous Thread” and “Next Thread” links from the bottom of threads, as I think people don’t really use them.

I’d look into reducing the number of requests. The test link you shared showed 136 requests for the page. Here’s a list of the images found on that page:
http://www.webpagetest.org/pageimages.php?test=120528_KH_BAP&run=1&cached=0
(The flags_sprite.png alone is 106kb)

On my forum I removed ALL vBulletin images. The orange reply/post/quote/tools/go buttons could be replaced with html. Every little bit helps.

My philosophy is that people come to forums for content, and all the pretty stuff like images and unneeded features are just a distraction.

Just my 2c

P.S. My vBulletin forum can be viewed at laptopgpsworld [dot] com. I recently changed my pagination settings so that forums contain 200(!) instead of the default 25 threads per page, and threads may go up to 100 posts before the page splits, but it still seems to have OK load speed. Members see a slightly different version than guests.
[/quote]

Great. Thanks Marvin. We are looking to see how we can strip things down. Not sure we’ll get away with going quite as far as you have managed, but it’s certainly a good gauntlet to lay down to focus the mind! Thanks

pmeenan · June 6, 2012, 5:28pm

Sorry, meant to reply sooner (did a day trip to California - I’m too old for that anymore). I’m considering making video capture on by default and including the filmstrip UI in the normal test result UI just because it is so useful. I just need to experiment with it a bit to make sure the storage requirements don’t explode.

chrisn · June 14, 2012, 11:06am

Please could you clarify where you are seeing this, Patrick? It feels important, but we can’t see it ourselves. Thanks

pmeenan · June 14, 2012, 3:23pm

If you go to the filmstrip view: http://www.webpagetest.org/video/compare.php?tests=120601_B9_FVY&thumbSize=200&ival=100&end=all

Then scroll the filmstrip until the post content is the first frame displayed (4.1s). The red vertical line will indicate the same time marker in the waterfall, so you can see what appears to have blocked the content from displaying.

At the time it looked like it was blocked on the ads PHP calls, but that was for a different run. It looks like there is some code in that area of the page, but it’s hard to tell right now since the structure of the code seems to have changed since the test was initially run.
