Very fast load with white gap in render time / time to complete

This is one that completely perplexes me. I’ve done a lot of optimizing and made reasonable use of APC (cache) to get great page load times, yet there’s a gap in the page render that appears on our site.

Here’s a typical test result:
http://www.webpagetest.org/result/140315_Q4_86bb499ab14a867a5d033dc4be9097b3/

Another example of a gap (0.8s to 1.2s):
http://www.webpagetest.org/result/140316_KG_eef40e1b0fdbcfe1fec86131322043af/3/details/

A further point: we can’t get repeat views below 0.6s despite everything being extremely fast in the PHP. We now have image dimensions specified - could JS or something else be blocking rendering?

(Tested from servers in Australia.)

Has anyone seen results like this? We have Xdebug installed and there’s very little else slowing the PHP that we can currently see. We previously removed some getimagesize() calls in the banners, as they took up backend processing time; they’ve been replaced with faster, hard-coded image width and height attributes.
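For illustration, the change was along these lines (just a rough sketch - the banner path and dimensions below are made up):

```php
<?php
// Illustrative only - the banner path and dimensions are made up.
$banner = '/images/banners/summer-sale.jpg';

// Before: getimagesize() touched the filesystem on every page view.
$size = @getimagesize(__DIR__ . $banner);
if ($size !== false) {
    echo '<img src="' . $banner . '" width="' . $size[0] . '" height="' . $size[1] . '" alt="Banner">';
}

// After: the dimensions are known when the banner is created, so they
// are hard-coded and the filesystem call disappears.
echo '<img src="' . $banner . '" width="728" height="90" alt="Banner">';
```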

Any suggestions welcome!

Kind regards,
SolarisM

I think you’re running into hardware limitations of the test machine. The gap you describe begins around “Start render” and ends after “DOM content loaded”, and the CPU utilization graph tells you it’s pretty much fully occupied at that time. It’s rendering the page.

Your repeat view load times are dominated by Google/DoubleClick (uncacheable stuff). Nothing you can do about that, I’m afraid.

Performance looks quite good to me, even from Europe. Well done :)

I’d look at why the pre-loader is cancelling the request for hobbywh_stylesheet_053_min.css.

I’d also consider concatenating the CSS into one file and serving it from the primary domain - many browsers pre-emptively set up a second TCP connection in the expectation that the next request will come from the same host.
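Something like this as a simple one-off build step would produce the single file (just a sketch - the file names here are hypothetical):

```php
<?php
// Rough sketch of a one-off build step; the file names are hypothetical.
$parts = ['css/reset.css', 'css/layout.css', 'css/widgets.css'];

$combined = '';
foreach ($parts as $file) {
    if (is_file($file)) {
        $combined .= file_get_contents($file) . "\n";
    }
}

// One stylesheet served from the primary domain lets the browser's
// speculative second connection to that host be put to use.
file_put_contents('css/site_min.css', $combined);
```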

Thank you very much for your replies. Regarding hardware limitations, I suspected that might be the case and repeated the tests from Dulles, USA with the same results - would the limitation still apply there? All the Google/DoubleClick requests are deferred and don’t start until the page has “loaded”. If it really is a limitation of the test system’s hardware then I’ll be happy with that; it’s just the obsessive side of me that thinks more could be optimized.

Is there any way to check whether this is the case?

Re the cancelled request, that’s a good suggestion - we’ll definitely concatenate the CSS and try to serve it from our primary domain. I think the only issue we’ve had in the past is that the CDN tends to perform better than our server for visitors located around the world.

EDIT: Just remembered why we serve the CSS from the CDN: all of the sprites etc. would otherwise need full paths to load their images via the CDN. SSL pages also need a totally different URL, as we’re using Amazon CloudFront (and custom SSL is still a bit pricey). Any workaround?
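One possible direction, sketched very roughly below with made-up host and file names, is to rewrite the relative url() references at build time, so the stylesheet itself can sit on the primary domain while the sprite images still load from CloudFront:

```php
<?php
// Hedged sketch of one possible workaround; the host and file names
// are made up. The stylesheet lives on the primary domain, but its
// relative url() references get prefixed with the CDN host at build
// time, so sprites keep coming from CloudFront.
$cdnHost = 'http://cdn.example.com';   // swap for the cloudfront.net
                                       // hostname when building the
                                       // HTTPS variant

$css = file_get_contents('css/site_min.css');

// Prefix relative url(...) references with the CDN host, leaving
// absolute, protocol-relative and data: URLs untouched.
$css = preg_replace(
    '~url\(\s*([\'"]?)(?!https?:|//|data:)([^\'")]+)\1\s*\)~i',
    'url(${1}' . $cdnHost . '/${2}${1})',
    $css
);

file_put_contents('css/site_min.primary.css', $css);
```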

Thanks again.

While the Dulles machine is probably the most powerful of the bunch, the browser you’re testing may run in a virtualized environment with limited resources. You’re maxing out the CPU there, too: http://www.webpagetest.org/result/140317_ET_DKN/1/details/cached/

If you want to learn more about diagnosing rendering performance with the Chrome DevTools, here are a few interesting reads:

OK, thanks - so this shows it’s mostly rendering within 100ms on my system (repeat view), 60ms of which is the time to first response. Then there’s lazy-loaded stuff that isn’t in our control, such as the analytics JS and image preloading.

There isn’t a great deal more I think we can do except inline the above-the-fold CSS, which I find quite difficult to calculate; I’ve tried using Paul Kinlan’s “amazeballs” code to do that, but it fails for some reason.

Out of interest, does anyone have a similar tool to identify critical CSS to inline?

For those interested, I mostly used APC store and fetch calls (it’s that simple) after profiling our PHP with Xdebug. There’s more that could be cached, but it would start to get more complicated for 2-3ms here and there, which probably isn’t worth the effort. A few extra indices on the MySQL tables also sped up several queries considerably, some from 35ms down to 3-4ms, which adds up in the end!
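In case it helps anyone, the pattern is essentially this (a minimal sketch - the key name and the buildNavigationMenu() stand-in are hypothetical):

```php
<?php
// Minimal sketch of the APC store/fetch pattern described above.
function buildNavigationMenu()
{
    // Stand-in for the real, more expensive menu generation.
    return '<ul><li>Home</li><li>Products</li></ul>';
}

function getNavigationMenuHtml()
{
    $key  = 'chunk_nav_menu_v1';
    $html = apc_fetch($key, $success);

    if (!$success) {
        // Cache miss: do the work once, keep the result in shared
        // memory for an hour so later requests skip the rebuild.
        $html = buildNavigationMenu();
        apc_store($key, $html, 3600);
    }

    return $html;
}

echo getNavigationMenuHtml();
```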

It seems you’ve taken care of all the low- and high-hanging fruit, and while I understand the desire to go for “the full monty” (the tiny fruit at the very top, if you like), there comes a point where your time and energy may be better spent elsewhere. Shaving off an extra millisecond is, after all, unlikely to affect your bottom line or your users. You’ll also want to weigh performance improvements carefully against increased site complexity (as well as their impact), especially when it comes to inlining above-the-fold CSS.

If you do want to go ahead with inlining above-the-fold CSS, the PageSpeed module has a new feature named “prioritize_critical_css” that you could look into. Even if you don’t use Apache or nginx, or you don’t want to use the module, you could still use it as a tool (on a private reverse proxy, for example) to detect the critical CSS, and then manually implement the information it provides.

There’s also a bookmarklet, but I haven’t had time to play with it yet.

I love APC and use it whenever I can for simple data, but have yet to develop a reliable invalidation technique to get more out of it.

Thanks very much the bookmarklet is the code from Paul Kinlan that doesn’t seem to give any results for our site! Possibly because of layers or some other possible related reason - it simply doesn’t give any output, but works on other sites fine that I’ve tested it against.

EDIT: I just realized the bookmarklet may not work on our site because the CSS is pulled in from a CDN.

APC works perfectly for us: we cache “chunks” that are then output when required - e.g. headers, footers and navigation menus change infrequently, yet most sites recalculate them on every single mouse click. We also cache product grids/boxes, as well as frequently accessed lists and arrays that take time to build, e.g. our URL-to-product maps and product filters.

The next thing we may do is cache the top 200-300 search results and refresh them daily or so, although our search code is already pretty damn fast even with the “smart guess” logic now in there, plus spelling correction and similar-search suggestions. We don’t yet do any active invalidation on data changes, but it would be easy to delete specific cached data when, say, a product is updated or a menu has changed.
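A sketch of that “delete on change” idea might look like this (the key names and the updateProductInDatabase() stand-in are hypothetical):

```php
<?php
// Hedged sketch of invalidating cache entries when data changes.
function updateProductInDatabase(array $product)
{
    // Stand-in for the real database write.
}

function saveProduct(array $product)
{
    updateProductInDatabase($product);

    // Drop only the cache entries this product can affect, so the
    // next request rebuilds them from fresh data.
    apc_delete('product_box_' . $product['id']);
    apc_delete('product_grid_' . $product['category_id']);
    apc_delete('chunk_nav_menu_v1');   // e.g. a menu showing category counts
}

saveProduct(['id' => 42, 'category_id' => 7]);
```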