Start Render Time

They don’t use it across the board - I know they use it in their search results though:

http://www.webpagetest.org/result/100619_86Y/1/details/#request1

Header shows up so I’m feeling pretty good about correctly identifying it.

What version of apache are you using? Try disabling mod_deflate and see if that makes a difference (just for testing).

Weird, i grabbed the headers with firefox and didn’t see the chunked encoding header. Could be that it’s not being sent for firefox or that the addon is just not displaying it.
I am using apache 1.3.41. I am using mod_gzip not deflate, though i can try disabling it later tomorrow and seeing the results.

Oh, not sure what the complications are in 1.3. Most everything I’ve read is with people getting it to work in 2.2. I’m almost positive that mod_gzip on apache 1.3 does not play well with chunked encoding.
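For anyone following along, here is a minimal sketch of what chunked transfer encoding looks like on the wire, written in Python purely for illustration (encode_chunked is a made-up helper, not part of Apache or mod_gzip). Each piece the server flushes goes out as a hex length, CRLF, the data, CRLF, and a zero-length chunk ends the body. That is why the server does not need to know the total Content-Length before it starts sending.

```python
def encode_chunked(parts):
    """Frame each non-empty part as an HTTP/1.1 chunk, then add the terminating chunk."""
    out = bytearray()
    for part in parts:
        if not part:
            continue  # a zero-length chunk would signal end of body too early
        out += b"%x\r\n" % len(part)  # chunk size in hexadecimal
        out += part + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, no trailers
    return bytes(out)


if __name__ == "__main__":
    # Two flushes from the page script become two chunks.
    print(encode_chunked([b"<head>", b"<body>hello</body>"]))
```

If a header sniffer shows Content-Length instead of Transfer-Encoding: chunked, something between the script and the client has most likely collected the whole response before sending it.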

Ok, i’m slightly confused now. Take a look:
http://www.webpagetest.org/result/100619_efff0f32ebc06a49af1c0e39a501c2a0/1/details/

That’s with the site using chunked encoding for the html doc. Now here is the previous one i showed you without any chunked encoding:
http://www.webpagetest.org/result/100619_a3f3da1c99a7fed5ec494ae1e5c77991/1/details/

Now, going off what you said, the time to first byte should decrease, thus making the start render time decrease as well. As you can see, the start render time definitely did decrease in the chunked version… but if you look, the chunked version actually has a larger time to first byte! Why is that? Secondly, you can probably see that if i get chunked encoding working it fails to gzip the main html doc, which actually results in a longer total load time. I’m still trying to see if i can get it to both chunk the output and then gzip it too.

p.s. on a side note you might be amused to know why the chunking wasn’t working… It turns out there was a directive in the .htaccess file in the main doc root that said “dechunk yes”, essentially telling mod_gzip to ignore any chunking and collect the chunks to gzip them.

Try putting a 2-3 second sleep in right after your first flush. That will help you see easily if there is downstream buffering that is causing a problem. With the chunking disabled you should see the TTFB take the extra time from the sleep. If it is working correctly in the chunked version that time will be moved to the content download time.
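The sleep test can be sketched as a small simulation. This is only a toy model in Python, not a measurement of a real server: render_page, buffered and measure are hypothetical names, and the "buffer" simply collects everything before releasing it, which is the worst case of downstream buffering.

```python
import time


def render_page(delay_s):
    """Hypothetical page script: early <head>, a deliberate pause, then the rest."""
    yield b"<html><head><link rel='stylesheet' href='site.css'></head>"
    time.sleep(delay_s)  # stands in for the diagnostic sleep after the first flush
    yield b"<body>...</body></html>"


def buffered(parts):
    """Model a downstream buffer that holds everything until the page is complete."""
    yield b"".join(parts)


def measure(parts):
    """Return (time_to_first_byte, content_download_time) in seconds."""
    start = time.monotonic()
    first = None
    for _ in parts:
        if first is None:
            first = time.monotonic()
    end = time.monotonic()
    return first - start, end - first


if __name__ == "__main__":
    print("streamed:", measure(render_page(2.0)))
    print("buffered:", measure(buffered(render_page(2.0))))
```

In the streamed case the sleep shows up in the second number (content download); in the buffered case it shows up in the first (TTFB). That is the same signature to look for in the waterfall.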

It’s possible that the flush and chunking are both working but mod_gzip is buffering 8KB of data (or some other amount) that is preventing it from actually going out and there may be other settings that need to be tuned. Wouldn’t it be nice if it “just worked”?

Also, run a bunch of runs (5-10) as the results may be a bit variable.

Love the .htaccess rule - I assume there’s some historical (or is it hysterical?) legacy reason like NCSA Mosaic can’t handle chunked encoding so disable it…

Ok, chunked encoding works if i flat-out disable mod_gzip:
http://www.webpagetest.org/result/100619_7cb51dd83b9829ffaad868f0c510d598/1/details/

I made the sleep for 10 seconds to rule out any sort of fluctuation. Though the weird thing is that i consistently see 300-400 ms cut off of my start render time simply by disabling gzip for the main html doc (without flushing any output):
http://www.webpagetest.org/result/100619_42d2b005b0dcfe385be6f38c60add8df/1/details/

I am not sure why this is, unzipping/uncompressing the file client side shouldn’t be adding that much delay.
For now, i’m done tweaking for the night, we’ll see if i can get both mod_gzip and chunked encoding working tomorrow. If not i may give mod_deflate a try, i believe it’s available even for older apache versions.

You could also disable mod_gzip (for text/html so your js and css still get compressed) and use php’s gzip encoding directly.

Otherwise my guess is that mod_gzip needs to be tuned and has a certain buffer size configured that is larger than your initial flush so it sits and waits for more data to be available before compressing and streaming it down to the client.
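To illustrate the kind of buffering being guessed at here, the sketch below uses Python's zlib module. It says nothing about mod_gzip's actual internals or settings; it only shows that a gzip compressor in general holds small inputs back until it is explicitly flushed. gzip_stream is a made-up helper name.

```python
import zlib


def gzip_stream(parts, sync_each_part):
    """Yield the gzip bytes available after each part, then the final trailer bytes.

    With sync_each_part=False the compressor decides when to emit data, so a small
    first part normally yields little more than the gzip header. With
    sync_each_part=True a Z_SYNC_FLUSH pushes everything seen so far to the output.
    """
    comp = zlib.compressobj(wbits=31)  # wbits=31 selects the gzip container
    for part in parts:
        out = comp.compress(part)
        if sync_each_part:
            out += comp.flush(zlib.Z_SYNC_FLUSH)
        yield out
    yield comp.flush()  # finish the stream


if __name__ == "__main__":
    head = b"<html><head><title>t</title></head>"
    body = b"<body>rest of the page</body></html>"
    for sync in (False, True):
        first = next(gzip_stream([head, body], sync))
        usable = zlib.decompressobj(wbits=31).decompress(first)
        print("sync flush:", sync, "| bytes the browser could decode so far:", len(usable))
```

Flushing after every small write costs some compression ratio, so flushing only at a few meaningful points (for example right after the head) is probably the better tradeoff.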

After a ton of messing around with mod_gzip i eventually just disabled it for html/php pages and used php’s ob_start(ob_gzhandler) with multiple flushes.
However, with that done, i see that chunked encoding indeed does work, but found that it doesn’t really help my start render time. What does seem to help my start render time is a 5 second sleep after the first flush. To me this is quite weird and can’t see why that is the case:

Page with no sleep delay-
http://www.webpagetest.org/result/100621_ec1e4d7da004186b26d5ddc74c6de25c/

Start Render Screen (1.461 seconds):

Exact same settings and everything but with a 5 second sleep delay-
http://www.webpagetest.org/result/100621_459c61de5b4e44dc16d0c7952c68c46a/

Start Render Screen (0.908 seconds):

As you can see, just adding the 5 second delay shifts the start render time from 1.461 seconds to 0.908 seconds. If you look further, you can see that the output of the longer render time includes product images, whereas the second is only the base page skeleton and is outputted sooner (as it should).
So, any ideas on what would cause this?

On the plus side, my optimizations seem to hold across multiple product categories with the same amount of products. Wasn’t sure if that would be the case given the specific way i’m maxing out the connections (spreading out the amount of images assigned to each domain to maximize each one).
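For readers wondering what spreading images across domains might look like, here is a rough Python sketch. It is an assumption about the general technique, not the actual code used on the site: assign_hosts and the example.com hostnames are made up, and the round-robin rule is just the simplest way to keep the per-domain counts even.

```python
def assign_hosts(image_paths, hosts):
    """Spread image paths evenly over the given hostnames, round-robin by position."""
    per_host = {host: [] for host in hosts}
    for index, path in enumerate(image_paths):
        host = hosts[index % len(hosts)]
        per_host[host].append("http://%s/%s" % (host, path.lstrip("/")))
    return per_host


if __name__ == "__main__":
    images = ["/img/p%d.jpg" % n for n in range(1, 6)]
    shards = assign_hosts(images, ["img1.example.com", "img2.example.com"])
    for host, urls in shards.items():
        print(host, len(urls), urls)
```

One caveat with assigning by position: if the product order changes, the same image can end up on a different hostname and miss the browser cache. Assigning by a hash of the file name avoids that, at the cost of a less even split.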

I am still not sure why the delay helps, but i found out that i can reduce my start render time if i move my initial output flush a bit further up the stack:
http://www.webpagetest.org/result/100621_23ee88ddcaa8406dfc1bbaecfc329380/

The test above simply had the .css output above anything else at an earlier point in the code, which of course makes the code invalid since it is outputted before the doctype declaration… but i can simply move the doctype up too and be good.
Though the odd thing is in order to achieve that start render time i actually had to slow down the processing by 20 ms! If i remove the sleep delay my start render time returns to 1.300 seconds, but if i keep the delay my start render time decreases to under 1 second… It’s the weirdest thing ever. The only thing i can guess is that the browser needs some time to process the css before starting the render, which the delay gives it a chance to do before the cpu is bogged down by the rest of the page.

Be a little careful not to over-optimize for the webpagetest testers as you might impact real users negatively (particularly when adding sleeps). AFAIK, IE basically refreshes the UI when it gets some idle time and if it is busy laying out the page or queuing things up it gets pushed back. That would be my most likely guess as to why the render gets delayed in the streamlined case - the resources are coming down frequently enough to prevent it from refreshing the screen.

Well, does webpagetest accurately behave like a general user IE install? Because even if the above is the case, if the delay does give it a chance to output to the screen earlier, then to me that behavior is welcomed even at the minor expense of a 20 ms delay for the total load time. To me a 20 ms delay for load time vs. 500 ms increase in start render is a good tradeoff.
But you’re correct, i wouldn’t add any more delay than that, i actually originally started with a 200 ms delay and worked my way back to find the minimum delay needed to achieve the same results.

Yes, WebPagetest runs just like a normal user but a “single” normal user with a very specific hardware and network configuration. At a minimum, test it from the different locations and at different bandwidth settings to make sure it is consistent and not just specific to the hardware running at the Dulles location.

Yes, i’ve been wanting to do that for a while, but isn’t Dulles still the only test location that supports the setDns scripting? I did test it at the Dulles location at different speeds and got the same start render time improvements.

All of the locations besides San Jose and China SHOULD have the SetDNS scripting code now. I hope to get those last two updated when I get back from Velocity.

I tried MaxCDN and w3-total-cache but it interfered with my Thesis wordpress theme and the site would not load correctly, so I went back to my hyper cache and cron-job for resetting the cache once a day.

I would love to get my start render time down, but what is the most beneficial way to do so, and is it possible to improve the other areas when a lot of my site comes from a 3rd party that displays individual real estate property pages, which get indexed by Google on my site?

My latest test results… WebPageTest Test - Running web page performance and optimization tests...

Any suggestions for improvement will be greatly appreciated :D.

Not sure if you meant to post your reply in this topic. But what problems did you have with maxcdn and wordpress? I’ve successfully used it on our wordpress install just fine.

Sorry for going a little off topic about my CDN issue. I just wasn’t sure if that would have helped my start render time, and I wanted to mention that I already tried that one. I am sure MaxCDN is great, but the one thing that has really helped my site was the cron job I set up with hyper cache, so I really wasn’t all that keen to deactivate that and go to w3-total-cache… For some reason the minify plugin makes all my scores go down too, so I am not sure why I get so much conflict with things like MaxCDN and the minify plugin… my pages and Thesis theme don’t display correctly with them.

Oh ok, sorry i just thought you had accidentally posted the reply in this topic but meant to post it in another (happens in forums a lot more than you’d think), so just ignore my above reply.
As for your issue, it may be true that those things together didn’t make your page work, but that is why you should try one optimization at a time. For example, your caching method is a completely separate topic from maxcdn. So i am not sure if maxcdn was the cause of your issues and i’d recommend giving it another try, it seriously helped lower my page load times. You should use it for any images you may have and, if possible, for your theme’s css file (using two different pull zones for better parallelization).

Fundamentally you need to get those css and js files combined into one of each to get your start render times down. W3 is one solution that will do it for you and there may be others but that’s really the only optimization you should be looking at initially. You can try looking for another plugin that does the combining but still plays nice with hypercache - maybe PHPSpeedy: http://aciddrop.com/2008/12/15/php-speedy-wp-051-recommended-upgrade/
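To make "combined into one of each" concrete, here is a bare-bones Python sketch of the idea. It is not how W3 or PHPSpeedy do it (those also handle minification, caching and rewriting the page); combine_sources is a made-up helper and the file names are placeholders.

```python
def combine_sources(named_sources):
    """Concatenate (name, text) pairs in the given order into one file body.

    Order matters: later css rules override earlier ones, and scripts may depend
    on ones loaded before them, so keep the order the page originally used.
    """
    parts = []
    for name, text in named_sources:
        parts.append("/* %s */\n%s" % (name, text.rstrip("\n")))
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    combined = combine_sources([
        ("reset.css", "body { margin: 0; }\n"),
        ("theme.css", "h1 { color: #333; }\n"),
    ])
    print(combined)
```

The result would be written out once (for example from a cron job) and referenced with a single link tag. Watch for relative url(...) paths in the css, which can break if the combined file lives in a different directory than the originals.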