Start Render Time

I’ve been doing a complete site redesign for an ecommerce site with the main goal being speed. Considering that a lot of the load time for a site like this is spent loading images, most of which render offscreen, I feel the start render time is the most important figure next to completion time.
However, I am at a loss as to why my start render time is so high. I just tested and found that the current live site has a start render time that is, on average, 300 ms faster than the new site design! I am not sure why, considering the new site is structured a lot better than the old one, yet the start render is triggered later…
http://www.webpagetest.org/result/100612_4c7e6d155d03f0c19bd577353d6fb315/1/details/

There is a test of the new site running. As you can see, the contents of the HTML and the CSS are downloaded long before the start render is triggered. But why is that? Shouldn’t the browser only need the HTML source and CSS to start rendering? The only other items that follow are product images, which should have no bearing on the start render time.
So, does anyone have any ideas?
Thanks

I’d take a look at your JavaScript. Do you have any inline? It looks like the CPU spikes at ~1.1 seconds and stays really busy for quite a while. If I had to guess, I’d say that is what’s causing the render start to be delayed.

Dynatrace Ajax Edition is a really good profiler for javascript in IE.

I don’t have time to look right now but I’ll look a bit closer later this evening.

-Pat

Thanks, a closer look would be appreciated. But no, I have no inline JavaScript in the HTML source other than a quick check to see whether jQuery has loaded, and that should only execute if the browser is IE 6. So I am not sure where that spike is coming from.

Edit: Keep in mind when checking it out further that the page link for the test will only lead you to the old site. This test was done as a scripted test pointed at our test server. If you want the exact test server IP I can PM it to you.

Edit 2: As a test I completely removed the entire IE conditional chunk of code; it had no effect on the start render time, so that’s not the cause.

Keep in mind that the start render time is affected by server-side latency. I’m not sure if you are serving static HTML, but on the off chance that you are rewriting your URLs and there is some server on the back end doing the crunching, that could affect the start render time. Especially if you are on a dev box and the server is less powerful. You have a relatively long time to first byte if this is just static HTML, so I would guess it’s either your server or the network.

How can server-side latency affect the start render time after the HTML and CSS have been completely downloaded? I understand server-side latency contributes to things such as time to first byte, but how would URL rewriting affect the start render time once the HTML and CSS have already been downloaded? I ask because the URL rewrite, by definition, is processed before the HTML can be downloaded, and is thus part of the time to first byte. However, if you look at the test output, the HTML and CSS finish downloading about 500 ms before the start render kicks in, and all HTTP requests after those two elements and before the start render are product images, which should have no effect on the start render time.
So, to be clear, I am wondering why there is a 500 ms delay between the HTML and CSS finishing their downloads and the actual start of rendering.
As a side note, I am using a duplicate account on one of our live servers. We have two servers for load balancing (using the least-connection method) and failover, so the performance shown by the test should reflect real-world performance pretty accurately.

I could be wrong, but my understanding is that the start render time is a fixed point in time measured from the initial request - this means that if your server latency goes up by 1 second, your start render time will go up by one second.

In other words, the request is made at T = 0, maybe the server finishes crunching at T = 0.6, the JS/CSS downloads finish at T = 1.5, and thus the start render time is listed at T = 1.5, or 1.5 seconds. I would imagine that the start render time you are seeing (1.485s) includes the TTFB.

Your question is valid though, if you are seeing a 500ms delay between the end of the HTML download and the start of the render that wasn’t there before then that is a problem. Unfortunately without access to the site or to a waterfall chart from the original site it is difficult to do further troubleshooting. Usually this kind of delay shows up because of JavaScript, but you say you don’t have any inline JS and the only external request is much farther down the page. I think the bottom line is that we need access to the code in order to troubleshoot further.

Yeah, we may have to wait for pmeenan’s word on this one, but when I read the “content download” figures and their associated times, I interpret them as the time each file took to download. Reading the waterfall that way, both the HTML and CSS have been requested (which includes the time to first byte) and downloaded fully by around 900 ms. Namely:
HTML: DNS lookup is 54 ms, the initial connection is 89 ms, the time to first byte (mostly server processing) is 466 ms, and the content download is 144 ms.
CSS: the initial connection is 137 ms, the time to first byte is 211 ms, and the download is 2 ms.
Which means that from 900 ms to 1453 ms I don’t know what it’s doing, as at that point server latency should have no effect on the basic rendering of the HTML structure and its CSS styles. But who knows.

Yes, quite true. And just to make sure, I ran a test without any external JavaScript and it still had pretty much the same start render time.

Yes, if you don’t get any HTML back from the server, there’s nothing to render :slight_smile:

In this case I got a look at the test site, and it looks like it uses a nested table structure for the site layout (and a non-fixed table layout), so the CPU spiking, and likely the start render delay, comes from the browser calculating and re-calculating the layout. Moving to more of a div-based layout (except where it really is tabular data) will help significantly, and if at all possible, moving to a fixed table layout with all dimensions specified will also help.

Run a video test and you’ll see the table layout changing as the page loads, too, so a fixed layout will yield a much better user experience.
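For reference, a fixed table layout just means setting `table-layout: fixed` and giving the table and its columns explicit widths, so the browser can lay out each row as it arrives instead of re-flowing the whole table once all the content is in. A minimal sketch (the widths here are made-up examples, not taken from the actual site):

```html
<!-- With table-layout: fixed the browser sizes columns from the table
     and <col> widths, not from the cell contents, so it doesn't have
     to wait for (or re-calculate on) every row that arrives. -->
<table style="table-layout: fixed; width: 960px;">
  <col style="width: 320px;">
  <col style="width: 320px;">
  <col style="width: 320px;">
  <tr>
    <td>Product 1</td>
    <td>Product 2</td>
    <td>Product 3</td>
  </tr>
  <!-- ...more rows render progressively as they download... -->
</table>
```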

The start render time is actually the time the browser draws the first bit of content to the screen (it’s not calculated; it is measured by hooking the drawing APIs and waiting for the browser to draw something that isn’t a white background).

By definition the browser can’t render anything until it has received something from the server so TTFB on the base page is critical to start render but not the only contributor. It needs to completely parse the head and then calculate the layout before anything can show up on the screen.

Oh, OK, that makes sense then. I will experiment with a div-based layout and see how that affects the render time. The table-based layout is what’s left over from the old site structure; since it fit well enough into the new one I didn’t see a need to change it, but an improvement to the render time is definitely a good reason. Thanks again for the help!

If you don’t want to overhaul the layout you can stay with the table layout but switch to a fixed layout and specify the dimensions (though if you are overhauling and modernizing the site, switching to a css-positioned layout is a good idea).

Yeah, I’ll probably change it to use divs. It’s the only thing still using tables at this point, if I remember correctly, and it’s generated by the product listing module, so it’s easy enough to simply change the module’s output and have it change all over the site. Though I am not sure how to lay out an arbitrary number of products, three per row, using divs, but I’m sure I can figure it out.
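For what it’s worth, one common way (in current CSS) to stack an arbitrary number of products three per row without tables is to float fixed-width divs inside a container and let them wrap naturally. The class names and widths below are just made-up examples:

```html
<style>
  /* overflow: hidden makes the container wrap its floated children */
  .products { width: 960px; overflow: hidden; }
  /* 300px + 10px right margin = 310px; three fit in 960px, the 4th wraps */
  .product  { float: left; width: 300px; margin: 0 10px 10px 0; }
</style>

<div class="products">
  <!-- emit one div per product; every fourth one starts a new row -->
  <div class="product">Product 1</div>
  <div class="product">Product 2</div>
  <div class="product">Product 3</div>
  <div class="product">Product 4</div>
</div>
```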

One additional note on the subject of tables: Marissa Mayer from Google gave a talk last year at Velocity saying that “tables are purely evil,” since the browser needs to wait for the closing table tag before it can start rendering anything. You can view her talk and get more information here:

http://velocityconference.blip.tv/file/2290442/

Thanks, that video is quite helpful. Though after looking at the code that generates the product listings, I am not sure I can fit a change to a div-based layout in by my deadline; it seems the code would have to be completely redone to work with a div-based output scheme. However, it looks like I can still get around that in a similar way to Google, splitting one large table into multiple tables so the output is rendered in table chunks rather than one big table. I have already experimented with specifying exact table and cell dimensions, and that has indeed helped with the start render time, even with a table layout.
However, I’ve been researching and see that chunked encoding may enable progressive page rendering by outputting the HTML document as it’s being generated. That way one could output the entire head block, allowing downloads to start for items in the head like the main .css file, while still processing the lower half and outputting it when done. This could definitely help speed up the start render time. However, what little info I can find says it should be enabled by default in Apache, yet looking at my HTTP headers I see no mention of chunked encoding. Do you know of any info on enabling chunked encoding if it isn’t enabled? I also wonder whether it’s compatible with gzip encoding; it could be that mod_gzip is buffering the output for compression and thus prevents chunked output. I also messed with PHP’s flush command, which should have a similar effect, but got no results either.

Do you control the server or are you on shared hosting? I haven’t had much luck getting it to work on my shared hosting provider but it worked fine on a stand-alone install of apache 2.2 with mod_php.

For chunked encoding to be useful you need to flush the document out early from whatever code you are using to generate the pages (PHP, from the sound of it). Then there’s the matter of getting it to actually work through all of the parts of the server pipeline. Recent builds of Apache and mod_deflate both work with chunked encoding, but it’ll probably take a fair bit of experimentation to get it working. They go hand in hand, and it doesn’t help unless you do both (in PHP you need more than just flush(); you need to look at ob_flush() as well).
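As a rough illustration (this is a minimal PHP sketch, not the site’s actual code; the stylesheet path and the render_product_listing() function are made up), an early flush right after the head might look like:

```php
<?php
// Emit the entire <head> first so the browser can start fetching the CSS
// while the server is still generating the rest of the page.
echo '<html><head>';
echo '<link rel="stylesheet" href="/css/main.css">'; // example path
echo '</head><body>';

// Push everything generated so far down the wire.
ob_flush(); // flush PHP's output buffer, if output_buffering is on
flush();    // flush the SAPI / web server buffer

// ...now do the slow work (database queries, building the listing)...
echo render_product_listing(); // hypothetical function for the slow part

echo '</body></html>';
?>
```

Note that anything else buffering in the pipeline (mod_gzip/mod_deflate, a reverse proxy) can still hold the flushed bytes back, which is where the experimentation comes in.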

What it will buy you is reducing the time to first byte on the base page (assuming the time to generate the page and make your back-end calls is not insignificant) which will help with all of your other times.

I have full control of the servers, so I should be able to get it working no matter what configuration is needed. Though with PHP I thought ob_flush() was only needed if you were manually buffering the output in PHP, but I’ll keep that in mind in case it is needed.
I’ve run page generation times in the past and found that most pages take around ~0.095 seconds, though some take a bit longer (0.13 or so) for a category page with 50 products. So there is some benefit to be had. I can also try to optimize the backend code to generate the page faster, but when you’re dealing with 50 products per page it gets a bit tough, though small changes do scale nicely in such situations, since a small performance increase is multiplied 50-fold.
At what point in page generation speed do you think chunked encoding would stop being beneficial, or become negligible? Currently the average page generation time is about 20% of the average time to first byte.

Since you control the servers the only reason NOT to do it would be if you need to be able to modify the response headers later in the page (setting cookies, etc) since that becomes unavailable once you flush.

Stoyan has a little bit of a write-up with a few things to check here: http://www.phpied.com/progressive-rendering-via-multiple-flushes/

It could be that a lot of people disabled deflate and used php’s output buffering to do the gzip compression (when they couldn’t get it working on shared hosting).

http://www.webpagetest.org/result/100619_a3f3da1c99a7fed5ec494ae1e5c77991/1/details/

With the use of MaxCDN, the extra domains/subdomains, and properly maxing out the connections between them, I was able to get the start render time to 1.287 seconds and the page load time to 1.980 seconds. That’s a pretty good number considering the page I’ve been testing is our site’s average worst case, where a category loads with at least 50 products showing.
As for progressive rendering, if you look at the waterfall I think the PHP flushing is actually working. The main HTML doc hadn’t completely loaded, yet the .css file was already requested from the CDN. To me this is good enough, and a reason not to implement server-level chunking, as I personally prefer the granularity of flushing output to the user from the application. This way I can try to flush the output at more optimal times than the server might.
On a random note, is there a way to speed up the initial connections? Or is it mostly just network delay that I can’t help?

Initial connections are all network propagation and short of actually distributing your front-end servers there’s not much you can do to speed it up. A CDN makes the connections for static objects faster by putting hosts physically close to the user (speed of light problem, that’s the only solution).

The flushing is actually not working, from the looks of it (though it would be worth getting it working). The browser always starts parsing the HTML as soon as it starts coming in, which is what you are seeing. What an early flush gets you is a significantly reduced first byte time (theoretically it could be as fast as the connect time, but in practice it will always be a little longer).

If you don’t see the chunked encoding in the header then the early flush didn’t work. The reason is that without chunked encoding the web server needs to know the full size of the response before sending it (and as a result, nothing gets flushed until the whole response is ready). With chunked encoding it can stream the response as it’s ready (allowing for early flushing).
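To illustrate the difference on the wire (a hypothetical response with made-up chunk sizes): with chunked encoding there is no Content-Length header; instead the body is sent as hex-length-prefixed chunks, terminated by a zero-length chunk:

```
HTTP/1.1 200 OK
Content-Type: text/html
Transfer-Encoding: chunked

1a6                         <- chunk size in hex (0x1a6 = 422 bytes)
<html><head>...</head>      <- first chunk, flushed early
3e8                         <- next chunk size (0x3e8 = 1000 bytes)
<body>...rest of page...    <- sent once the back end finishes
0                           <- zero-length chunk ends the response
```

Each size line and chunk is actually delimited by CRLF; the arrows above are just annotations.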

Nice work btw, the progress is great.

Oh, I see. Well, in that case I have to admit I am stuck. Again, what little info I can find says that chunked encoding should be enabled by default, yet no matter what type of application-level flushing I try (I’ve tried every method I could find), nothing happens. Though some things may be working against the use of chunked encoding. For one, we’re using the Zeus load balancer/reverse proxy software, which should support chunked encoding, but who knows. Then we have an older version of Apache; we have yet to move to Apache 2 due to other dependencies, though from what I’ve read our version should still support chunked encoding. I guess I’ll have to investigate everything in the chain and make sure.
On a side note, I believe Google is flushing their output early with chunked encoding, and while I see that no Content-Length header is sent in their case, I don’t see a Transfer-Encoding: chunked header being sent by them either. Are you sure the header is necessary for chunked encoding, or is it just that no Content-Length can be given, so the output can be streamed?