Velocity Wishlist

Well you did ask at the BOF session if anyone had any suggestions :smiley:

Here are some (in no particular order) that occurred to me as a result of the talks at Velocity:

  1. Repaints & Reflows: Is it possible to analyse JS/CSS for patterns that will cause a repaint or a reflow? Rather than instrumenting the browser to capture repaints and reflows, would pattern matching (similar to the jQuery select-by-ID test) provide a decent enough indicator of the number of repaints/reflows that might occur?

  2. Include a time to first byte weighting in the single-number overall score. I appreciate that you were being pushed, somewhat grudgingly, to provide a single number as a performance indicator at the BOF. But if such a number is going to be provided, it would be good to take the website’s backend performance into account so that it indicates overall performance. YSlow/Page Speed etc. only focus on the frontend with respect to their score, so a 7 second page that has a 5 second backend response time can get a really high score if it just optimises the frontend, leaving you with, say, a 6 second overall load time for the page…which is hardly optimal. If there is a large backend response time, then suggest early flushing.

  3. Detect early flushing. Progressive rendering was a big thing this year, and it builds on the previous best practice of just flushing the head of the HTML document. So there are really 2 parts. First, detect whether the HTML document head is flushed early, as this can help jumpstart TCP slow start as well as triggering early download of external resources and, if you include the banner of the page, giving early visual feedback as well. Second, if the whole HTML body is wrapped in a root div and is of a certain size/complexity, suggest breaking the page into sections and flushing each section. Since the root div will block rendering of the body until it is complete, this gives you progressive rendering of the page and early visual feedback for the end user.

  4. Nested DNS CNAME hack. Check DNS requests and, if 2 or more records point to the same IP, suggest the nested DNS CNAME hack, which would collapse the multiple DNS requests into one.

  5. Check for a complete SSL certificate chain. If a complete chain is not provided, the browser has to go off and fetch the intermediate certificates, causing delays.

  6. Check for SSL OCSP stapling. This relates to the length of time that a certificate can be trusted (mostly 7 days). Usually the browser has to go and fetch this from the CA; however, it is possible for the server to prefetch this and provide the information to the browser with the certificate, preventing a delay. You need the OCSP response for each cert in the chain.

  7. Check SSL certificate chain size. It should be small enough to fit in the initial TCP congestion window. If it is bigger then an extra RTT will be required before the SSL handshake can complete. This is bad for sites with high latency.

  8. On the note of latency, is it worth calculating the latency, and from that the max effective bandwidth, between the various domains of a webpage and webpagetest.org? This would give an idea of poorly located domains, give strength to arguments for CDNs, etc.

  9. Support for grabbing a timing event created by JS on the webpage to show Time To Interactivity. It was another recurring theme that people are looking at Time To Interactivity (other names were also used) as a marker for when the page is usable by the end user. Building on the principles of Steve Souders’ Episodes, it would be good to have the various performance metric libraries create an industry standard metric event on the webpage that would capture the website-defined Time To Interactivity (pie in the sky anyone :-)). Time To Interactivity (or whatever you want to call it) seems like a good candidate for the first metric that everyone can agree on. Support could be added on webpagetest.org for a defined JS event that people could then include in their metric package if they wanted to see the information displayed on webpagetest.org when doing their testing.
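
To make suggestion 4 a bit more concrete, here is a rough Python sketch of the check: group the hostnames a page uses by the IP they resolve to, and flag any IP shared by 2 or more hostnames. The `shared_ip_groups` helper and the lookup results are made up for illustration; a real test would fill the mapping from the DNS lookups it already performs.

```python
from collections import defaultdict


def shared_ip_groups(resolved):
    """Group hostnames by IP and keep only IPs used by two or more hostnames."""
    by_ip = defaultdict(list)
    for host, ip in resolved.items():
        by_ip[ip].append(host)
    return {ip: sorted(hosts) for ip, hosts in by_ip.items() if len(hosts) >= 2}


# Hypothetical lookup results for the hostnames referenced by one page.
resolved = {
    "www.example.com": "192.168.0.1",
    "static1.example.com": "192.168.0.1",
    "static2.example.com": "192.168.0.1",
    "cdn.example.net": "10.0.0.7",
}

for ip, hosts in shared_ip_groups(resolved).items():
    print(f"{ip}: {', '.join(hosts)} are candidates for a nested CNAME chain")
```

Only hostnames that share an IP are reported, so a CDN hostname on its own IP would not trigger the suggestion.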

thoughts?

Cal

“nested DNS CNAME hack”

What’s that?

Thanks

Neil

The nested DNS CNAME hack uses nested aliases to provide information
for multiple domains in a response to one DNS lookup.

This is for domains that point to the same IP address

Normal Setup (using pseudo syntax)

using a records

www.example.com → 192.168.0.1
static1.example.com → 192.168.0.1
static2.example.com → 192.168.0.1
metrics.example.com → 192.168.0.1

or using aliases (CNAMEs)

www.example.com → 192.168.0.1
static1.example.com CNAME www.example.com
static2.example.com CNAME www.example.com
metrics.example.com CNAME www.example.com

nested CNAME hack setup:

www.example.com CNAME static1.example.com
static1.example.com CNAME static2.example.com
static2.example.com CNAME metrics.example.com
metrics.example.com → 192.168.0.1

So a DNS lookup for www.example.com will give you the results for static1, static2 and metrics as well. This means that potentially 4 DNS lookups have been collapsed into 1. Also, you can enter the stack at any point and get your required result (e.g. a direct DNS lookup of static2.example.com will still work).
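
As a toy illustration only (not a real resolver), the nested setup above can be modelled in Python as a dictionary of records, with a hypothetical `resolve` helper that follows the aliases until it reaches the A record. Every name it passes through is one the client now has an answer for.

```python
def resolve(name, records, max_depth=10):
    """Follow CNAME records from `name` until an A record is found.

    Returns (chain, ip), where chain lists every name visited on the way.
    """
    chain = [name]
    for _ in range(max_depth):
        rtype, value = records[chain[-1]]
        if rtype == "A":
            return chain, value
        chain.append(value)  # CNAME: keep following the alias
    raise ValueError("CNAME chain too long or looping")


# The nested CNAME setup from the post, as (record type, value) pairs.
records = {
    "www.example.com": ("CNAME", "static1.example.com"),
    "static1.example.com": ("CNAME", "static2.example.com"),
    "static2.example.com": ("CNAME", "metrics.example.com"),
    "metrics.example.com": ("A", "192.168.0.1"),
}

chain, ip = resolve("www.example.com", records)
print(chain, ip)

# Entering the chain part way down still ends at the same address.
print(resolve("static2.example.com", records))
```

Starting from www.example.com visits all 4 names, while starting from static2.example.com only visits static2 and metrics, which is why the hostname requested first matters.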

Cheers
Cal
it automagically added in the http:// parts…that wasn’t me :slight_smile:

Does this slow down the initial DNS lookup at all? My gut reaction is that it wouldn’t, or that it would be a vanishingly small change (a couple ms) but if you have data on it that would be great. Thanks.

I don’t have any data on this unfortunately. It was a new hack that was presented at Velocity that I thought was really cool. It was presented by Tom Hughes-Croucher of Yahoo in this talk http://en.oreilly.com/velocity2010/public/schedule/detail/13096

The response for the DNS request should be able to fit in one TCP window from the server unless you have a big stack. But I am surmising here; I have no data. Even if the initial DNS lookup is slowed slightly, you still gain the benefit of having removed 1 or 2 other DNS requests, which IMO will far outweigh any slight slowdown in the initial request.

Cheers
Cal

I could see where it would be really helpful in the case where you just have alternate CNAMEs for your main domain in order to do sharding for static resources. You could pre-populate the static domains in the original response.

If you are using a CDN or alternate hosts (i.e. a different IP) for the static domain then things get a little fuzzier. It probably wouldn’t hurt anything but for it to be of any benefit you would need to make sure the first external request was at the top of the chain (likely a css file) so it takes some coordination. Additionally, if things were looked up in parallel before then you might not see any benefit. The perfect request for doing the pre-caching with is a blocking request at the beginning of the waterfall (the kind that we try to get rid of).

The DNS responses are UDP, so you don’t have to worry about TCP slow start or any windows, just the packet sizes to prevent fragmentation, and a DNS record with multiple aliases chained together should be really quick. Most DNS servers will return all of the records in a single response (and it will be cached for the TTL along the chain), so it shouldn’t add any measurable time to the lookup.
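
For a rough feel of the packet size, here is a back-of-the-envelope Python sketch that adds up the wire-format size of a response carrying the whole example chain. It assumes no name compression, so it is an upper bound (real servers compress repeated names), and it compares against the classic 512-byte limit for DNS over UDP without EDNS0. The helper names are made up for illustration.

```python
def encoded_name_len(name):
    """Bytes needed for a domain name in DNS wire format, without compression."""
    labels = name.rstrip(".").split(".")
    # Each label costs one length byte plus its characters; +1 for the root label.
    return sum(1 + len(label) for label in labels) + 1


def response_size(question, answers):
    """Upper-bound size in bytes of a DNS response carrying `answers`.

    answers: list of (owner, rtype, value) with rtype "CNAME" or "A".
    """
    size = 12  # fixed DNS header
    size += encoded_name_len(question) + 4  # question name + QTYPE + QCLASS
    for owner, rtype, value in answers:
        rdata = 4 if rtype == "A" else encoded_name_len(value)
        # 10 bytes per record for TYPE, CLASS, TTL and RDLENGTH.
        size += encoded_name_len(owner) + 10 + rdata
    return size


answers = [
    ("www.example.com", "CNAME", "static1.example.com"),
    ("static1.example.com", "CNAME", "static2.example.com"),
    ("static2.example.com", "CNAME", "metrics.example.com"),
    ("metrics.example.com", "A", "192.168.0.1"),
]

size = response_size("www.example.com", answers)
print(size, "bytes; fits in 512-byte UDP limit:", size <= 512)
```

For the 4-name example this comes to 220 bytes uncompressed, comfortably inside a single UDP packet, so the chain would have to be a lot longer before size became a concern.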

Very cool hack - I was really impressed when I heard about it.

Doh, I’m sure I know that! Schoolboy error by me :blush: