Community AMI (Windows Server 2008) for test agents (California and Virginia)
I notice a huge gap in the waterfall which is not present in some of the old builds. This is IE-specific. The gap doesn’t show up in Firefox or Chrome.
I have attached the screenshots. I updated the WPT test agents (latest version as of May 14th) and that doesn’t help. Is anyone aware of what the root cause could be?
[hr]
Both screenshots are from the same URL, same test environment and same browser (IE10). It just got worse.
We got in touch with our CA as the OCSP verification was slow. It consistently took 400-500 ms. We compared the OCSP verification time with other CAs and shared the findings with the vendor.
On the physical thinkpads it looks like the gaps are smaller but still way longer than I’d expect: http://www.webpagetest.org/result/140929_BE_19QJ/1/details/ (the gaps don’t look big enough to be validation checks but the SSL time is still really long)
The results from a Thinkpad look way more reasonable.
The gap for www.apartmentlist.com is about 10x smaller than in the test I linked above, and is about 200 ms. I would expect that to be normal OCSP delay, since www.apartmentlist.com does not support OCSP stapling (due to Heroku not supporting it at the moment). The gaps for cdnX.apartmentlist.com are even shorter (presumably because CloudFront supports OCSP stapling).
However, what I see from the Denver location looks terrible. We have RUM (Real User Monitoring) enabled via NewRelic on our app, and we see that a large number of users are experiencing similar performance in real life (20+ second page loads).
At first I thought there was something wrong with our certificate (we had one from RapidSSL), but I replaced it with one from Comodo (total shot in the dark!) and absolutely nothing has changed in the waterfall or in these gaps.
Any idea what I can do to get to the bottom of it?
Do you have a support link or ticket number about OCSP stapling not being supported for custom domains on CloudFront? SSL Labs shows OCSP stapling enabled on cdn0.apartmentlist.com (which is powered by CloudFront). Perhaps the issue is already fixed?
Yes we have a ticket number and the latest update says AWS is investigating the issue and their product teams are on the case. It is an internal support case so you might not be able to view the ticket history/details.
What you say is correct. For apartmentlist.com OCSP stapling works even for custom domains. In our case OCSP stapling doesn’t work for custom domains. We cross-checked both URLs in SSL Labs.
We also shared this finding with AWS (that OCSP stapling works for apartmentlist.com).
I work with Sundeep, and we finally figured out what was wrong.
The AWS support team checked on their side and confirmed that everything worked well, but the way we did our tests for OCSP stapling was actually flawed.
OCSP stapling happens at the CloudFront edge level, so every node in an edge location needs to do it on its own. The first request to a node immediately gets a non-stapled answer; the node then fires an OCSP request to our CA and caches the answer, which is used for subsequent requests hitting that same edge node. (The nodes currently don’t share these caches, but I filed a feature request asking them to use shared storage for that content, like they do for static content.)
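To illustrate why a short test can make stapling look broken, here is a rough simulation of the behaviour described above. It is only a sketch under assumptions of mine: the node count is made up, requests are routed uniformly at random, and a node counts as warm right after its first hit (real nodes fetch asynchronously and cached answers expire).

```python
import random


def simulate_stapled_fraction(num_nodes, num_requests, seed=0):
    """Fraction of requests that land on a node with a cached OCSP answer.

    num_nodes is a hypothetical figure; the real number of CloudFront
    nodes behind an edge location is not something we know.
    """
    rng = random.Random(seed)
    warm_nodes = set()  # nodes that have already cached the OCSP answer
    stapled = 0
    for _ in range(num_requests):
        node = rng.randrange(num_nodes)  # assumed: uniform random routing
        if node in warm_nodes:
            stapled += 1          # warm node: response carries the staple
        else:
            warm_nodes.add(node)  # cold node: non-stapled reply, cache fills afterwards
    return stapled / num_requests
```

With many nodes and only a handful of requests, nearly every request lands on a cold node, so the fraction stays close to zero. It only climbs once the request count is well above the node count.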
During our test we only executed a relatively small number of requests, so we never hit the same edge node twice, and that’s why it appeared that stapling was broken. When testing you need to fire a few hundred requests before you consistently get stapled responses.
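For anyone who wants to repeat this kind of check, below is a minimal sketch that opens many TLS connections through `openssl s_client -status` and counts how many come back with a stapled OCSP response. The function names and the default of 300 attempts are my own choices, not anything from AWS, and it assumes the `openssl` binary is on the PATH. It also cannot tell which edge node answered, so it only shows the overall ratio.

```python
import subprocess


def has_stapled_response(s_client_output):
    """True if `openssl s_client -status` output shows a stapled OCSP response.

    Without a staple, openssl prints "OCSP response: no response sent".
    """
    return "OCSP Response Status: successful" in s_client_output


def probe(host, port=443, timeout=10):
    """Open one TLS connection and report whether the server stapled OCSP."""
    proc = subprocess.run(
        ["openssl", "s_client", "-connect", f"{host}:{port}",
         "-servername", host, "-status"],
        input=b"",                  # empty stdin so s_client exits after the handshake
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return has_stapled_response(proc.stdout.decode("utf-8", errors="replace"))


def stapled_ratio(host, attempts=300):
    """Fraction of `attempts` fresh connections that returned a stapled response."""
    hits = sum(1 for _ in range(attempts) if probe(host))
    return hits / attempts

# Example (needs network access): stapled_ratio("cdn0.apartmentlist.com")
```

A ratio that starts low and rises on repeated runs would be consistent with the per-node caching described above, while a ratio stuck at zero after a few hundred connections would point to stapling really being off.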
As for explaining the rest of the gap from our waterfall graph, we saw that it often happens that the CPU is maxed out while loading our app, for multiple reasons:
[list]
[*] sometimes multiple SSL connections are started at the same time, and they are quite CPU-heavy during the negotiation phase
[*] other gaps can be explained by heavy CPU use during the parsing of JavaScript and CSS content (our webapp code is around 400K when minified and compressed)
[/list]
When these overlap, the gap can grow even up to a few seconds, and it’s exacerbated if the testing machine is not powerful enough.