Google Name Server Problem

We’ve been using the San Jose test machine for months with relatively accurate results up until about 2 weeks ago. At that point we noticed high load times for certain images on the site and saw that these images were being pulled from our CDN’s (Akamai) server in Europe.

After going back and forth with Akamai, we discovered that the San Jose machine you guys are using resolves through Google name servers. According to Akamai, those name servers don't accurately reflect the physical location of the test machine, so images end up being pulled from remote locations.

View Akamai’s response here:

Retrieved tier 1 diagnostics results from the customer.
Found the historical mapping information for the customer.
Found out that the Name Server of the Third Party tools that the customer is using is a GOOGLE INC. Name Server, and hence such requests are getting mapped not very optimally (and out of the US at times).
Explained the talking points of the GOOGLE INC. Name Servers issue to the customer.
Customer confirmed that when they do the same test from another location on the test tool they get better performance.
Had the customer submit the tier 1 diagnostics via the test tool from that location also, and confirmed that the Name Server was not a Google Inc. Name Server and hence the mapping was optimal.
Explained to the customer to give this feedback to the Third Party Test Tool they are using.

One additional thing I should mention is that your newly added Los Angeles machine does not use Google name servers and is accurately pulling images from the correct locations and reporting the load times for the site that we’d expect to see.

Has anyone else using Akamai seen this issue?

Sorry, San Jose was using dedicated name servers until recently, when the (Savvis, I believe) servers lost their mind and tests started failing (followed by the machine itself going offline). I’ll see if we can switch back.

That said, if Akamai can’t geo-locate Google’s DNS, they have a very real problem for real users (not sure what the usage of Google’s DNS is, but I know it’s non-trivial). Google has a very nice architecture with anycast routing and nodes all over the place, so it works well for end users.

Akamai's DNS infrastructure, on the other hand, relies on being able to identify the physical location of the IP address of the customer's DNS server (Google's resolver in this case) from a database that they maintain, and then makes a routing decision based on that. If they ran a proper anycast DNS infrastructure of their own they wouldn't have to rely on lookups and a mapping database; the queries would route to the closest POP (or at least within the region if they didn't run DNS at every POP).
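To make the resolver-based mapping concrete, here is a minimal sketch of the idea, not Akamai's actual implementation. The prefix table, region names, and edge hostnames are all made up for illustration (the addresses come from the reserved documentation ranges). The point is only that the decision is keyed on the resolver's IP, so a wrong database entry for the resolver sends every client behind it to the wrong place.

```python
import ipaddress

# Hypothetical resolver-location database: network prefix -> region.
# These entries are illustrative only, not real mapping data.
RESOLVER_REGIONS = {
    ipaddress.ip_network("198.51.100.0/24"): "us-west",
    ipaddress.ip_network("203.0.113.0/24"): "europe",
}

# Hypothetical edge hostnames per region.
EDGE_BY_REGION = {
    "us-west": "edge-sjc.example.net",
    "europe": "edge-ams.example.net",
}

# Fallback when the resolver is not in the database.
DEFAULT_EDGE = "edge-default.example.net"


def pick_edge(resolver_ip: str) -> str:
    """Choose an edge server from the resolver's IP, not the end user's."""
    addr = ipaddress.ip_address(resolver_ip)
    for network, region in RESOLVER_REGIONS.items():
        if addr in network:
            return EDGE_BY_REGION[region]
    return DEFAULT_EDGE
```

With this table, a test machine in San Jose whose queries arrive from a resolver address that the database files under "europe" (say `pick_edge("203.0.113.53")`) is handed the European edge, no matter where the machine itself sits.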

I’ve seen issues where the actual lookup times for Akamai-backed domains took 300+ ms because of the way BIND 8 rotates through authoritative servers and Akamai was answering the lookups from Asia - and this will get exponentially worse with BIND 9.
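If you want to check lookup times and the returned addresses from a given machine yourself, something along these lines should work. It is a rough sketch using only the Python standard library; `time_lookup` is a name I made up, and the `resolve` parameter exists only so a different resolver function can be swapped in. Keep in mind the OS or local resolver may cache answers, so the first lookup is usually the informative one.

```python
import socket
import time


def time_lookup(hostname, resolve=socket.getaddrinfo):
    """Return (elapsed_ms, sorted unique addresses) for one lookup of hostname."""
    start = time.perf_counter()
    # Ask through the system's configured resolver; the port only shapes the result tuples.
    results = resolve(hostname, 80, proto=socket.IPPROTO_TCP)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in results})
    return elapsed_ms, addresses
```

Calling `time_lookup` with the CDN-backed hostname from two machines that use different name servers, and comparing both the timings and the addresses that come back, gives a quick sanity check on where each one is being mapped.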

Thanks so much Patrick - that explains a lot. One other thing I noticed around the same time is that we’re starting to see image compression scores range from F to B, where we’ve always seen an A score in the past - did something change here as well?

Any chance you have a before and after test? (Or if you PM me the URL I can check the test history.) There have been a few pagetest updates over the last couple of weeks but nothing that should have changed the image compression check. The underlying PNG, JPEG and GIF libraries may have changed with the upgrade to Page Speed 1.9, so I’ll have to look into it (though Page Speed itself isn’t used for the image checks).

I just cut the DNS back over to the ISP DNS servers, so if you re-run your tests they should hopefully localize back to where they were before for Akamai.

Thanks,

-Pat

Looks like there may be an image decoding problem with the library swap-out. I’m seeing a lot of cases where images fail because they couldn’t be decoded - looking into it now so I hope to have a fix rolled out later today.

Thanks,

-Pat

Yep, that’s exactly the problem we’re seeing. The test URL is http://www.marinedepot.com - I believe we started seeing the issue around Oct. 10th. We have tests from many weeks before and after that date that you could review.

OK, I just fixed the image decoding problem (fix one bug, introduce another - sigh). The fix should be deployed now so everything should be back to normal. Sorry for the inconvenience.

-Pat

Awesome - thanks Pat. Just ran some additional tests and everything looks great. Thanks again for all your help.