Chasing a Googlebot complaint about "Server error: Connection reset"

Google Help for Server error: Connection Reset ends with this intriguing advice, "If the problem persists, check with your hosting provider."

Yeah but what if you ARE the hosting provider? What exactly are you looking for, and where?

One useful step was to turn on Event Viewer logging for ALL of the reasons an AppPool can recycle. In IIS 7.5 this is done in the AppPool's Advanced Settings, under the Recycling group ("Generate Recycle Event Log Entry").

The second useful step was to use a regular expression to focus on the requests where Googlebot was getting redirects (301, 302) or client errors (404 and the other 40x codes).

googlebot([^\r]*)? (40\d|30[12]) \d \d \d{2,4}\r

With case-insensitive matching turned on (the log records "Googlebot" with a capital G), this matches an IIS log line such as:

2013-01-01 04:23:15 208.201.252.8 GET /Faqs/Domain.aspx - 80 - 66.249.74.91 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) 404 0 2 56

What I like about this regex is that it pulls out all the 40x codes plus 301 and 302 without showing me 304 (Not Modified).

Or to get just those annoying bounces (the 301 and 302 redirects):

googlebot([^\r]*)? (30[12]) \d \d \d{2,4}\r
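If you would rather script this than use a GUI tool, here is a rough Python sketch of both expressions. It is an adaptation, not the exact patterns above: the trailing \r is replaced with \s*$ so it works on lines whose endings have been stripped, and case-insensitive matching is switched on. The helper name `statuses` is made up, and the 301 and 304 sample lines are invented for illustration. Only the 404 line is the real one quoted earlier.

```python
import re

# Adapted from the expressions above: "\r" becomes "\s*$" so the pattern works
# on lines whose endings were stripped or normalised, and IGNORECASE lets
# "googlebot" match the "Googlebot/2.1" that IIS actually logs.
TAIL = r" \d \d \d{2,4}\s*$"
ERRORS_AND_REDIRECTS = re.compile(r"googlebot[^\r\n]* (40\d|30[12])" + TAIL, re.IGNORECASE)
REDIRECTS_ONLY = re.compile(r"googlebot[^\r\n]* (30[12])" + TAIL, re.IGNORECASE)

def statuses(pattern, lines):
    """Return the status code of every line that the pattern matches."""
    return [m.group(1) for m in map(pattern.search, lines) if m]

UA = "Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)"
PREFIX = "2013-01-01 04:23:15 208.201.252.8 GET "
sample = [
    # The real line quoted above.
    PREFIX + "/Faqs/Domain.aspx - 80 - 66.249.74.91 " + UA + " 404 0 2 56",
    # Invented lines, for illustration only.
    PREFIX + "/old.aspx - 80 - 66.249.74.91 " + UA + " 301 0 0 31",
    PREFIX + "/index.aspx - 80 - 66.249.74.91 " + UA + " 304 0 0 15",
]

print(statuses(ERRORS_AND_REDIRECTS, sample))  # ['404', '301']
print(statuses(REDIRECTS_ONLY, sample))        # ['301']
```

Like the original expressions, this assumes the substatus and win32-status fields are single digits, so a line with a two-digit win32 status would be missed.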

My tool of choice for regex searching is FuzRegex, which is FREE for searching, but you can use the above expression in many tools.

Timing: 5 seconds to find 103 matches in 66 files out of 351 scanned (490 MB of *.log files).
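For anyone without a regex search tool handy, the same kind of sweep over a folder of logs can be sketched in Python. This is only an approximation of what the search tool does, and I have not timed it against the numbers above. The function name `scan_logs` is hypothetical, and the pattern is the same adapted form (case-insensitive, \s*$ instead of \r) rather than the literal expression.

```python
import re
from pathlib import Path

# Adapted pattern: case-insensitive, and "\s*$" in place of the trailing "\r".
PATTERN = re.compile(
    r"googlebot[^\r\n]* (40\d|30[12]) \d \d \d{2,4}\s*$",
    re.IGNORECASE,
)

def scan_logs(log_dir):
    """Return (files_scanned, {path: match_count}) for *.log files under log_dir."""
    scanned = 0
    matches = {}
    for path in sorted(Path(log_dir).rglob("*.log")):
        scanned += 1
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with path.open(encoding="utf-8", errors="replace") as fh:
            count = sum(1 for line in fh if PATTERN.search(line))
        if count:
            matches[path] = count
    return scanned, matches
```

Pointing `scan_logs` at the IIS log folder returns the number of files scanned and a per-file count of matching lines, which is enough to see which sites and days Googlebot was being bounced on.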

Follow-up

Problem resolved. The requests in question never showed up in the IIS logs at all, so the redirects above were red herrings. The problem requests actually died inside IIS 7.5 without surfacing anything in the Event Viewer logs either. Wireshark confirmed that the HTTP requests did arrive at the server. The cause of this strange behavior was a custom ISAPI filter that worked in IIS 2 through 6 yet failed to perform status 301 and status 302 redirects in IIS 7, due to a slight change in the requirements for one of the ISAPI filter calls.

The key to finding this was accidental but obvious in retrospect: test URLs that might redirect while masquerading as Googlebot, using Firefox with User Agent Switcher. It is then plain that those requests fail (Firefox reports a connection reset) and that the IIS log knows nothing about the request at all.
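The same check can be scripted instead of done by hand in a browser. Here is a minimal Python sketch, assuming the standard library's urllib is acceptable for the job. The function names `build_request` and `probe` are made up, and the User-Agent string is simply the one from the log line above with the "+" signs turned back into spaces.

```python
import urllib.error
import urllib.request

# Googlebot's User-Agent as seen in the IIS log, with "+" decoded to spaces.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def build_request(url, user_agent=GOOGLEBOT_UA):
    """Build a GET request that presents itself as Googlebot."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def probe(url, timeout=10):
    """Fetch url as Googlebot and return a short description of the outcome."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            # urllib follows 301/302 itself, so report where we ended up.
            return f"{resp.status} at {resp.geturl()}"
    except urllib.error.HTTPError as exc:
        return f"HTTP {exc.code}"
    except ConnectionResetError:
        return "connection reset"
    except urllib.error.URLError as exc:
        # A reset while sending the request arrives wrapped in URLError.
        if isinstance(exc.reason, ConnectionResetError):
            return "connection reset"
        return f"failed: {exc.reason}"
```

Calling `probe` on a URL that should redirect and getting "connection reset" back, with no matching entry in the IIS log, would reproduce what Firefox showed.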
