Googlebot Refreshing Supplementals?
Weird 404 error email my site sent me yesterday (URL hidden to prevent linking to an adult domain):
HTTP_REFERER: [blank]
HTTP_HOST: www.domain.com
PHP_SELF: /fgdfgfert4534.html
REQUEST_URI: /NONEXISTENTURL.html
REMOTE_ADDR: 66.249.65.69
TIMESTAMP: 5/24/2006 9:15 PM
Quick explanation: I rigged my dynamic pages so that a request for “maroon-widget.html” returns a 404 and triggers an email if I don’t have “maroon widget” in my database.
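For anyone who wants to try the same trick, here is a rough sketch of the idea in Python. It is not the actual code behind my pages: `KNOWN_WIDGETS`, `slug_to_name`, and `handle_request` are made-up names, the set stands in for a real database lookup, and `notify` stands in for whatever sends the email.

```python
from datetime import datetime

# Hypothetical stand-in for the widget database.
KNOWN_WIDGETS = {"blue widget", "red widget"}


def slug_to_name(request_uri):
    """Turn '/maroon-widget.html' into 'maroon widget'."""
    slug = request_uri.rsplit("/", 1)[-1]
    if slug.endswith(".html"):
        slug = slug[: -len(".html")]
    return slug.replace("-", " ").lower()


def handle_request(request_uri, remote_addr, referer="",
                   notify=print, known=KNOWN_WIDGETS):
    """Return 200 for a known widget; otherwise report the miss and return 404."""
    if slug_to_name(request_uri) in known:
        return 200
    # Unknown widget: build a report like the email above and hand it off.
    report = "\n".join([
        "HTTP_REFERER: " + (referer or "[blank]"),
        "REQUEST_URI: " + request_uri,
        "REMOTE_ADDR: " + remote_addr,
        "TIMESTAMP: " + datetime.now().strftime("%m/%d/%Y %I:%M %p"),
    ])
    notify(report)
    return 404
```

Calling `handle_request("/maroon-widget.html", "66.249.65.69")` with the sample set above prints the report and returns 404, since “maroon widget” isn’t in it.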
The URL in REQUEST_URI isn’t linked from anywhere; it exists solely in the supplemental index. I’ve seen Yahoo do this kinda thing, but this week I’m starting to see Google do it too. I guess Google’s basically crawling my site from its own database instead of following links. Is this common behavior and part of a normal crawl, or is Google trying to clean up supplementals?
Looking up 66.249.65.69 in Google returns 208,000 gibberish results (mostly sites that display your IP on their pages). So I guess it’s just a regular bot, not a supplemental bot?
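A more reliable check than searching for the IP is a reverse DNS lookup followed by a forward lookup, which is the method Google recommends for confirming that a request really came from Googlebot. Below is a minimal Python sketch of that check. The function name `is_googlebot` and the swappable `reverse`/`forward` parameters are my own invention, there so the logic can be tried without touching the network. Note that this only tells you the request came from Google, not which of Google’s crawlers it was.

```python
import socket


def is_googlebot(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname):
    """Reverse-resolve the IP, check the domain, then forward-confirm it."""
    try:
        host = reverse(ip)[0]  # gethostbyaddr returns (hostname, aliases, ips)
    except OSError:
        return False  # no PTR record or lookup failure
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # The hostname must resolve back to the same IP; otherwise the
        # PTR record could have been faked by whoever controls the IP.
        return forward(host) == ip
    except OSError:
        return False
```

On a live server you would call `is_googlebot("66.249.65.69")` and let the default resolvers do the lookups.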
Update: After cleaning out my inbox, I found similar emails going back to May 20.
Hello Halfdeck,
I read your posts (regarding supplemental pages) in Google Groups and like them.
One question: can I ask you to review my site?
I understand that you have huge knowledge in this area, which is why I’m asking you.
If you’d rather not, no problem ;-)
Thank you :-)
alexo said this on October 19th, 2006 at 10:47 pm
Hi,
Sure. I may publish the review on this blog though, if you don’t mind. I’ll shoot you an email later today.
Halfdeck said this on October 23rd, 2006 at 4:57 pm