Uh-oh, we've been botspotted!

Some people have noticed in their access logs that there's a "googlebot" requesting files such as "index.rdf" and "atom.xml" -- common names for XML feeds for blogs. In some cases, these files don't exist at the requested URLs, and never have, leading some to speculate that this Google bot is not just crawling links, but looking specifically for feeds.
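
If you want to check your own logs for this, here's a rough sketch of the idea. It assumes your server writes the common Apache-style "combined" log format, and the function name, the regular expression, and the list of feed filenames are all my own illustration, not anything Google or anyone else has published. A 404 on a feed filename is the interesting case, since nothing on your site could have linked to a file that never existed.

```python
import re

# Feed filenames to look for. These two come from the reports above;
# add others if you want to test the "only two formats" theory.
FEED_NAMES = ("index.rdf", "atom.xml")

# Assumption: Apache-style "combined" log format, i.e.
# ip ident user [time] "METHOD path PROTO" status bytes "referer" "agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def feed_probes(lines, agent_substring="googlebot"):
    """Return (path, status) pairs for feed requests from a matching user agent.

    Hypothetical helper: matches on the user-agent string only, which
    anyone can fake, so treat the results as a hint rather than proof.
    """
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        path = m.group("path").split("?", 1)[0]  # drop any query string
        name = path.rsplit("/", 1)[-1]           # last path segment
        if name in FEED_NAMES and agent_substring in m.group("agent").lower():
            hits.append((path, int(m.group("status"))))
    return hits
```

Feed it the lines of your log, for example `feed_probes(open("access.log"))` with whatever your log file is actually called, and look for entries with a 404 status.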


Others have pointed out that these filenames are common for two "RSS-like" feed formats, but not for other common types.


Could it be that Google is building some sort of feed search, but purposely dissing one (or more) of the most common formats, in order to get people to switch to one of two other formats?


That's pretty brilliant. I wish I'd thought of it. Of course, since there is no clear business reason for Google to get people to switch between various flavors of open(ish) XML formats, and a clear disadvantage, in terms of comprehensiveness, to limiting themselves to just a couple of them, that would seem like a strange thing to do.

Is it more likely that this is not a calculated move, but that they are experimenting with crawling feeds in general and that, if they're going to index them, they probably want as many as possible? And that maybe (hmmm...) they started with Blogger blogs, since they were handy, and they tended to find feeds at index.rdf and atom.xml, and they haven't yet optimized their crawler because they've been working on other stuff?

Tough call. And, naturally, I can't say confidently. No one tells me anything. And, plus, secrets and all that. Worth thinking about, though.