Late April Fool’s Day Joke? Google to Crawl HTML Form Pages
Posted by jonathan at 2:30pm EST on 04/11/2008
When I first read the blog post published a few hours ago on the Official Google Webmaster Central Blog, my first thought was, “hmmm, is this a late April Fool’s Day joke?”
Why? Because what they are saying seems completely illogical and frankly, stupid.
“In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn’t find and index for users who search on Google. Specifically, when we encounter a <FORM> element on a high-quality site, we might choose to do a small number of queries using the form. For text boxes, our computers automatically choose words from the site that has the form; for select menus, check boxes, and radio buttons on the form, we choose from among the values of the HTML. Having chosen the values for each input, we generate and then try to crawl URLs that correspond to a possible query a user may have made. If we ascertain that the web page resulting from our query is valid, interesting, and includes content not in our index, we may include it in our index much as we would include any other web page.”
Note: emphasis was added by the author, not Google.
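To make the quoted description concrete, here is a rough sketch of what "choosing values for each input and generating URLs" could look like. This is only an illustration of the passage above, not Google's actual crawler. It assumes the form submits its fields as a query string, and the function name, parameters, and example.com URL are all made up for the example.

```python
from itertools import islice, product
from urllib.parse import urlencode, urljoin


def candidate_urls(page_url, action, fields, site_words, limit=5):
    """Build a small number of query URLs for a form (hypothetical helper).

    fields maps each input name to its list of allowed values
    (select menus, check boxes, radio buttons), or to None for a
    free-text box.
    """
    names = []
    choices = []
    for name, options in fields.items():
        names.append(name)
        # Text boxes get words taken from the site itself; every other
        # input is limited to the values already present in the HTML.
        choices.append(list(site_words) if options is None else list(options))

    base = urljoin(page_url, action)
    urls = []
    # Only try the first few combinations ("a small number of queries").
    for combo in islice(product(*choices), limit):
        urls.append(base + "?" + urlencode(dict(zip(names, combo))))
    return urls


urls = candidate_urls(
    "http://example.com/search/",
    "results",
    {"q": None, "type": ["new", "used"]},
    ["syrup", "corn"],
    limit=3,
)
for url in urls:
    print(url)
```

Each generated URL is what a crawler would then fetch and judge, which is exactly where the guessing of text box values starts to look questionable.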
So what they are saying is, basically, that they will start filling out HTML <form> elements on sites that they deem “high-quality”. That alone sounds fishy.
Sure, anyone in the search engine optimization industry knows that some sites are genuinely better than others. For example, CNN.com, Time.com, and Apple.com all carry a lot more “weight” in search rankings than a small site like mom-and-pops-corn-syrup-store.com.
The first issue I see is this: “For text boxes, our computers automatically choose words from the site that has the form”
What on earth? If a webmaster wants information to be available, they will make it available through the normal methods; they will not hide it behind a <form> that needs to be filled out.
Secondly, as SEO expert Michael Gray pointed out, it’s going to generate a lot of false-positive responses and cause contact forms to be submitted.
I certainly do not want Googlebot sending false positives full of irrelevant data to my contact form inbox.
Am I alone in thinking this is a late April Fool’s joke?
