{"id":3087,"date":"2005-01-19T19:09:25","date_gmt":"2005-01-20T03:09:25","guid":{"rendered":"http:\/\/michaelhans.com\/eclecticism\/2005\/01\/19\/prior-art-for-nofollow-blocking\/"},"modified":"2019-12-12T13:33:35","modified_gmt":"2019-12-12T21:33:35","slug":"prior-art-for-nofollow-blocking","status":"publish","type":"post","link":"https:\/\/michaelhans.com\/eclecticism\/2005\/01\/19\/prior-art-for-nofollow-blocking\/","title":{"rendered":"Prior art for &#8216;nofollow&#8217; blocking"},"content":{"rendered":"<div class='__iawmlf-post-loop-links' style='display:none;' data-iawmlf-post-links='[{&quot;id&quot;:9190,&quot;href&quot;:&quot;http:\\\/\\\/www.dashes.com\\\/anil\\\/2005\\\/01\\\/19\\\/the_social_impa&quot;,&quot;archived_href&quot;:&quot;https:\\\/\\\/web-wp.archive.org\\\/web\\\/20070408103616\\\/http:\\\/\\\/www.dashes.com:80\\\/anil\\\/2005\\\/01\\\/19\\\/the_social_impa&quot;,&quot;redirect_href&quot;:&quot;&quot;,&quot;checks&quot;:[{&quot;date&quot;:&quot;2026-03-10 17:20:16&quot;,&quot;http_code&quot;:206}],&quot;broken&quot;:false,&quot;last_checked&quot;:{&quot;date&quot;:&quot;2026-03-10 17:20:16&quot;,&quot;http_code&quot;:206},&quot;process&quot;:&quot;done&quot;},{&quot;id&quot;:9191,&quot;href&quot;:&quot;http:\\\/\\\/www.bradchoate.com\\\/weblog\\\/2002\\\/02\\\/18\\\/restricting-google&quot;,&quot;archived_href&quot;:&quot;https:\\\/\\\/web-wp.archive.org\\\/web\\\/20161229030911\\\/http:\\\/\\\/bradchoate.com\\\/weblog\\\/2002\\\/02\\\/18\\\/restricting-google&quot;,&quot;redirect_href&quot;:&quot;&quot;,&quot;checks&quot;:[{&quot;date&quot;:&quot;2026-03-10 17:20:17&quot;,&quot;http_code&quot;:404},{&quot;date&quot;:&quot;2026-04-07 20:18:00&quot;,&quot;http_code&quot;:404}],&quot;broken&quot;:false,&quot;last_checked&quot;:{&quot;date&quot;:&quot;2026-04-07 
20:18:00&quot;,&quot;http_code&quot;:404},&quot;process&quot;:&quot;done&quot;},{&quot;id&quot;:9192,&quot;href&quot;:&quot;http:\\\/\\\/www.xav.com\\\/scripts\\\/search&quot;,&quot;archived_href&quot;:&quot;https:\\\/\\\/web-wp.archive.org\\\/web\\\/20181119110701\\\/https:\\\/\\\/www.xav.com\\\/scripts\\\/search\\\/&quot;,&quot;redirect_href&quot;:&quot;&quot;,&quot;checks&quot;:[{&quot;date&quot;:&quot;2026-03-10 17:20:20&quot;,&quot;http_code&quot;:404},{&quot;date&quot;:&quot;2026-04-01 18:31:34&quot;,&quot;http_code&quot;:404},{&quot;date&quot;:&quot;2026-04-07 20:18:10&quot;,&quot;http_code&quot;:404}],&quot;broken&quot;:true,&quot;last_checked&quot;:{&quot;date&quot;:&quot;2026-04-07 20:18:10&quot;,&quot;http_code&quot;:404},&quot;process&quot;:&quot;done&quot;},{&quot;id&quot;:9193,&quot;href&quot;:&quot;http:\\\/\\\/www.xav.com\\\/scripts\\\/search\\\/help\\\/1048.html&quot;,&quot;archived_href&quot;:&quot;&quot;,&quot;redirect_href&quot;:&quot;&quot;,&quot;checks&quot;:[],&quot;broken&quot;:false,&quot;last_checked&quot;:null,&quot;process&quot;:&quot;done&quot;},{&quot;id&quot;:9194,&quot;href&quot;:&quot;http:\\\/\\\/www.xav.com\\\/scripts\\\/search\\\/changes_0049.html&quot;,&quot;archived_href&quot;:&quot;https:\\\/\\\/web-wp.archive.org\\\/web\\\/20160828030909\\\/http:\\\/\\\/www.xav.com\\\/scripts\\\/search\\\/changes_0049.html&quot;,&quot;redirect_href&quot;:&quot;&quot;,&quot;checks&quot;:[{&quot;date&quot;:&quot;2026-03-10 17:20:32&quot;,&quot;http_code&quot;:404},{&quot;date&quot;:&quot;2026-04-07 20:18:03&quot;,&quot;http_code&quot;:404}],&quot;broken&quot;:false,&quot;last_checked&quot;:{&quot;date&quot;:&quot;2026-04-07 20:18:03&quot;,&quot;http_code&quot;:404},&quot;process&quot;:&quot;done&quot;}]'><\/div>\n<p>With the addition of <code>rel=\"nofollow\"<\/code> to our arsenal of anti-spam tools, there&#8217;s a certain level of chatter about the ability to add a block element to a webpage to delineate certain areas of the page 
that should not be indexed by Google or other search engines.<\/p>\n<p>Most of the time I see this <a href=\"http:\/\/www.dashes.com\/anil\/2005\/01\/19\/the_social_impa\" title=\"The Social Impacts of Software Choices, last paragraph\">mentioned<\/a>, credit has gone to <a href=\"http:\/\/www.bradchoate.com\/weblog\/2002\/02\/18\/restricting-google\" title=\"Restricting Google\">Brad Choate&#8217;s post<\/a> from Feb.\u00a02002 for first advancing the idea. However, the idea itself dates as far back as Jan.\u00a02001 in Zoltan Milosevic&#8217;s <a href=\"http:\/\/www.xav.com\/scripts\/search\/\" title=\"Fluid Dynamics Search Engine\">Fluid Dynamics Search Engine<\/a>, a shareware site-specific search engine.<\/p>\n<p>I used the FDSE on my site for a while (starting <a href=\"https:\/\/michaelhans.com\/eclecticism\/2002\/02\/archive_tweaks.html\" title=\"Archive tweaks, search engine online\">Feb.\u00a06, 2002<\/a>), and found its support for blocking sections of pages from the search engine to be <a href=\"https:\/\/michaelhans.com\/eclecticism\/2003\/07\/help_search_eng.html#finetune\" title=\"Fine tune what sections of a page get indexed\">incredibly useful<\/a>.<\/p>\n<p>For instance, the sidebar on my site changes frequently: on the front page, the linklog updates often, sometimes multiple times a day; and on the individual pages, the &#8216;related entries&#8217; list can change over time as new entries are added and the pages are rebuilt. 
Because of this, it&#8217;s not uncommon for me to see people arrive through Google searches for terms that <em>were<\/em> in the sidebar of a particular page when Google&#8217;s spider crawled my site, but have since disappeared.<\/p>\n<p>In another situation, try using Google to search my site for an instance of when I&#8217;m actually talking <em>about<\/em> TrackBack: as the term &#8220;TrackBack&#8221; is on <em>every single individual entry page<\/em>, the noise to content ratio is weighted in entirely the wrong direction. If I had the ability to block off the sidebar and the TrackBack section header, these problems could be avoided.<\/p>\n<p>FDSE allowed me to do just that &#8212; and part of what I liked about it was that it used the same syntax as the standard robot commands used in <code>robots.txt<\/code> files or <code>meta<\/code> tags. From the <a href=\"http:\/\/www.xav.com\/scripts\/search\/help\/1048.html\" title=\"How to prevent sections of your pages from being indexed\">FDSE Help Pages<\/a>:<\/p>\n<blockquote><p>\n  FDSE supports the proprietary &#8220;robots&#8221; comment tag. This tag allows a web author to apply robots exclusion rules to arbitrary sections of a document. The tag has one attribute, content, with the following possible values:<\/p>\n<ul>\n<li><code>noindex<\/code> &#8211; the text enclosed in the tag is not saved in the index<\/li>\n<li><code>nofollow<\/code> &#8211; links are not extracted from the text enclosed<\/li>\n<li><code>none<\/code> &#8211; enclosed text is not indexed nor searched for links<\/li>\n<\/ul>\n<p>  Values &#8220;index&#8221;, &#8220;follow&#8221;, and &#8220;all&#8221; are also valid. 
In practice they are ignored since they are the unspoken defaults.<\/p>\n<p>  This feature is expected to fit the customer need of preventing certain parts of a document &#8211; such as a navigational sidebar &#8211; from being included in the search.\n<\/p><\/blockquote>\n<p>Example:<\/p>\n<pre><code>&lt;HTML&gt;\n&lt;BODY&gt;\n\n    This text will be indexed.\n    &lt;a href=\"foo.html\"&gt; this link will be followed &lt;\/A&gt;\n\n    &lt;!-- robots content=\"none\" --&gt;\n\n        This text will NOT be indexed.\n        &lt;a href=\"bar.html\"&gt; this link will NOT be followed &lt;\/A&gt;\n\n    &lt;!-- \/robots --&gt;\n\n    &lt;!-- robots content=\"noindex\" --&gt;\n\n        This text will NOT be indexed.\n        &lt;a href=\"bar1.html\"&gt; this link WILL be followed &lt;\/A&gt;\n\n    &lt;!-- \/robots --&gt;\n\n    &lt;!-- robots content=\"nofollow\" --&gt;\n\n        This text WILL be indexed.\n        &lt;a href=\"bar1.html\"&gt; this link will NOT be followed &lt;\/A&gt;\n\n    &lt;!-- \/robots --&gt;\n\n    la la la\n\n&lt;\/BODY&gt;\n&lt;\/HTML&gt;\n<\/code><\/pre>\n<p>For the example of a navigational sidebar, the &#8220;noindex&#8221; value would be the best choice.<\/p>\n<p>This syntax was designed to match the robots META tag.<\/p>\n<p>For documents which have both the &#8220;robots&#8221; META tag and the &#8220;robots&#8221; comment tag, the most restrictive interpretation will be made, always erring on the side of not indexing or not following.<\/p>\n<p>According to the above cited help documentation, Milosevic introduced this functionality in v2.0.0.0031 of the FDSE, and a quick check of FDSE&#8217;s <a href=\"http:\/\/www.xav.com\/scripts\/search\/changes_0049.html\" title=\"FDSE Version History\">version history<\/a> dates that release to Jan.\u00a026th, 2001 &#8212; four years before even a hint of its functionality was added to the major search engines, and just over a year before Brad&#8217;s post went up (no disrespect at all is 
meant to Brad here &#8212; different people have the same ideas fairly often, after all, and it&#8217;s an equally good idea no matter who came up with it &#8212; I&#8217;m just trying to give credit where credit is due, since this is a technique I&#8217;m actually familiar with).<\/p>\n<p>Obviously, I&#8217;m fairly happy about seeing <code>rel=\"nofollow\"<\/code> gain support with Google and the other search engines. Equally obviously by this point, I&#8217;m sure, I&#8217;d <em>love<\/em> to see a block-level implementation made available, and I think Milosevic had a good approach. It&#8217;s easy to implement, follows already established conventions (<code>robots.txt<\/code> and <code>meta<\/code> tags), validates (as it&#8217;s simply an HTML comment), and allows for a little more control than a simple on\/off ignore switch would.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Most of the time I see block-level &#8216;nofollow&#8217; mentioned, credit has gone to Brad Choate&#8217;s post from Feb. 2002. However, the idea itself dates as far back as Jan. 
2001 in Zoltan Milosevic&#8217;s Fluid Dynamics Search Engine.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2040],"tags":[599],"class_list":["post-3087","post","type-post","status-publish","format-standard","hentry","category-blog","tag-weblogs"],"_links":{"self":[{"href":"https:\/\/michaelhans.com\/eclecticism\/wp-json\/wp\/v2\/posts\/3087","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/michaelhans.com\/eclecticism\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/michaelhans.com\/eclecticism\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/michaelhans.com\/eclecticism\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/michaelhans.com\/eclecticism\/wp-json\/wp\/v2\/comments?post=3087"}],"version-history":[{"count":0,"href":"https:\/\/michaelhans.com\/eclecticism\/wp-json\/wp\/v2\/posts\/3087\/revisions"}],"wp:attachment":[{"href":"https:\/\/michaelhans.com\/eclecticism\/wp-json\/wp\/v2\/media?parent=3087"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/michaelhans.com\/eclecticism\/wp-json\/wp\/v2\/categories?post=3087"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/michaelhans.com\/eclecticism\/wp-json\/wp\/v2\/tags?post=3087"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}