Search improvements

This entry was published at least two years ago (originally posted on February 28, 2003). Since that time the information may have become outdated or my beliefs may have changed (in general, assume a more open and liberal current viewpoint). A fuller disclaimer is available.

I spent some time last night working with the search software I have installed on djwudi.com, tweaking and improving it so that it gives much more usable results.

While MovableType does include its own search function, I’ve chosen not to use it for djwudi.com because I have a number of pages that live outside of my weblog, which MT would not be able to search. However, I’d run into a bit of a problem with the search engine I am using, and I think I’ve finally got it solved.

The issue that came up was simply that because the search software had indexed the text of every page on the site, there were certain words that were essentially useless to try to search for, because they’re repeated on so many pages. For instance, I was trying to find a page where I’d written up a short description of the MT TrackBack functionality — unfortunately, a search for ‘TrackBack’ returned hits for every single page on my weblog, because they all had the word ‘TrackBack’ on the page.

Digging through the documentation for the search software yesterday (yes, I know, actually reading the instructions is so uncool, but it really does help sometimes…), I discovered that there is a very simple way to tell the search software to ignore certain areas of a webpage. So I made some tweaks to my templates to ensure that the software only pays attention to the actual content of each page and ignores all the navigational or presentational mumbo-jumbo, and now I’ve got a far more usable search feature than I did previously. Woohoo!
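
For the curious, here is a rough sketch of the idea in Python. It is not the actual search software or its real syntax: the `<!-- noindex -->` marker comments, the function names, and the sample pages are all made up for illustration. The point is only that once the template parts of a page are fenced off and skipped, a word like ‘TrackBack’ matches just the pages that actually talk about it.

```python
import re

# Hypothetical marker comments wrapped around template areas to skip;
# the real search software's markers may look quite different.
SKIP_BLOCK = re.compile(
    r"<!--\s*noindex\s*-->.*?<!--\s*/noindex\s*-->",
    re.DOTALL | re.IGNORECASE,
)
TAG = re.compile(r"<[^>]+>")


def indexable_text(html: str) -> str:
    """Return the page text with the marked-off regions and HTML tags removed."""
    without_skipped = SKIP_BLOCK.sub(" ", html)
    text = TAG.sub(" ", without_skipped)
    return " ".join(text.split())


def search(pages: dict[str, str], word: str) -> list[str]:
    """Return the names of pages whose indexable text contains the word."""
    needle = word.lower()
    return sorted(
        name for name, html in pages.items()
        if needle in indexable_text(html).lower()
    )


# Made-up sample pages: both carry a TrackBack link in the template area,
# but only one discusses TrackBack in its actual content.
pages = {
    "trackback-explained.html": (
        "<!-- noindex --><a href='#'>TrackBack (0)</a><!-- /noindex -->"
        "<p>How MT TrackBack works.</p>"
    ),
    "other-post.html": (
        "<!-- noindex --><a href='#'>TrackBack (2)</a><!-- /noindex -->"
        "<p>A post about music.</p>"
    ),
}

print(search(pages, "TrackBack"))
```

Without the skip step, both sample pages would match, which is exactly the problem I was running into.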