Help wanted: Apache/PHP

This entry was published at least two years ago (originally posted on July 31, 2003). Since that time the information may have become outdated or my beliefs may have changed (in general, assume a more open and liberal current viewpoint).

I’m planning on sticking with TypePad as my weblog host once everything opens up officially (tomorrow, from the looks of it). However, this poses a bit of a problem. While I’m slowly moving all of my old posts from my old weblog to this new site, there are still lots of links scattered throughout the ‘net that point to the old addresses.

I think I know of a solution, but I’m not well enough versed in the intricacies of Apache and PHP to pull it off on my own. So, I’m asking for help!

Here’s what I’d like to do…

All of my old posts reside on my personal server at http://www.djwudi.com/longletter/. It’s a Mac OS X computer running Apache, with PHP enabled.

I know that Apache can handle redirects, based on rules set up in the httpd.conf file. I also know that pattern matching and text string munging can be carried out in PHP.

All of my old individual entry pages are stored in my webserver with the following directory structure:

http://www.djwudi.com/longletter/archives/year/month/day/dirified_post_title.php
http://www.djwudi.com/longletter/archives/2003/07/31/help_wanted_apache_php.php

All of the pages on this new site are stored using a similar, but slightly different directory structure:

http://djwudi.typepad.com/eclecticism/year/month/truncated_title.html
http://djwudi.typepad.com/eclecticism/2003/07/help_wanted_apa.html

What I’m envisioning for the final system is this:

  • Anytime my webserver receives a request for a page that resides within the ‘/longletter/archives/’ directory, Apache redirects to a customised PHP script on my server.
  • That script does three things:
    1. Presents a simple page to the user with wording to the effect of “This site has moved, one moment while we redirect you…”.
    2. Looks at the requested URI and converts it to what the new URI should be. As I’ve kept post titles consistent, and the directory structures are similar, this should be fairly easy with the right regular expressions.
      1. Parse the requested URI.
      2. Remove everything before the 4-digit year and replace it with the new base address.
      3. Remove the 2-digit day.
      4. Truncate the post title to fifteen characters.
      5. Remove the .php extension and replace it with .html.
    3. Redirects the user’s browser to the new, correct URI.
  • Hey presto, we’re done — no matter which page was linked to at my old site, the user has been redirected to the corresponding page at my new site.
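
The URI conversion in step 2 can be sketched in a few lines. This is only an illustration of the string handling, written in Python to keep it short; the same regular expression should carry over to PHP's preg_match. The names NEW_BASE, OLD_ENTRY and new_entry_url are made up for the example, and the fifteen-character truncation is taken from the single example URL above, so it is an assumption that every title follows it.

```python
import re

# Assumed new base address, copied from the example URLs in the post.
NEW_BASE = "http://djwudi.typepad.com/eclecticism"

# Old entry pages look like .../archives/YYYY/MM/DD/dirified_post_title.php
# Groups: 4-digit year, 2-digit month, title. The 2-digit day is matched but dropped.
OLD_ENTRY = re.compile(r"/archives/(\d{4})/(\d{2})/\d{2}/([^/]+)\.php$")


def new_entry_url(old_uri):
    """Return the new address for an old entry URI, or None if it is not an entry."""
    match = OLD_ENTRY.search(old_uri)
    if match is None:
        return None
    year, month, title = match.groups()
    # Truncate the title to fifteen characters and swap .php for .html.
    return f"{NEW_BASE}/{year}/{month}/{title[:15]}.html"


print(new_entry_url("/longletter/archives/2003/07/31/help_wanted_apache_php.php"))
```

Run against the example address from this post, the sketch prints http://djwudi.typepad.com/eclecticism/2003/07/help_wanted_apa.html. It does not handle query strings or anchors, which would need stripping first.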

More brainstorming:

  • The above method works well for links going to individual pages, but what about category archives or the main index page itself?
  • Could the PHP script be made smarter? For instance…
    1. If the requested URI contains the year/month/day/title.php string, then the above transformation and redirect are processed.
    2. If the requested URI contains any other string (in other words, it doesn’t point to a specific post), then a page is presented that says something along the lines of “This site has moved, one moment while we redirect you to the new site…”, and a redirect is passed to the user’s browser that points to the index page of the new weblog.
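
The two-branch idea above might look roughly like this. Again, this is a hedged sketch in Python rather than a finished PHP script: redirect_target and moved_page are hypothetical names, the index address is assumed to be the new base address with a trailing slash, and the meta refresh tag is just one way to show a message before sending the browser on (a plain HTTP redirect would skip the message entirely).

```python
import re
from html import escape

# Assumed new base address, copied from the example URLs in the post.
NEW_BASE = "http://djwudi.typepad.com/eclecticism"

# Old entry pages: .../archives/YYYY/MM/DD/dirified_post_title.php
OLD_ENTRY = re.compile(r"/archives/(\d{4})/(\d{2})/\d{2}/([^/]+)\.php$")


def redirect_target(old_uri):
    """Return (new_url, is_entry): the matching entry page, else the new index."""
    match = OLD_ENTRY.search(old_uri)
    if match:
        year, month, title = match.groups()
        return f"{NEW_BASE}/{year}/{month}/{title[:15]}.html", True
    # Anything else (category archives, the old index) goes to the new front page.
    return NEW_BASE + "/", False


def moved_page(old_uri, delay_seconds=3):
    """Build a small 'this site has moved' page that redirects after a short delay."""
    target, is_entry = redirect_target(old_uri)
    where = "" if is_entry else " to the new site"
    link = escape(target)
    return (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{delay_seconds};url={link}">'
        "</head><body>"
        f"<p>This site has moved, one moment while we redirect you{where}...</p>"
        f'<p><a href="{link}">{link}</a></p>'
        "</body></html>"
    )
```

Calling moved_page with an old entry address produces a page pointing at the corresponding new entry, while any other address under the old weblog produces a page pointing at the new index.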

Anyway, that’s what I’d like to do. It all seems straightforward enough in my brain, and I think that the technology I have available should be able to handle it all without a problem — I just don’t have the faintest idea how to code it.

Any and all advice, hints, tips, or straight-up solutions would be greatly appreciated. I’m not rich enough to offer untold wealth or cool prizes or anything, but I can offer much gratitude, public thanks and kudos, and probably pizza and beer (or a PayPal donation to a ‘pizza and beer’ fund, or some such thing).

And you won’t even have to fight me for the beer — I can’t stand the stuff. ;)