Final comment tweaks

A few final tweaks to the comments tonight, and I think I’m finally done tinkering. For now, at least. There are always more projects coming down the line somewhere. :)

As I mentioned before, comments will now automatically turn off after 30 days. Most conversations only really continue for a day or two after a post goes up anyway, and this limits the number of entries on my site that can be targeted by comment spammers. I’ve decided to go ahead and leave TrackBacks open, however, for two reasons. Firstly, there are posts that will continue to be relevant as time goes by, so I don’t mind getting pings long after a post originally went up; and secondly, turning off TrackBack pings also removes them from the page entirely, and I’d prefer to keep them visible.
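
As a rough sketch of that policy in Python (this is only an illustration, not how the plugin itself works; the function names and the constant are made up here):

```python
from datetime import datetime, timedelta

# Assumed window, taken from the "30 days" mentioned above.
COMMENT_WINDOW = timedelta(days=30)


def comments_open(posted_at: datetime, now: datetime) -> bool:
    """Return True while an entry is still inside its comment window."""
    return now - posted_at <= COMMENT_WINDOW


def trackbacks_open(posted_at: datetime, now: datetime) -> bool:
    """TrackBacks are left open no matter how old the entry is."""
    return True
```

Keeping the two checks separate mirrors the decision above: the comment form disappears on older entries, while pings are still accepted and displayed.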

I’ve also re-installed Adam Kalsey’s SimpleComments plugin, which integrates comments and trackbacks together. This way, rather than having all TrackBack pings listed together above the comments, there is one single chronological list that combines both.

Lastly, I’ve integrated Gravatar support, so those of you who have Gravatar icons will now see them displayed along with any comments you leave here.

iTunes: “Space Food” by Tai-Fun from the album Essential Chillout (1999, 6:57).

Let’s try this again, shall we?

Allrighty then. I’ve done some restructuring and work on the server, and it’s time to bite the bullet and see how things go: comments are turned on again. Or, at least, they’re turned on for this entry and any going forward.

I have implemented Conversation Killer, so comments and TrackBacks will automatically close on any entry after one month. While I still wish that I could just leave comments on indefinitely, hopefully this will be an acceptable middle ground (and, really, it’s rare that a comment thread continues after a month anyway, so I’m okay with this approach).

There’s a little more tweaking to do, but we’re off to a good start. I’ll keep an eye on my server to see how things behave, but with any luck, this will put me back in business.

iTunes: “Work It! Dance = Life (full mix)” by Various Artists from the album Work It! Dance = Life (full mix) (1996, 1:09:44).

That’s a lot of text

Fun fact for today regarding this weblog: a full export from MovableType of every post and every comment results in a 12.7 MB text file containing 3,117 posts and 8,178 comments and trackback pings.

Wow.

I now have a WordPress site up and running with all of my old entries (up to, but not including, this one), and I’m starting to poke around with it to see what I think. So far, it’s been an interesting experience. The installation was dead simple. Importing my MovableType archives took some work: the PHP script kept timing out on the 12.7 MB file, so I ended up having to break it into chunks and import six months at a time. That was likely more of a reflection on my server and the huge amount of data I was feeding it than WordPress itself, though.
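
For anyone facing the same timeout, here is a minimal Python sketch of the chunking step. It rests on assumptions about the export file: that entries are separated by a line of eight hyphens, and that the first `DATE:` line in each entry starts with an MM/DD/YYYY date. Check your own export before trusting it. The function names are invented for this example.

```python
from collections import defaultdict
from datetime import datetime

ENTRY_SEPARATOR = "--------"  # assumed entry delimiter in the export file


def half_year_key(entry: str) -> str:
    """Bucket an entry by the first DATE: line it contains, e.g. '2004-H1'."""
    for line in entry.splitlines():
        if line.startswith("DATE:"):
            stamp = line[len("DATE:"):].strip()
            # Only the MM/DD/YYYY part matters for bucketing.
            date = datetime.strptime(stamp.split()[0], "%m/%d/%Y")
            half = 1 if date.month <= 6 else 2
            return f"{date.year}-H{half}"
    return "undated"


def split_export(text: str) -> dict[str, str]:
    """Group the entries of one large export into six-month chunks."""
    buckets: dict[str, list[str]] = defaultdict(list)
    entry_lines: list[str] = []

    def flush() -> None:
        # Ignore runs that contain nothing but blank lines.
        if any(line.strip() for line in entry_lines):
            entry = "\n".join(entry_lines)
            buckets[half_year_key(entry)].append(entry)
        entry_lines.clear()

    for line in text.splitlines():
        if line.strip() == ENTRY_SEPARATOR:
            flush()
        else:
            entry_lines.append(line)
    flush()

    # Re-join each bucket with the separator so every chunk keeps the same layout.
    return {
        key: "".join(f"{entry}\n{ENTRY_SEPARATOR}\n" for entry in entries)
        for key, entries in buckets.items()
    }
```

Each value in the returned dictionary can be written to its own file and imported separately, which keeps any single import small enough to finish before the script times out.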

I have the Staticize plugin installed, and it does seem to be making a difference: the initial load of any particular page takes a few seconds, but then any subsequent loads are pretty zippy as it can pull from the cached file (as a test, I even have all comments and trackback pings displaying on my posts about losing my position at Microsoft, and the load times are still bearable).
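
The behaviour I’m seeing (slow first load, fast repeat loads) is what you’d expect from any render-once cache. The Python sketch below shows that general idea only; it is not Staticize’s actual implementation, and the class and method names are made up.

```python
from typing import Callable


class PageCache:
    """Render a page once, then serve the stored copy until it is invalidated."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        self._pages: dict[str, str] = {}
        self.renders = 0  # counts trips down the slow path, for illustration only

    def get(self, url: str) -> str:
        if url not in self._pages:
            # First request: do the expensive dynamic build and keep the result.
            self.renders += 1
            self._pages[url] = self._render(url)
        return self._pages[url]

    def invalidate(self, url: str) -> None:
        """Drop a cached page, e.g. after a new comment is posted."""
        self._pages.pop(url, None)
```

The trade-off is that something has to call `invalidate` whenever a page changes, otherwise visitors keep seeing the stale copy.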

The WordPress interface takes some getting used to after years of familiarity with the MovableType interface. It’s certainly usable, and I like a lot of the options that are available to me; I’m just not quite as fond of the overall user interface (however, part of that might just be my fondness for sans-serif fonts, as the WP UI uses all serif fonts, and to my eye it feels more cluttered). Still, it’s the functionality that’s the key point, and it doesn’t look like I’ll have any worries there.

I’m not putting the WordPress blog live just yet (though, to be honest, if you’re really curious, it shouldn’t take too much guessing to figure out the URL), as I haven’t done anything in the way of customization so far. At minimum, I want to make sure that my blogroll is set up and most (if not all) of the goodies in my sidebar are active. Diving into the design may end up taking a little longer: right now it’s using the default Kubrick theme, and while I’d prefer to move the current ‘distressed’ look and feel of this site over, I’m going to have a lot of relearning to do as years of familiarity with MovableType tags and design techniques battle with figuring out the WordPress tags and design techniques.

It’s a promising start, though I’ll freely admit that the real test won’t come until I bring the site live and find out just how my server reacts. I don’t know exactly when I’ll do that — part of me wants to just throw it in (sink or swim!), another part of me wants to make sure it’s perfect before I switch over, another part of me is still waffling over moving away from Movable Type, and yet another part of me thinks that I should concentrate on what I need to do to split the existing Movable Type database apart for Dad and Kirsten’s sites.

First steps are being taken, though. We’ll find out what path they lead me on.

iTunes: “Wise Up! Sucker” by Pop Will Eat Itself from the album This is the Day…This is the Hour…This is This! (1989, 3:15).

What about [some other blogging tool]?

After reading my rant about comment spammers, Joel asked me if I’d thought about switching over to another weblogging system. Here’s a (somewhat expanded) copy of what I sent back.

I’ve enjoyed reading your site (and its comments) ever since TypePad… and I bring this up as an honest suggestion. Why not try out WordPress? It’s simple and while it’s not immune to comment spam there are a wealth of plug-ins and options that filter or destroy them quite nicely.

Switching systems is definitely one of the things on the “possible solutions” list (WordPress and ExpressionEngine being the two top contenders). One of the things that’s been keeping me from exploring that is a distinct lack of redirect-fu when it comes to making sure I don’t break my old permalinks. I’ve received one offer of possible assistance with that, though, so it may be less of a hassle than it’s looked in the past. In the best of all possible worlds I’d be able to keep my current permalink scheme, but I’m not sure if that’s possible with the other systems, so if I have to, I’d settle for working redirects.

Part of what keeps me on Movable Type, though, is simple customer loyalty and experience. I’ve been on MT/TypePad for years now, and it’s what I’m most familiar with. Plus, they’ve been very good to me — they even just refunded me the $120 I’d accidentally paid for a year of TypePad that I wouldn’t be using, purely out of the goodness of their heart (I didn’t even ask — they saw my post grumbling about my own absentmindedness and made the offer).

I’m also unsure about how much moving to a PHP-based system (as both WP and EE are) would impact my server. MT’s Perl codebase has high overhead when it’s working on something, but then very low overhead when it’s simply serving static pages. Thanks to that, until the spam attacks started getting this bad, it played very nicely on my system. Since PHP has to process every page as it goes out, that’s more overall processing, and the question becomes whether PHP is resource-friendly enough on my box to be worth the switch. I’d used MT’s new PHP integration to dynamically generate pages for a while (before I decided that I wanted to integrate plugins that didn’t play nicely with the PHP code), and there was a noticeable lag when first requesting a page. More info on this aspect from any current WP or EE users (or even developers) would certainly be appreciated.

No matter what, though, I’m not going to be up and disappearing. I’m frustrated and annoyed by the whole situation (though not as much as I was yesterday), sure…but I’m not that easy to shut up, either. ;)

Oh, one other thing: if I do move to another system, I want to be able to use tags instead of categories. I know that there’s a plugin for this for Expression Engine (John’s using it), and it appears that there is a hack for WordPress also (though that’s from a few months ago). Something else for me to investigate while I’m deciding which direction to head.

Update: I’ve had one vote against going to a dynamic system such as WP or EE. Phil (who I host) has both a WP and an MT weblog set up on my server. To compare the two, click these links and compare how long they take to load: MT (serving static pages) and WP (serving dynamic pages). It’s a noticeable difference: the MT site pops right up, while you can watch the WP site build the page. Based on that example, at least, I’m thinking sticking with MT and static pages is a good idea.

Update: Whee — I’m still getting comments, they’re just “old-school” e-mail comments. :) This is good. Both indieb0i and Ryan (and Gregor) have let me know about the Staticize plugin for WordPress, which “is a highly advanced caching engine that dynamically and automatically caches pages on your site that need to be cached, when they need to be cached.” Essentially, only the parts of the page that really need to be dynamically generated are, and the rest of the page is static (at least, that’s how I’m reading it). Nice, and puts WP back in the possibilities list. Thanks!

The Spammers Have Won (for now)

Until I have time to get in and do some rather major work on my webserver, I’m afraid that comments and TrackBacks are turned off. I really don’t like doing this — I like the interaction aspect, both getting into discussions and just knowing that people stop by here from time to time — but the attacks on the server have been too severe and too regular, and I’m tired of battling them.

I’m pretty sure that there have been three major things causing my problems.

  1. My server is just too old and slow to handle the attacks.

    Rather than paying for hosting space somewhere, I run my own webserver out of my apartment. This has quite a few advantages, in that I don’t have to worry about how much disk space I use, there are no bandwidth caps, and it’s allowed me to host websites for friends and family on the same server. However, the downside is that the server itself isn’t terribly powerful by today’s standards: only a single-processor 350 MHz G3.

    Now, really, that’s not that bad of a machine, and for general purposes — that is, serving static pages, which is what I started with years ago — it works wonderfully well. However, when I’m in the midst of getting hit by a spam attack, it just can’t handle the load, and it slows to the point of a virtual crawl. It’s never actually gone down — right now it’s showing a reported uptime of 197 days, 17 hours, and one minute — but there’s so much for it to process that it might as well go down.

    The issue is that comment attacks these days take the form of an automated script, or ‘bot’, that repeatedly and rapidly submits comments to the comment script on a weblog, sometimes hundreds of submissions per minute. While I have anti-spam measures such as MT-Blacklist installed, they still need to look at each submitted comment in order to determine whether it’s spam (and reject it), an actual user-submitted comment (and accept it), or something indeterminate (at which point it’s put into a moderation queue for me to look at).

    When I’m getting flooded with hundreds of comment submissions at a time, though, my server just can’t process the information fast enough to keep up, and it essentially stops responding until it can work its way through everything.

  2. Renaming the comment script is pointless.

    One of the accepted methods of combatting the spam attacks is to rename the script that MT uses to accept and process comments, on the theory that the ‘bots’ that the spammers use then won’t be able to submit anything. This used to work, but now it’s painfully obvious that the spammers have upgraded their bots to parse through the HTML code of a page to find the name of the comment script. At this point, I can rename my comment script, and the attacks start again within a minute or two after I rebuild my site. So much for that idea.

  3. I made a mistake a while back that’s now biting me in the ass.

    The last time I set up my server, I made what in retrospect was obviously a mistake, though I didn’t think about it at the time. Each of the three primary accounts on my server (mine, my dad’s, and Kirsten’s) uses the same MySQL database for its MT data. Because of this, whenever a comment spam attack starts, it doesn’t matter which domain it’s aimed at. The bot generally attacks by submitting a few comments to one entry ID number, then increments that by one and sends a few more comments, so as it steps through entry IDs in the database it will end up hitting entries on every weblog in the database. A single comment attack on any single domain on my box can affect all three domains.

    Okay, yes, in retrospect, that was fairly amazingly dumb on my part. Of course, six months ago the comment spam attacks weren’t anywhere near the level that they are today, so it’s taken a while for this mistake to start showing the consequences. Things like this, however, are a big reason why I only provide hosting services for a few select friends and family, and I make sure they know that there may be occasional issues: as a sysadmin, I’m essentially learning as I go, which isn’t always the safest or most effective way to go about it. Kind of the webmaster’s version of driving by braille.
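
To make the third problem concrete, here is a toy Python model of a bot stepping through entry IDs. The weblog names, ID layouts, and function name are all invented for illustration; real entry IDs are nowhere near this tidy.

```python
def blogs_hit(entry_owner: dict[int, str], start_id: int, steps: int) -> set[str]:
    """Return the weblogs touched by a bot that walks consecutive entry IDs."""
    hit = set()
    for entry_id in range(start_id, start_id + steps):
        owner = entry_owner.get(entry_id)
        if owner is not None:
            hit.add(owner)
    return hit


# Shared database: one ID sequence interleaves entries from all three weblogs.
shared = {1: "blog_a", 2: "blog_b", 3: "blog_c", 4: "blog_a", 5: "blog_b", 6: "blog_c"}

# Separate databases: each weblog has its own ID sequence.
separate_a = {1: "blog_a", 2: "blog_a", 3: "blog_a"}

print(blogs_hit(shared, start_id=1, steps=6))      # every weblog gets hit
print(blogs_hit(separate_a, start_id=1, steps=6))  # only the targeted weblog
```

With one shared sequence, an attack aimed at a single domain spills onto its neighbours simply because their entries sit at adjacent IDs.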

What I need to do now, then, is break everything down and start over. Luckily, I shouldn’t have to do a full nuke and pave on my server — just the MT systems. I need to do a complete export of all entries and comments for each weblog on the system, nuke the MySQL database that MT is using, then create three separate databases, reinstall MT, and re-import the weblogs. Not a fun process, but I think I should be able to do it fairly transparently, without losing all the various design tweaks and customizations we’ve made to the weblogs. It may result in anywhere from a few hours to a few days of downtime for the sites I host, but I’ll do my best to keep that to a minimum once I start.

Once I’ve done that, I’ll experiment with turning comments back on. I’m not entirely sure how that will go, as the spammers will still be able to attack, but at least at that point they’ll be limited to attacking one domain at a time instead of attacking one and getting two more in the process. This may or may not be enough to keep comments open…we’ll find out when I get to that point.

This has been a rough couple of days, and yesterday I skirted dangerously close to just pulling the plug on my server entirely. I started hosting my own websites back in 1995 because it was fun to do, and the project has grown over the years, always because I enjoyed it, and it’s fun to find all these neat new things that can be done. Installing MovableType, opening up comments to the world, hosting sites for Kirsten, Phil, and my dad — I love the fact that I can do this.

But these spam attacks have been taking all the fun out of it. Each time I see the server get hit and stop responding it gets more and more frustrating. Yesterday I was ready to just completely throw in the towel — at one point, even checking to see if it would be possible to import all my old entries into my LiveJournal account (it isn’t). Thankfully, after a couple hours of Prairie and Phil putting up with my whining and tossing ideas at me over IM, I just figured that even though I don’t like to do it, at this point simply turning off comments until I have a chance to rebuild the database and the MT installation was the best way to go.

So that’s where things stand at the moment. Feedback is still a good thing, so feel free to drop an e-mail my way if there’s something you’d like to toss my direction. Until I get the chance to spend a few hours/days doing maintenance on the box, though, this is how things stand.

iTunes: “Sweet Home Chicago” by Blues Brothers, The from the album Blues Brothers, The (1980, 7:51).

Network Outage

One of the reasons I like Speakeasy — my ‘net connection just went down (and is still down as I type this, so nobody’s going to see this post until the issue is fixed). I called Speakeasy’s tech support, and got this automated message:

Thanks for calling Speakeasy. Some of our broadband customers in the greater Seattle area are currently reporting a network outage due to a vendor failure. We hope to have this resolved within 30 minutes.

(pause)

(big sigh)

If we’re lucky.

I can respect honesty like that.

Things seem to be up now, though (at least, DNS services are back, so websites are accessible again, though iChat can’t connect to the AIM network), so it was only about a ten minute outage. All in all, just a minor annoyance. These things happen.

iTunes: “Bongo Tune” by Quarter from the album Essential Chillout (2000, 5:52).

Comments/TrackBack down until further notice

Dammit.

Comments and TrackBack pings are currently disabled at the server level (update: back online) for all sites I host (www.michaelhanscom.com, www.hanscomfamily.com, www.geekmuffin.com*). As I’ve done this at the server level, this is not reflected in the sites themselves: they all still look like they accept comments, but they won’t work.

I hope to be able to get them turned back on soon.

This may or may not be realistic. Much as I’d hate to have to turn them off permanently, unless I can find an effective block against the attacks that continue to cripple my server, it’s starting to look like a definite possibility.

This sucks.

Update: Okay, it’s all back up and running. One new software tweak, and another rename to the scripts.

I think I need to figure out a shell script that will rename the comment and trackback scripts, update the mt.cfg file with the new info, and then rebuild the sites on a weekly basis. Which wouldn’t be fun, but I really am running out of ideas short of entirely disabling comments and trackbacks or moving to another weblogging system, neither of which is very high on my list of things to do.
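
A first pass at the config half of that idea might look like the sketch below, written in Python rather than shell. It assumes mt.cfg takes one `Directive value` pair per line and that the script names are set with `CommentScript` and `TrackbackScript`; check both against your own install. The helper names are invented, and renaming the files on disk and triggering the rebuild are left out.

```python
import re
import secrets


def random_script_name(prefix: str) -> str:
    """Build a hard-to-guess script name: prefix, hyphen, 8 hex digits, '.cgi'."""
    return f"{prefix}-{secrets.token_hex(4)}.cgi"


def set_directive(config_text: str, directive: str, value: str) -> str:
    """Replace the first 'Directive value' line, or append one if it is missing."""
    pattern = re.compile(
        rf"^[ \t]*{re.escape(directive)}[ \t]+.*$", re.MULTILINE
    )
    new_line = f"{directive} {value}"
    if pattern.search(config_text):
        # A function replacement avoids backslash surprises in the new value.
        return pattern.sub(lambda match: new_line, config_text, count=1)
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + new_line + "\n"


def rotate_script_names(config_text: str) -> str:
    """Point both directives (assumed names) at freshly generated script names."""
    config_text = set_directive(config_text, "CommentScript", random_script_name("c"))
    return set_directive(config_text, "TrackbackScript", random_script_name("t"))
```

The script files would have to be renamed to match before the sites are rebuilt, or the comment forms will point at names that don’t exist.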

* Actually, www.geekmuffin.com will be ‘broken’ until a full rebuild is done. Unfortunately, as I don’t have rebuild rights for Kirsten’s site, she’ll need to do that on her own when she gets a moment. :)

iTunes: “Breathe” by Depeche Mode from the album Exciter (2001, 5:17).

No more combined feeds

While I’d been considering this for a little while, Dave’s ‘Information Aversion’ post prompted me to un-splice my Flickr photos from my RSS feeds. Having done that, I’ve updated my feeds page to list my current available syndication feeds, all broken out to allow readers to subscribe to as much or as little of my drivel as they please.

I now offer six different syndication feeds. The first three are various ways of getting actual weblog posts:

  • Excerpts Only: The lightest feed available, this will only deliver a short excerpt for each post. You’ll have to decide if you want to click through to my page to read the full post or not.

  • Full Posts: This is the default RSS feed for this site. The full front-page text of each post (extended entries are not included).

  • Full Posts with comments: This is the most information-rich feed. The full front-page text of each post is included (extended entries are not included), along with any comments made to that post. Entries will update in your RSS reader as new comments are added, until the post scrolls off the front page of my site.

The second three contain various extra information: comments to current active conversations on the weblog, interesting links I run across, and my photography.

All feeds are run through the Feedburner service in order to ensure maximum compatibility and usability. Each feed will automatically optimize itself according to which aggregator requests it, and if anyone actually clicks on any of the feeds in a browser, rather than getting a page full of gobbledygook, they’ll get a nicely formatted page explaining what they’re seeing and providing them with a full complement of buttons to assist in subscribing with whichever news aggregator they favor (try it out, it’s rather nifty, unless you use Safari, where this doesn’t seem to work…bummer).

(If you already subscribe to my del.icio.us or Flickr feeds directly through the respective services, there’s no real need to switch to using the Feedburner feed link — you’ll get the same information either way. Of course, if you do use the Feedburner feed link for those feeds, I’ll get more accurate statistics as to how many people are reading which RSS feeds, which makes me happy. Whatever works for you, though.)

iTunes: “Lunatics Have Taken Over the Asylum, The” by Collide from the album Vortex (2004, 5:34).

All Request Saturday

Here’s an interesting idea, stolen from Terrance, who stole it from Stay of Execution: an all-request day.

Something about me you’d like to know? Something you’d like me to ramble on about? Pick a topic, any topic, and drop it in the comments. Come Saturday, I’ll go through what (if anything) is there and start babbling.

Of course, if nothing appears, I still reserve the right to go on about whatever I damn well please, so don’t think that by not suggesting anything you’re any more likely to get me to shut up. :)

iTunes: “Situation (The English Breakfast)” by Yaz from the album Don’t Go/Situation (1999, 9:04).

Battling the spammers

Over the past few days, I’ve noticed off and on that my webserver has been extremely slow to respond — less obviously when just browsing pages, but attempting to connect to the Movable Type interface was increasingly difficult, often resulting in nothing but timeouts and connection failures.

I had a hunch that I knew what was going on, but I wasn’t entirely sure at first. I logged in to the server locally — something I haven’t had to do in a while — and realized just how badly the machine was bogged down when the OS X user interface was almost as unresponsive as Movable Type. Not a good sign. Once I made it in and got a terminal window up, I ran top -u 15 to see what was going on.

Not surprisingly, every entry that top displayed was a perl process, with mysqld occasionally clawing its way to the top for a moment or two. Now I was almost entirely sure that one or more of the sites I host was under a major automated comment spam attack, as even with MT-Blacklist installed and refusing the majority of the submitted comments, it would require a certain amount of processing for each request, and while I’m not sure just how many a minute were being submitted, it was obviously enough to bring my server to its knees.
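
The reason a filter alone can’t save the server is that every submission has to be inspected before it can be refused. The Python sketch below is a hypothetical stand-in for that per-comment decision, not MT-Blacklist’s real rules; the blacklist terms and the link threshold are made up.

```python
from enum import Enum


class Verdict(Enum):
    REJECT = "reject"      # matched the blacklist
    ACCEPT = "accept"      # looks like a real comment
    MODERATE = "moderate"  # unclear, hold for a human to review


def triage(comment: str, blacklist: set[str], max_links: int = 2) -> Verdict:
    """Sort one submitted comment into reject, accept, or moderate."""
    text = comment.lower()
    if any(term in text for term in blacklist):
        return Verdict.REJECT
    # Many links but no blacklisted term: suspicious, so hold it for review.
    if text.count("http://") + text.count("https://") > max_links:
        return Verdict.MODERATE
    return Verdict.ACCEPT
```

Even the cheapest path through `triage` costs a process launch and a scan of the text, and a flood of submissions multiplies that cost whether or not any of them get through.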

So, seeing if I could kill two birds with one stone, I renamed all the comment and trackback scripts on the webserver, figuring that this would kill any in-progress attack and in doing so, confirm that it was a spam attack. Sure enough, as the multitudes of perl processes slowly worked their way through to completion, top started running faster (it had been updating once every 6-10 seconds, rather than once a second) and other processes started to show up on the display. After about two minutes, there wasn’t a single perl process on top’s list, top was updating at its standard once-per-second frequency, and the computer’s UI was responding as it should.

The downside to this technique is that it breaks comment and trackback ability. Easy enough to fix, though, with a quick change to MT’s config file and a rebuild of the sites. So, the comment scripts have been renamed, and I’m in the process of rebuilding the sites to reflect the new script locations.

And you know what?

Even in mid-rebuild, I’m already starting to watch the number of perl processes climb. One or two I’d expect while rebuilding the site, but I’m currently seeing anywhere from two to ten at a time. I’ve got a really bad feeling that whatever spammer has me targeted has a script smart enough to scrape the pages to find the script locations, no matter what they are named.
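
It doesn’t take much for a script to do that, since the comment form has to tell every browser where to submit. A short Python sketch using only the standard library’s `html.parser` shows the idea; the class name, function name, and the sample path in the usage note are invented for illustration.

```python
from html.parser import HTMLParser


class FormActionFinder(HTMLParser):
    """Collect the action URL of every <form> element on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.actions: list[str] = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == "form":
            action = dict(attrs).get("action")
            if action:
                self.actions.append(action)


def find_form_actions(html: str) -> list[str]:
    """Return the submit targets advertised by the forms in an HTML page."""
    finder = FormActionFinder()
    finder.feed(html)
    return finder.actions
```

Feeding it a page containing `<form method="post" action="/cgi-bin/renamed-1234.cgi">` hands back that path directly, which is why renaming the script only helps until the next rebuild publishes the new name.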

This — in a word — sucks. Outside of turning comments off entirely for the targeted sites, which really doesn’t thrill me, I’m not sure where to go next.

Guess for now I’ll just have to keep an eye on things and see how they go.