23.13.20 - Mark
Today's random web poll gadget meme thingamabob, which really looks like an attempt to googlebomb some payday loan site (whose link I removed before posting this). Remember, it's not technically spam if the blog's author posts it...
Keywords, Links, and the Kitchen Sink
16.38.32 - Mark
The comment spam problem on this blog has finally gotten to me (the database powering this thing has 60 MB of plain-text spam comments!) and I'm now in the middle of testing a couple of new tools in my little war on spam. The main reason I've put it off this long is that I thought it would take an entire overhaul of the comment system to even attempt to cut back the crap comments, but thankfully I was wrong.
When I started Googling for spam filtering tools, I quickly found two existing services. The first is LinkSleeve, which basically looks at the links in the submitted data and compares them to its existing database. The second is Akismet, which seems to be the intimidating sentry in the field.
As it turns out, neither was that hard to install into my existing system. LinkSleeve was literally cut and paste, with no modifications needed at all, while Akismet was a little more hands-on: it involved registering with WordPress for a free account, then researching ways to connect my code with their services. While I was able to find the right material, it involved some programming on my part, adding a couple of calls and changing some variables around.
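For anyone wondering what "adding a couple of calls" to Akismet can look like, here is a minimal Python sketch. It is not the code running on this blog. The endpoint and field names follow Akismet's public comment-check API as best I understand it, so check the current docs before relying on them. The function names, the API key, and the sample values are all made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_comment_check(api_key, blog_url, user_ip, user_agent, content,
                        author="", author_email=""):
    """Build (but do not send) an Akismet comment-check request.

    Assumption: the key-in-hostname endpoint and these field names match
    Akismet's REST API; verify against the current documentation.
    """
    fields = {
        "blog": blog_url,            # front page of the site being protected
        "user_ip": user_ip,          # commenter's IP address
        "user_agent": user_agent,    # commenter's browser user agent
        "comment_type": "comment",
        "comment_author": author,
        "comment_author_email": author_email,
        "comment_content": content,
    }
    url = f"https://{api_key}.rest.akismet.com/1.1/comment-check"
    return Request(url, data=urlencode(fields).encode("utf-8"), method="POST")


def is_spam(request, timeout=10):
    """Send the request; the service answers with the text 'true' for spam."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip() == "true"
```

Building the request separately from sending it keeps the network call in one place, which makes the rest easy to test without a real key.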
Since early this morning, all of the comments on this site have been evaluated with my own filtering rules, along with LinkSleeve's URL screening and Akismet's blend of filters, and the results are a bit surprising. When I added LinkSleeve I thought it had the best solution: since comment spam is all about the links, I thought it would catch the junk comments my filters were missing (the "Hey! Cool site." comments that are hardest to screen). Not only does it miss most of them, it also fails to catch spam comments with a dozen obvious junk links. This may be due to a lack of users sending comments into the system, but right now it's far behind even my rudimentary keyword/IP-based filters.
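To give a rough idea of what a rudimentary keyword/IP filter amounts to, here is a small Python sketch. It is not my actual rule set: the keyword list, the blocked address, and the link limit are invented, and the link count is an extra check added just for the example.

```python
import re

# Hypothetical rules, for illustration only.
BLOCKED_KEYWORDS = {"payday loan", "casino"}
BLOCKED_IPS = {"203.0.113.7"}
MAX_LINKS = 3
LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def local_filter(comment_text, ip):
    """Return the reasons a comment looks like spam (empty list = passes)."""
    reasons = []
    lowered = comment_text.lower()
    if ip in BLOCKED_IPS:
        reasons.append("blocked ip")
    for keyword in sorted(BLOCKED_KEYWORDS):
        if keyword in lowered:
            reasons.append(f"keyword: {keyword}")
    if len(LINK_PATTERN.findall(comment_text)) > MAX_LINKS:
        reasons.append("too many links")
    return reasons
```

A short "Hey! Cool site." comment from a fresh address trips none of these rules, which is exactly why that kind of comment is so hard to screen with keywords alone.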
Of course, once I had Akismet set up, it blew away my existing tools, capturing the vast majority of the spam comments that have trickled in since its installation. That's not to say it's perfect, but I think it's safe to call it somewhere around 90% right now.
There will, however, be more spam comments here for the next couple of days. While I think Akismet will be my primary tool for stopping spam, I'm probably going to continue using all three systems to catch spammers in the act, and set up a master script to direct spam to various levels of purgatory based on which filters it trips. There are going to be a few other upgrades (in addition to a significantly cleaned-up database) to my little system, but I feel so much better having found a better way of dealing with the spam around here.
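The master script doesn't exist yet, but the idea of sorting comments into levels of purgatory by which filters they trip could be sketched like this in Python. The level names and the thresholds are placeholders picked for the example, not a settled design.

```python
KNOWN_FILTERS = {"local", "linksleeve", "akismet"}


def route_comment(tripped):
    """Pick a destination from the set of filter names a comment tripped.

    Hypothetical levels: 'publish', 'review', 'quarantine', 'delete'.
    """
    tripped = set(tripped)
    unknown = tripped - KNOWN_FILTERS
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    if not tripped:
        return "publish"        # nothing objected
    if "akismet" in tripped and len(tripped) >= 2:
        return "delete"         # strongest filter plus a second opinion
    if "akismet" in tripped:
        return "quarantine"     # strongest filter alone: hold it back
    return "review"             # only the weaker filters objected
```

Weighting Akismet more heavily reflects how much better it has done so far. Comments caught only by the weaker filters go to a review pile instead of being thrown out.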
02.44.00 - Mark
After letting my blog sit in its own juices for the last few months while I was at camp, it managed to get hit by comment spam pretty hard - one post had something like 327 spam comments that snuck past my (admittedly crude) filters.
The total amount of spam I've received this year was close to 10 MB of data! Just for comparison, all of my 1,500-some blog posts take up about 1.4 MB - 1/7th the space!
That's not to say it's all been deleted. Because I was in there doing some heavy-duty cleaning, I shifted some structures around to make the database more manageable, and I'll need to fix some of the related scripts to match - so don't worry too much if commenting is broken over the next day or two.
On the other hand there's no doubt in my mind that spam is a serious problem, even for small bloggers using homebrew software.
09.27.57 - Mark
Late last week I had about a dozen junk postings find their way into my blog's database. They weren't comment spam, although at first glance they looked like it, with junk email addresses and the poorly spelled messages characteristic of all spam. What was a bit atypical was the spammer's address, which was at my domain. It didn't hit me why this was until this morning, when I had another one of these messages pop up.
I was being used to help sleazeballs in Latin America spam some poor fool's email account.
Ooops. My Bad.
The quick patch was a series of rules you need to meet before a comment is posted, and when I get around to it I'll probably put together some IP filters and email verification code as a basic spam filtering system, and then move it over to another "installation" of my blog software before spammers discover it in 3 months.
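The actual rules aren't spelled out here, so treat the following Python as a guess at the general shape and not as the patch itself. It assumes the hole was the common one where form fields end up in email headers, so a line break in the name or email field lets someone smuggle in extra recipients. The function name, the length limit, and the email pattern are all invented for the example.

```python
import re

# Deliberately loose pattern: something@something.something, no whitespace.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
MAX_COMMENT_LENGTH = 5000  # hypothetical limit


def comment_problems(name, email, body):
    """Return a list of rule violations; an empty list means the comment may post."""
    problems = []
    # Line breaks in header-bound fields are the classic injection trick.
    for label, value in (("name", name), ("email", email)):
        if "\r" in value or "\n" in value:
            problems.append(f"{label} contains a line break")
    if not EMAIL_PATTERN.fullmatch(email):
        problems.append("email is not a plausible address")
    if not body.strip():
        problems.append("comment is empty")
    elif len(body) > MAX_COMMENT_LENGTH:
        problems.append("comment is too long")
    return problems
```

A submission whose email field reads "a@example.com", a line break, then "Bcc: victim@example.org" would be rejected on two counts before it gets anywhere near a mail function.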
Other than the measures I can take, I kind of feel bad for the dozen or so people who have been spammed because of an exploited error in my code...
14.53.53 - Mark
Shoot. I was partially hoping that my not using a standard blog engine would keep the spammers at bay. Maybe that's true of some scripts, but not all. Two months without spam isn't bad, I suppose, considering how high I've been placed in Google lately.
Looks like I'll need to work out some way to manage spam. Probably needed to do that anyways, considering I'm recycling my blog engine's code base on a couple other sites.