Blog, Thoughts and Portfolio of Joe Angus

What are the problems with comment spam? Why do we even get it?

Oct 18th | category: thoughts | tags: spam, comments, captcha, bots

Literally days after putting my blog/portfolio live, I got hit with a mass of spam, all on one particular entry. For those of you who don’t know what comment spam is, it’s basically the automated posting of random comments, or comments promoting commercial services, to blogs, wikis, guestbooks, or other publicly accessible online discussion boards. Any web application that accepts and displays hyperlinks submitted by visitors may be a target (yes, I did just steal that sentence from Wikipedia).

I’ve built this website on a very well-known CMS called Expression Engine, and although it does have built-in features to reduce and block bots, over time they seem to have been overcome and are well in need of updating. So why do people do this, you ask? Do they really think someone will click that famous link about increasing their manhood? Or maybe it’s purely for annoyance…

Luckily there are a few options available to blog authors, although not all of them are a perfect fit. They come in different strengths and all have different outcomes…

The Mild ‘meh’

  • The first and easiest option is installing or using captcha software. Expression Engine does come with one built in, but it really doesn’t catch even 80% of today’s spammers.
  • Secondly, you can just monitor your comments manually: let them go live unless you say otherwise, and decide for yourself whether they are spam or not. If anyone sees them before you do, it’s just tough luck for your readers until you get home and get a chance to moderate.
  • You can also disallow HTML, or just links, in your comments, which can dramatically reduce a bot’s interest in even bothering to post.
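
To make the last option concrete, here is a minimal Python sketch of disallowing links and HTML in a comment. It is not how Expression Engine does it; the function names and the two regular expressions are assumptions made up for illustration, and a real filter would need broader patterns.

```python
import html
import re

# Hypothetical patterns for illustration; real spam uses many more tricks.
TAG_RE = re.compile(r"<[^>]+>")
LINK_RE = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def contains_link(comment: str) -> bool:
    """Return True if the comment appears to contain a URL."""
    return LINK_RE.search(comment) is not None


def strip_html(comment: str) -> str:
    """Drop anything that looks like an HTML tag, then escape what is left."""
    without_tags = TAG_RE.sub("", comment)
    return html.escape(without_tags)
```

A comment form could reject anything where `contains_link` is true, or quietly pass every comment through `strip_html` before displaying it. Rejecting is stricter; stripping is kinder to a genuine reader who pasted a link.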

The Medium ‘hmm’

There are several different types of captcha available, and if you take the time to install them and set them up correctly they can make life a little easier and reduce the amount of spam coming through. Here are a few options:

  • reCaptcha: reCaptcha is slightly different from the usual captcha, as it has found a way to battle OCR by using images of words that a computer can’t recognize at all. The downside is that, to check that what you have typed is in fact what is displayed on the screen, it would need to know the word digitally, which is obviously impossible in this case. So reCaptcha offers two words: one taken from an old book that a computer can’t read, and one that it can. It guesses how accurate your answer for the unknown word is from how good your answer for the known word was, and from what other people have said the unknown word is. This does in fact work, although my major complaint is that it looks big and ugly! Maybe a poor excuse, but I prefer not to have a big block like that displayed on most of my pages.
  • Akismet: “When a new comment, trackback, or pingback comes to your blog it is submitted to the Akismet web service which runs hundreds of tests on the comment and returns a thumbs up or thumbs down”, and it does exactly that! Every time someone posts a comment, the text is sent to Akismet to see whether it thinks it’s spam or not. This is a great way to dramatically reduce the spam comments you receive and, from experience, extremely easy to install. Definitely an editor’s choice.
  • There are also many other captchas out there. Some of them require you to write a set of questions and their answers, relying on the fact that only a human would be able to answer a question. I do like this idea, but I don’t think it’s a very multilingual solution.
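
The question-and-answer idea from the last bullet can be sketched in a few lines of Python. This is only an illustration under assumptions: the question bank, the accepted answers and the function names are all hypothetical, not taken from any particular captcha plugin.

```python
import random

# Hypothetical question bank; a real site would write its own questions.
QUESTIONS = {
    "What colour is the sky on a clear day?": {"blue"},
    "How many legs does a cat have?": {"4", "four"},
}


def pick_question(rng: random.Random) -> str:
    """Choose one of the stored questions to show next to the comment form."""
    return rng.choice(sorted(QUESTIONS))


def is_correct(question: str, answer: str) -> bool:
    """Compare the visitor's answer, ignoring case and surrounding spaces."""
    accepted = QUESTIONS.get(question, set())
    return answer.strip().lower() in accepted
```

The multilingual problem shows up straight away: every accepted answer has to be listed by hand, so a reader who answers “quatre” or “vier” is turned away unless you thought to add it.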

The Strict ‘grr’

  • If you want to cut them out completely, then you need to apply some strict countermeasures, which in most cases will make it harder for your users to use your website and post comments, and the annoyance may turn them away from posting.
  • One of the quickest and most effective ways is making sure that comments can only be posted by members. That adds a much bigger layer of protection, as the bot will have to register before making the post. I’d strongly suggest that you create your own custom registration field that’s mandatory and a little abstract. That way it’s harder for anything but a human to register.
  • Next is a tiresome option, although it will cut out every comment you don’t want: making all comments run through a moderation phase. This means that every comment will need to be approved by you before it is displayed on the website. Although this does work, it does mean you will need to check every day, and probably a good few times a day; otherwise your users will tire of their comments taking 24+ hours to get published, which will more than likely deter them from posting at all!
  • EE and other services also offer lists of IP addresses that you can ban from your website completely. This can be great, but it’s also possible you may end up blocking an innocent visitor and lose a genuine visit to your site!
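
As a rough sketch of how an IP ban list works, the Python below checks a visitor’s address against a list of banned addresses and ranges using the standard `ipaddress` module. The list itself is an assumption for illustration (it uses reserved documentation ranges, not real spammers), and `is_banned` is a hypothetical helper, not part of EE.

```python
import ipaddress

# Hypothetical ban list built from reserved documentation ranges.
BANNED_NETWORKS = [
    ipaddress.ip_network("203.0.113.7/32"),   # a single address
    ipaddress.ip_network("198.51.100.0/24"),  # a whole block of 256 addresses
]


def is_banned(visitor_ip: str) -> bool:
    """Return True if the visitor's address falls inside any banned range."""
    address = ipaddress.ip_address(visitor_ip)
    return any(address in network for network in BANNED_NETWORKS)
```

The second entry shows where innocent visitors get caught: banning a whole range blocks everyone who shares it, spammer or not.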

Those are my quick few thoughts about spam and some answers. I hope they help someone decide what to do about their spam! There are many more ways of countering spam, and I know that I haven’t covered them all, but these are the quickest methods I believe are easily available to most web authors.
