What are the problems with comment spam? Why do we even get it?
Oct 18th | category: thoughts | tags: spam comments captcha bots
Literally days after putting my blog/portfolio live, I got hit with a mass of spam, all on one particular entry. For those of you who don’t know what comment spam is, it’s basically the automatic posting of random comments, or comments promoting commercial services, to blogs, wikis, guestbooks, or other publicly accessible online discussion boards. Any web application that accepts and displays hyperlinks submitted by visitors may be a target (yes, I did just steal that sentence from Wikipedia).
I’ve built this website on a very well known CMS called ExpressionEngine, and although it does have built-in features to reduce and block bots, over time they seem to have been overcome and are well in need of updating. So why do people do this, you ask? Do they really think someone will click that famous link about increasing their manhood? Or maybe it’s purely for annoyance…
Luckily there are a few options available to blog authors, although not all of them are a perfect fit. They come in different strengths and all have different outcomes…
The Mild ‘meh’
- The first and easiest option is installing or using captcha software. ExpressionEngine does come with one built in, but it really doesn’t catch even 80% of current-day spammers.
- Secondly, you can let comments go live straight away, monitor them manually, and decide yourself whether they are spam or not. If anyone sees the spam first, then it’s just tough luck for your readers until you get home and get a chance to moderate.
- You can also disallow HTML, or just links, in comments, which can dramatically reduce a bot’s interest in even bothering to post.
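To make the last option a bit more concrete, here is a minimal Python sketch of stripping HTML from a comment and spotting links. This is not how ExpressionEngine does it; the function names and the regular expressions are my own assumptions, and the patterns are deliberately crude.

```python
import html
import re

# Crude patterns (assumptions for this sketch, not a complete URL/HTML parser).
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
TAG_PATTERN = re.compile(r"<[^>]+>")


def sanitize_comment(text: str) -> str:
    """Strip anything that looks like an HTML tag, then escape the rest."""
    without_tags = TAG_PATTERN.sub("", text)
    return html.escape(without_tags)


def contains_link(text: str) -> bool:
    """Return True if the comment appears to contain a URL."""
    return URL_PATTERN.search(text) is not None


if __name__ == "__main__":
    comment = '<a href="http://spam.example">cheap pills</a> click now'
    print(sanitize_comment(comment))
    print(contains_link(comment))
```

You could reject any comment where `contains_link` is true, or just run everything through `sanitize_comment` so that links never render as clickable.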
The Medium ‘hmm’
There are multiple types of captcha available, and if you take the time to install them and set them up correctly, they can make life a little easier and reduce the amount of spam coming through. Here are a few options:
- reCaptcha: reCaptcha is slightly different from the usual captcha, as it has found a way to battle OCR by using images of words that a computer couldn’t recognize. The downside is that to check what you typed against what is displayed, the service needs to already know the word digitally, which is obviously impossible for a word no computer could read. So reCaptcha shows 2 words: one taken from an old book that OCR couldn’t read, and one whose answer is already known. It judges your guess at the unknown word by how well you did on the known one, and by what other people have said that word is. This does in fact work, although my major complaint is that it looks big and ugly! Maybe a poor excuse, but I prefer not to have a big block like that displayed on most of my pages.
- Akismet: “When a new comment, trackback, or pingback comes to your blog it is submitted to the Akismet web service which runs hundreds of tests on the comment and returns a thumbs up or thumbs down” and it does exactly that! Every time someone posts a comment, the text is sent to Akismet to see if it thinks it’s spam or not. This is a great way to dramatically reduce the spam comments you receive, and from experience it is extremely easy to install - definitely an editor’s choice.
- There are also many other captchas out there. Some of them require you to write a set of questions and their answers, relying on the fact that only a human would be able to answer a question. I do like this idea, but I don’t think it’s a very multilingual solution.
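The two-word idea described for reCaptcha above can be sketched roughly like this in Python. It is only a toy model of the concept, not reCaptcha’s actual implementation: the class name, the `votes_needed` threshold, and the exact matching rules are all assumptions I’ve made for illustration.

```python
from collections import Counter


class TwoWordChallenge:
    """Toy model: one control word with a known answer, one unknown word."""

    def __init__(self, control_answer: str, votes_needed: int = 3):
        self.control_answer = control_answer.lower()
        self.votes_needed = votes_needed  # hypothetical agreement threshold
        self.votes = Counter()  # guesses collected for the unknown word

    def submit(self, control_guess: str, unknown_guess: str) -> bool:
        """Pass or fail the visitor based on the control word only."""
        if control_guess.strip().lower() != self.control_answer:
            return False  # failed the known word, so ignore the other guess
        self.votes[unknown_guess.strip().lower()] += 1
        return True

    def resolved_word(self):
        """Return the unknown word once enough visitors agree, else None."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= self.votes_needed else None
```

The point of the design is that the visitor is only ever graded on the word the system already knows, while their guess at the other word is quietly collected as a vote.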
The Strict ‘grr’
- If you want to cut spam out completely, then you need to apply some strict countermeasures, which in most cases will make it harder for your users to post comments, and the annoyance may turn them away from posting at all.
- One of the quickest and most effective ways is to make sure that only members can post comments. That adds a much bigger layer of protection, as the bot has to register before it can post. I’d strongly suggest creating your own custom registration field that’s mandatory and a little abstract. That way it’s harder for anything but a human to register.
- Next is a tiresome option, although it will cut out all the comments you don’t want: making all comments run through a moderation phase. This means every comment needs to be approved by you before it is displayed on the website. Although this does work, it means you will need to check every day, and probably a good few times a day, otherwise your users will get tired of their comments taking 24+ hours to be published, which will more than likely deter them from posting at all!
- EE and other services offer lists of IP addresses that you can ban from your website completely. This can be great, but it’s also possible you may end up blocking an innocent visitor and lose a genuine visit to your site!
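For the IP ban idea, a check against a ban list might look something like the Python sketch below. The list itself is made up (the ranges are reserved documentation addresses), and `is_banned` is a hypothetical helper, not anything EE provides.

```python
import ipaddress

# Hypothetical ban list; these are reserved documentation ranges, not real spammers.
BANNED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # a whole range
    ipaddress.ip_network("198.51.100.17/32"),  # a single address
]


def is_banned(address: str) -> bool:
    """Return True if the visitor's IP falls inside any banned network."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # not a valid IP; this sketch leaves that to other checks
    return any(ip in network for network in BANNED_NETWORKS)


if __name__ == "__main__":
    print(is_banned("203.0.113.45"))
    print(is_banned("192.0.2.10"))
```

Banning a whole range like the first entry is exactly where innocent visitors get caught, so single-address bans are the safer default.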
Those are my quick thoughts about spam and some answers to it. I hope they help someone decide what to do about their spam! There are many more ways of countering spam, and I know I haven’t covered them all, but these are the quickest methods that I believe are easily available to most web authors.