First of all, “what is a Trackback?” I hear you ask. Observant readers will have noticed the word lurking near the bottom of many of the blog entries they read as part of their daily blogging ritual. In truth, though, the popularity of the Trackback system is starting to falter. The reason is the large increase in “Trackback Spam” that people are complaining about across the blogosphere.
Trackback Spam is an annoying, yet fairly effective way for spammers to increase the number of backlinks to their useless spam websites. I have a Trackback system installed on this Junto.co.uk blog, and my methodology has been successfully fending off spammers, who have been trying to spam my website for months. The question is – how do I do it? The basic logic goes something like this:
If a trackback does not provide increased PageRank, then spamming through the Trackback system becomes less attractive. Back in January 2005, Google quietly made an announcement which should have killed off the trackback spammers for good. What they proposed was quite simple: bloggers should tag all trackback backlinks with the attribute rel="nofollow". Search engines that honour the attribute give no ranking credit for the link, so in principle it removes the incentive for trackback spam in one blow: spamming trackbacks no longer increases the spammer’s PageRank. MSN and Yahoo, as well as the core players in the blogging software market (such as Six Apart), jumped on the idea and signed up wholeheartedly. This has to be one of the first times I have seen such co-operation between such mighty foes. However, the story doesn’t end there. The spammed trackbacks are still listed on blogs and still provide links to the spam websites that readers could follow. The next step I implemented kills that too.
The idea behind trackback is that you write something about a blog entry on another site, link to it and then the software sends a trackback ping to say “hey, we commented on something you wrote, and this is what we said”. It puts a push technology in place to let other bloggers know what is being said about them and that is truly a useful thing.
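To make the ping itself less abstract, here is a rough sketch of what blogging software sends and receives, following the published TrackBack format: an HTTP POST of form-encoded fields (url, plus optional title, excerpt and blog_name), answered by a small XML document in which an error value of 0 means success. The function names build_ping_body and ping_succeeded are made up for illustration, and this is not the code running on this blog.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_ping_body(url, title="", excerpt="", blog_name=""):
    """Return the form-encoded body of a trackback ping (hypothetical helper)."""
    fields = {"url": url, "title": title, "excerpt": excerpt, "blog_name": blog_name}
    # Only "url" is required by the format; empty optional fields are dropped.
    return urlencode({name: value for name, value in fields.items() if value})


def ping_succeeded(response_xml):
    """True when the receiver's XML reply contains <error>0</error>."""
    error = ET.fromstring(response_xml).findtext("error")
    return error is not None and error.strip() == "0"
```

The body would be POSTed to the other blog's trackback URL with a Content-Type of application/x-www-form-urlencoded. Note that nothing in the ping proves the sender really linked to you, which is exactly the gap spammers exploit.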
The key to this process is therefore that a link to your entry should exist on the page which made the trackback ping. If it isn’t there, then the ping is false and can be denied. I have yet to see a spammer who has actually bothered to link to me. If one did, the trackback would be accepted, but only with the rel="nofollow" attribute described above.
So how do you write a Trackback system that checks whether a trackback ping is genuine? With a bit of scripting it isn’t that hard to achieve. In essence, you make an HttpRequest to fetch the page that claims to link to you, a technique otherwise known as page scraping. Then you check whether your link is in the page or not. If it is, you can add the ping as a valid trackback.
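The check described above might look something like the following in Python. Treat it as a minimal sketch under my own assumptions rather than the script this blog actually runs: the names LinkCollector, normalise, page_links_to and is_genuine_trackback are invented for the example, the URL comparison is deliberately crude (it only ignores a trailing slash), and the page is assumed to be UTF-8.

```python
from html.parser import HTMLParser
from http.client import HTTPException
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def normalise(url):
    # Crude, assumed normalisation: trim whitespace and any trailing slash.
    return url.strip().rstrip("/")


def page_links_to(html, my_url):
    """True if the HTML contains a real <a href> pointing at my_url."""
    collector = LinkCollector()
    collector.feed(html)
    return normalise(my_url) in {normalise(href) for href in collector.hrefs}


def is_genuine_trackback(source_url, my_url, timeout=10):
    """Fetch the page that sent the ping and confirm it links back to my_url."""
    try:
        with urlopen(source_url, timeout=timeout) as response:
            # Read at most ~500 KB so a huge page cannot tie up the server.
            html = response.read(500_000).decode("utf-8", errors="replace")
    except (OSError, ValueError, HTTPException):
        # Unreachable, malformed or broken source: treat the ping as false.
        return False
    return page_links_to(html, my_url)
```

Parsing the anchors, rather than searching the raw text for the URL, means a page that merely mentions your address without linking to it does not pass. A trackback handler would call is_genuine_trackback with the url field from the ping and the address of the entry being pinged, and reject the ping when it returns False.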
If you really want to prevent the spam ever showing up, the best way is to have an editorial process before the link goes live. I do this for comments: if I don’t like a comment, it doesn’t go live. Trackbacks can be dealt with in the same way.
Finally, you should still add rel="nofollow" to the trackback link you present under your blog entry. Trackbacks aren’t about PageRank; they are about community. The sooner spammers realise this, the better.
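As an illustration of that last step, a template helper along these lines would emit the link with the attribute in place. The function name render_trackback_link is hypothetical, and escaping the URL and title is my own addition, on the assumption that both come straight from an untrusted ping.

```python
from html import escape


def render_trackback_link(url, title):
    """Return an HTML anchor for a trackback, marked rel="nofollow"."""
    # Both values come from a remote ping, so escape them before output.
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(title)
    )
```

Keeping this in one helper means every trackback shown under an entry gets the attribute, with no template able to forget it.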
I’m starting to collect, in a database, the spam trackbacks that attack my website. I plan to publish them and pass them on to the major search engines. Hopefully the big players like Google will remove those spamming websites from the very search engines they are trying to exploit. Amen.