I posted about Trackback Spam previously but it was brought to my attention that I didn’t explain it properly, so I am now going to attempt to rectify that.
If you are using a blogging application, like Wordpress, there is a facility called Trackback, whereby, when you are making a post in your blog, and you refer to a post someone else made in their blog, you can add in the trackback uri of their post (normally displayed at the end of their post) to your blogging software, and it will send a notification (called a trackback) to them.
When their blogging software receives this notification (Trackback), it displays the relevant part of the post in the comments section of the site.
Spammers are recently starting to post faked trackbacks directly to people’s blogging software, pretending someone has posted about one of your posts, hoping your blogging software will automatically display their spam on your site (thinking it is a legitimate comment).
The reason they do this is to get links from external sites to their sites, thereby pushing up their all-important Google Page Rank.



Using .htaccess to minimise comment and referrer spam
I have been using my .htaccess file to stop comment and referrer spam on this site and it has been surprisingly successful (so far!). How do I create a .htaccess file capable of greatly reducing comment and referrer spam?
Firstly, I use Awstats to analyse visits to my site daily and I use Spam Karma to help control comment spam. Both applications give me information on spammers visiting my site.
Awstats gives me a list of the referer sites - this list contains those sites which are trying to spam my referrer logs. I monitor those sites and as new ones appear I add them to my .htaccess list in the form:
RewriteCond %{HTTP_REFERER} \.domain\.tld [NC]
where .domain is the domain trying to spam my site (psxtreme, freakycheats, terashells, and so on) and the .tld is the top level domain the site is registered to (.com, .net, .org, .info, etc.).
So, for instance, in the case of the spammer coming from the smsportali.net domain, I have added the following line to my .htaccess code:
RewriteCond %{HTTP_REFERER} \.smsportali\.net [NC]
This will stop accesses from all subdomains of smsportali.net (spamterm.smsportali.net) to the site and the NC ensures that this rule is case insensitive.
In the case of comment spam, I have configured Spam Karma to email me every time it deletes a spam comment - this is becoming rarer and rarer as the .htaccess file becomes more and more effective. I have configured Spam Karma to include the server variables and request headers of a comment that is not approved in the email - this is one of the configuration options of this plugin.
Scanning these emails, I can see the User Agents being employed by these spammers - armed with this information, I added the following lines to my .htaccess file:
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Crazy\ Browser [NC]
RewriteRule .* - [F]
and this has greatly reduced the amount of comment spam coming through.
Also, Cindy alerted me to the fact that adding:
RewriteCond %{HTTP:VIA} ^.+pinappleproxy [NC]
RewriteRule .* - [F]
Will also catch a lot of the spammers.
I have a copy of my .htaccess file available for review (it is in .txt format).
NOTE:
For each set of rules in your .htaccess file, you need to finish with a RewriteRule - RewriteRule .* - [F] will give a 403 (page forbidden) to the spammers. Your last set of rules should end with RewriteRule .* - [F,L] - the L telling the RewriteEngine that this is the last line and to stop processing the rules here.
IMPORTANT WARNING:
the .htaccess file is a very unforgiving file. It has the power to make your entire site unavailable to anyone. It is strongly advised to read up on Regular Expressions and Mod_Rewrite (the Apache module which processes these commands in a .htaccess file) before creating a .htaccess file or modifying an existing one.