Semalt have annoyed many webmasters by skewing analytics as their crawler turns up as a referrer. To add insult to injury, the Semalt crawler does not honour robots.txt nor does using the Semalt removal tool do anything more than invite further bot visits.
This PDF outlines 3 ways you can stop Semalt from polluting your analytics:
1. Block the Semalt crawlers via .htaccess
2. Block Semalt crawlers via PHP
3. Remove Semalt referrals via Google Analytics
How to Block Semalt Crawler and Clean Up Your Analytics
Semalt Referral Spam And Small Business Websites
Original post http://www.mylocalbusinessonline.co.uk/semalt-referral-spam-shady-tactics
My clients are small businesses, often targeting a specific local area. More often than not, they are
trying to get their head around this “online stuff” and want to make it work for them.
One of the many things I’ll recommend is regular visits to their analytics. Life is much easier when
you know where your traffic is coming from, what is working and what is not.
A local business owner wears many hats. Web marketing is a very small one, and the time and budget
for it are often limited. Checking your analytics can help prioritise which social sites to spend time on,
and show whether any ads are bringing traffic and whether the time and/or money spent is actually bringing a return.
Enter Semalt – showing up as a referrer in analytics. Not once in a blue moon either…
Who are Semalt?
Semalt claim to be a
…professional webmaster analytics tool that opens the door to new opportunities for
the market monitoring, yours and your competitors’ positions tracking and
comprehensible analytics business information.
(Their words, no I’m not linking to their page)
For a “professional webmaster analytics tool” one has to wonder why they think it is acceptable to
send their crawler as a referrer and not a standard bot visit.
They allegedly understand webmasters’ frustration at them totally screwing up your analytics with
their referrer spam and invite you to remove your website from the seed list.
DON’T REQUEST TO REMOVE YOUR WEBSITE VIA SEMALT
I’ll show you why you shouldn’t use Semalt’s removal request in a moment.
Of course, to find where their removal request is you have to do some rummaging around the web.
The first you’ll hear of it is via comments on the many blogs complaining about Semalt, or maybe
on Twitter. They have a person who just goes around tweeting and commenting, trying to ease
people’s concerns. Semalt’s homepage gives you nothing but a sign up form. No links to usual
pages like contact, services, privacy or anything else.
Being the sweet and innocent person I am (stop laughing), I removed
MyLocalBusinessOnline.co.uk from their seed list some months back in good faith. They did
honour it.
I no longer receive visits from the main Semalt crawler.
All was quiet for a week or two, and then I was bombarded with visits from various Semalt subdomains,
kambasoft and savetubevideo.
The client who I was talking to earlier today did her own check on Semalt and also put in a removal
request for her website a month or so ago. She logged into analytics today to this…
Google Analytics screenshot – these are just a few of the Semalt referrals to a small local website in 1
month!
73% of her referral traffic (38% of total traffic) is from Semalt and friends.
I have no idea what Semalt’s game is, but those numbers certainly are NOT honouring a removal
request. Maybe in Ukraine (where Semalt are based) “Remove” actually means, “Come, bring
your friends! We have cake!”
What is a web crawler?
A web crawler is an automated bot that systematically crawls the World Wide Web. Search engines
use them to index your web pages so they can efficiently serve their search results. There are other
uses, both good and bad for crawlers. Most of the time you wouldn’t be aware of their visits to your
site.
Semalt claim that their crawler is no different than Google, Bing or Yahoo crawling your website.
None of the major search crawlers come in as a referrer. Their bots (and others like them) pop
along in the background and you wouldn’t see them screwing up your analytics pretending to be real
visits.
The screenshot above is from a local website for a small business based in a village in southern England.
It’s rarely updated. Google, Bing, Yahoo and others do pop along regularly but not several times
per day. There’s simply no need to.
The long and short of the matter is Semalt’s crawlers do not act like legit web crawlers.
What are Semalt up to?
At first glance, this appears to be nothing more than a shady marketing technique to get curious
webmasters to visit their site. Go to any of the referral URLs for Semalt and you land on their main
web page that currently invites you to try their software free for 7 days.
Look a bit closer, visit the Kambasoft referrals and you are redirected to random websites – perhaps
it’s just a way of driving traffic? Useless, untargeted traffic perhaps, but I am sure someone
somewhere is fooled by big numbers.
Do more research and it gets a bit scary…
It would appear that Semalt are involved in more than shady referral tactics, going as far as actually
infecting people’s computers with trojans to build their web of spambots. Read more about that
side of things at nabble.
How to block Semalt and friends
3 Ways To Block Semalt And Clean Up Your Analytics
Semalt doesn’t appear to honour robots.txt. They also have so many IP addresses that blocking by IP is
impractical. If you really want to try that route, you can find a list of IP addresses associated with
Semalt here.
There are some alternative steps you can take.
1. Edit your .htaccess file (if you have one and have access)
Since Semalt ignores robots.txt, you can block its crawler using your .htaccess file. This file is very
powerful and can break your website – so if you’re unsure then leave it be. The Semalt crawlers
themselves don’t appear to be malicious at the moment, just a pain in the backside screwing stats
and using resources (this can become a problem though!)
Add the following code to your .htaccess file:
# Block fake traffic
RewriteEngine on
Options +FollowSymlinks
# Block all http and https referrals from "savetubevideo.com" and all subdomains of "savetubevideo.com"
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*savetubevideo\.com [NC,OR]
# Block all http and https referrals from "srecorder.com" and all subdomains of "srecorder.com"
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*srecorder\.com [NC,OR]
# Block all http and https referrals from "semalt.com" and all subdomains of "semalt.com"
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*semalt\.com [NC,OR]
# Block all http and https referrals from "kambasoft.com" and all subdomains of "kambasoft.com"
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*kambasoft\.com [NC]
RewriteRule ^(.*)$ http://semalt.com/ [L]
The code basically says if a crawler comes in from any of the Semalt sites then turn it around and
send it back to Semalt. They can have their spam back, thank you.
If you’re not as annoyed as me and don’t feel comfortable sending their bot back to them, you can
replace the last line with
RewriteRule .* - [F]
Many thanks to Michael Martinez for the code in their post over at Marketing Pilgrim, “Tips for
Blocking Semalt and Botnet Attacks”.
The original bit of code I was using is found at logorrhoea.net.
# block visitors referred from semalt.com
RewriteEngine on
RewriteCond %{HTTP_REFERER} semalt\.com [NC]
RewriteRule .* - [F]
This snippet only blocks referrers that contain semalt.com, not their friends at Kambasoft et al. Instead of
redirecting the bot back to Semalt, it simply denies access. It worked perfectly well until Semalt
started adding more and more sources. Since I reached hundreds of referrers between Semalt and
Kambasoft, it was getting rather silly. The first code above is a cleaner way of doing it.
You can add each referrer as you see one come in.
# block visitors referred from semalt.com
RewriteEngine on
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} semalt\.semalt\.com [NC]
RewriteRule .* - [F]
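As the list of referrers grows, the separate conditions can be folded into a single one. This is only a sketch, assuming mod_rewrite is enabled on your host and that the four domains named in this post are the ones you want to refuse. The grouping is my own illustration, not code from any of the posts credited here.

```apache
# Sketch only: refuse requests whose referrer is one of the listed
# domains or any of their subdomains. The domain list is taken from
# this post; adjust it to match what shows up in your own analytics.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*(semalt|kambasoft|savetubevideo|srecorder)\.com [NC]
# "-" means no rewrite target; [F] answers with 403 Forbidden
RewriteRule ^ - [F]
</IfModule>
```

To cover another .com referrer, add its name inside the parentheses, separated by a | character. As with the other snippets, back up .htaccess first.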
WordPress recommend adding the following code to .htaccess:
SetEnvIfNoCase Referer semalt.com spammer=yes
Order allow,deny
Allow from all
Deny from env=spammer
Again, this will only block the original Semalt bot and you’ll need to add each referrer as you see it.
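Order, Allow and Deny are the older Apache 2.2 access directives; on Apache 2.4 they only work through the mod_access_compat module. A rough equivalent in 2.4 syntax might look like the sketch below. It assumes your host runs Apache 2.4 with mod_setenvif and lets you use these directives in .htaccess. The variable name spammer is simply carried over from the snippet above.

```apache
# Sketch only, assuming Apache 2.4: flag requests whose Referer header
# contains semalt.com, then let everyone in except flagged requests.
SetEnvIfNoCase Referer "semalt\.com" spammer=yes
<RequireAll>
    Require all granted
    Require not env spammer
</RequireAll>
```

Further referrers can be flagged by adding one more SetEnvIfNoCase line per domain, each setting the same spammer variable.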
ALWAYS take a copy of your original .htaccess file so you can change back if anything does go
pear-shaped. I did not write any of this code (I’m not that clever!) I make no guarantees of
suitability, fitness for purpose or anything else.
2. Blocking via PHP
The nabble guys mentioned above who tracked down Semalt using malware added an update to
their post:
Update / August 8 — We’ve created a simple PHP package to block referrer spammers
such as Semalt from visiting your site: https://github.com/nabble/semalt-blocker
Far too technical for me – but it may be useful to people running PHP-based websites (or more likely,
their developers!)
3. Remove Semalt from showing in Google Analytics
• log into Google Analytics and select your website
• click on ADMIN in the top menu bar
• in the central PROPERTY column, click TRACKING INFO then REFERRAL
EXCLUSION LIST
• click the red ADD REFERRAL EXCLUSION button
• enter the referrer URL in the box and click the blue CREATE button
You will need to add each referrer individually. Using this method won’t block Semalt but will stop
them showing as a referrer in Google Analytics.
Conclusion
Semalt are obviously not a legit company. Regardless of their tactics to get people to sign up for
their service, a legit business would honour requests not to crawl.
The mess in analytics is not helpful, particularly for small business owners who are pushed for the time,
resources and knowledge to make sense of it.
Other than making a mess in analytics, the Semalt crawlers don’t appear to be malicious. Of
course, the software they push people to download may well be, so I advise staying well clear.
About Jan Kearney
I believe that every business, no matter how small or how local can use the power of
the web to gain more customers. I offer no bull coaching and mentoring so small
business owners can strategically put the web to work for their business. I’ve been
called a “compass” and a “navigator” and probably a few more names that aren’t
suitable for a profile!
Connect with me on Google+, Facebook, Pinterest or follow along at the My Local Business
Online blog