I get a healthy amount of referral traffic, but for a long time I have not known what this means. Social traffic clearly comes from social media pages, Organic Search traffic comes from search engines like Google, and Direct traffic is usually a combination of unreported organic search, direct typing of your URL, and bookmarks.
Then there is Referral traffic, which is defined as follows:
Google’s method of reporting visits that came to your site from sources outside of its search engine. When someone clicks on a hyperlink to go to a new page on a different website, Analytics tracks the click as a referral visit to the second site.
When I reviewed my Google Analytics Acquisition data (which provides valuable information for your blog/business), I found that a large share of my traffic came from referral channels. Once I dug deeper, I found that these sources included ‘floating share buttons’, ‘get free social traffic’, and other obscure sources.
So what does this mean exactly? Well, this is all referrer spam.
What is Referrer Spam?
Referrer spam is fake traffic from spam bots, which mimic a referral link. In this case a spammer makes multiple requests to your website, using a fake referrer URL to a site that they want to advertise. The aim of spammers is to promote their site, to index the contents of your website, and to build links to their site so that they can improve their search engine rankings.
Additionally, as spam bots crawl through your site, they eat up bandwidth and slow your site down. Furthermore, some spam bots also identify (and sometimes attack) vulnerabilities on your site.
Now, there are 2 main types of referrer spam: ghost referrer spam and non-ghost referrer spam (crawlers).
The Ugly Side of Referrer Spam
Someone somewhere directs spam to Google Analytics accounts, leaving laymen like me feeling like traffic royalty. This referrer spam distorts your Google Analytics data, making any research and analysis based on it unreliable. This can be especially crippling for small businesses that are not receiving a lot of traffic to begin with.
At the moment Google has left us in the dark, meaning that we have to stop this referral spam ourselves. There are many ways of doing this, such as editing your .htaccess file, but I don’t really understand this method, and I’m wary of making coding changes.
Thankfully, you can filter out referrer spam right from your Google Analytics (GA) account. Granted, the drop in traffic will be a blow to your ego, but it should motivate you to work harder on boosting your traffic. Below I will define referrer spam, and then show you how to filter this spam effectively.
STEP 1: FILTERING GHOST REFERRER SPAM
What is ghost referral spam?
Ghost spam comes from spam bots that never interact with your site. These spammers send data directly to the GA servers, so that they appear as referral traffic in your reports.
Therefore you are getting referral traffic like ‘floating share buttons’ from spammers who have never accessed your site.
We won’t get into the technicalities of how these spammers achieve this, but in summary they randomly generate GA tracking IDs (in the form UA-XXXXXX-1), and then send fake referral traffic to whichever accounts those IDs belong to.
Because this type of spam does not pass through your website, you cannot block it through coding and plug-ins. The only way to stop it from messing with your data is to use the filter that I will provide below.
How to identify ghost spam
Ghost spam tends to be sporadic, so when you check your GA data you will find that there will be spikes of ghost spam, followed by several days of no ghost spam. Spotting ghost spam is not difficult, as the data provided is clearly false.
To identify ghost spam take the following actions:
1. Go to Reporting> Acquisition> All Traffic> Channels> Referral
2. Once you are here click Secondary Dimension> Behavior> Hostname
3. Your screen will split into Source and Hostnames, and you can easily see which sources and hostnames look out of place (a list of some of the most common ghost spam sources is provided below).
Additionally, you will find that most of the ghost spam sources give you a ‘not set’ value when it comes to the corresponding hostnames.
P.S. Most ghost spam has a bounce rate of 0% or 100%.
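If you would rather not eyeball the report, the same check can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the rows, the column names (`source`, `hostname`, `sessions`) and the session counts are made up, as if you had copied them by hand from the Source / Hostname report.

```python
# Hypothetical rows, as if copied from the Source / Hostname report.
# Column names and numbers are invented for illustration only.
VALID_HOSTNAMES = {"yourdomain.com", "www.yourdomain.com", "blog.yourdomain.com"}


def looks_like_ghost_spam(row, valid_hostnames=VALID_HOSTNAMES):
    """Return True when a report row has a missing or unrecognised hostname."""
    hostname = row["hostname"].strip().lower()
    return hostname == "(not set)" or hostname not in valid_hostnames


rows = [
    {"source": "social-buttons.com", "hostname": "(not set)", "sessions": 41},
    {"source": "twitter.com", "hostname": "www.yourdomain.com", "sessions": 12},
    {"source": "o-o-6-o-o.com", "hostname": "amazon.com", "sessions": 30},
]

# Collect the sources whose hostname is not one of ours.
suspects = [row["source"] for row in rows if looks_like_ghost_spam(row)]
print(suspects)  # ['social-buttons.com', 'o-o-6-o-o.com']
```

The sketch only flags candidates; you still need to look at each source yourself before deciding that it is spam.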
How to identify valid hostnames
Before you create a filter for ghost spam, you need to identify your valid hostnames. You can do this easily by taking the following actions:
1. Go to Reporting> Audience> Technology> Network
2. Select Hostname as the Primary Dimension
3. Copy down all the valid hostnames that you see. The valid hostnames are where your GA tracking code appears, so all your valid hostnames should have a direct connection to your site. A hostname like amazon.com is not valid, as you have not put your tracking code on this site.
These are some valid hostnames that will appear:
www.yourdomain.com, yourdomain.com, blog.yourdomain.com, support.yourdomain.com, shoppingcart.com, translate.googleusercontent.com (this is used by visitors coming from other countries who need translation services).
Creating a custom filter for ghost spam
The following filter will take care of all your ghost spam, regardless of whether it appears in your referral, direct, or organic traffic channels. This filter is called the Valid Hostnames Filter, and it is the brainchild of Carlos Escalera at Ohow.
As I mentioned above, ghost spam uses invalid hostnames. Therefore, this filter works to include only valid hostnames in your GA report. To create the filter take the following actions:
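1. Create a regular expression that includes all the valid hostnames that you have gathered, separated by a pipe (|) and with each dot escaped by a backslash. Using the example hostnames above, the expression will look something like this: yourdomain\.com|shoppingcart\.com|translate\.googleusercontent\.com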
2. Go to Admin> Filters> New Filters> Create New Filter
3. Enter ‘Valid Hostnames’ as the Filter Name
4. Filter Type> Custom> Include
Filter Field> Hostname
5. Copy the valid hostnames expression that you created in point 1, and then paste it in the box marked Filter Pattern.
6. Click on ‘Verify this filter’ before you save it. You should see a table showing your data before and after the filter is applied.
7. If the table looks good, then you can click on save.
P.S. If you add your GA tracking I.D. to video or ecommerce services like PayPal and YouTube, make sure that you add the relevant hostname to the filter above. This way that traffic can be recognized.
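If you have more than a handful of hostnames, typing the expression by hand gets error-prone. Here is a minimal Python sketch of how the expression could be assembled, assuming the example hostnames from this post. The function name `build_hostname_pattern` is my own, and the check at the end uses Python’s `re` module, which may differ in small details from the matcher GA uses.

```python
import re


def build_hostname_pattern(hostnames):
    """Join hostnames into one regex, escaping dots so they match literally."""
    escaped = [host.strip().lower().replace(".", r"\.") for host in hostnames]
    return "|".join(escaped)


# Example hostnames taken from the list earlier in this post.
valid_hostnames = ["yourdomain.com", "shoppingcart.com", "translate.googleusercontent.com"]
pattern = build_hostname_pattern(valid_hostnames)
print(pattern)
# yourdomain\.com|shoppingcart\.com|translate\.googleusercontent\.com

# Quick local sanity check before pasting the pattern into the Filter Pattern box.
for host in ["blog.yourdomain.com", "(not set)", "amazon.com"]:
    print(host, bool(re.search(pattern, host)))
```

Because the pattern is not anchored, `yourdomain\.com` also covers subdomains such as blog.yourdomain.com, which is why they do not need separate entries. The flip side is that it would also match a hostname that merely contains your domain name, so treat this as a convenience rather than a strict whitelist.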
|Ghost Spam Worst Offenders|
|vitaly rules google|Get-Free-Traffic-Now.com|social-buttons.com|
|o-o-6-o-o.com / referral|hulfingtonpost.com|forum20.smailik.org|
STEP 2: FILTERING NON-GHOST REFERRER SPAM
What is non-ghost referrer spam?
Non-ghost referrer spam is also known as crawler referrer spam. While good web crawlers like Googlebot crawl your site so that they can index your content for the search engines, crawler referrer spam browses your site with different intentions, e.g. getting your web property I.D. and sending traffic to their own site.
While ghost spam does not visit your site, crawlers visit your page and can do damage. Even worse is the fact that crawlers use valid domains e.g. apple.com, so that they don’t look out of place in your GA reports. But fear not, I’ll show you how to identify them in a few seconds.
How to identify non-ghost spam
Unlike ghost spam, crawlers use valid hostnames to send fake referral traffic to your site. Therefore you won’t be able to identify them using the method I gave above. Instead, here is how to spot these smart bots:
1. Go to Acquisition> All Traffic> Channels> Referrals
2. Select Secondary Dimensions> Behavior> Hostnames
3. Use the list provided below to identify which crawler spam is using a valid hostname
4. List the domains of all the crawler spam in a word/excel document
Creating a custom filter for non-ghost spam
Unlike ghost spam, crawler referrer spam visits your site. This means that you can block them through coding and plug-ins. However, below I will show you how to filter the spam out from your GA. Like the filter above, the Crawler Spam Filter comes courtesy of Carlos Escalera.
1. Go to Admin> Filters> New Filter
2. Enter ‘Crawler Spam’ as the Filter name
3. Filter Type> Custom> Exclude
Filter Field> Campaign Source
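4. Create a regular expression that includes the domains of the crawler spam that you have already identified. The format is the same as before: separate the domains with a pipe (|) and escape each dot with a backslash, so it should look something like this (placeholder names): spam-domain-1\.com|spam-domain-2\.com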
5. Insert this regular expression in the Filter Pattern box
6. Verify the filter, then click save.
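As your list of crawler spam grows, one expression can become too long for a single filter. The sketch below splits a list of domains into several expressions, one per filter. Two things here are my own assumptions: the 255-character ceiling (the commonly cited limit for a GA filter pattern, so check it against your own account), and the two sample domains, which are simply the ones a reader mentions in the comments.

```python
def build_exclude_patterns(domains, max_length=255):
    """Group escaped domains into '|'-joined patterns no longer than max_length."""
    patterns, current = [], ""
    for domain in domains:
        escaped = domain.strip().lower().replace(".", r"\.")
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) > max_length and current:
            # The current pattern is full: store it and start a new one.
            patterns.append(current)
            current = escaped
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns


# Illustrative domains only; use the list you collected from your own reports.
crawler_domains = ["buttons-for-website.com", "best-seo-offer.com"]

for number, pattern in enumerate(build_exclude_patterns(crawler_domains), start=1):
    print(f"Crawler Spam {number}: {pattern}")
# Crawler Spam 1: buttons-for-website\.com|best-seo-offer\.com
```

Each expression it prints would go into its own Exclude filter on Campaign Source, following the same steps as above.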
|Crawler Spam Worst Offenders
STEP 3: BLOCKING BOTS AND SPIDERS
These bots and spiders are not harmful, as they help with your search engine rankings. Instead of blocking them from your site, you can exclude their hits from your GA reports by following the easy step below:
Go to Admin> View Settings> Check box marked ‘Exclude all hits from known bots and spiders’> Save
EXTRA TIPS FOR FILTERING REFERRER SPAM
1. Filters permanently alter the data in the view they are applied to, so before you create any filters, create a new GA view. This way you keep one view with completely unfiltered data. You can do this by simply going to Admin> View Settings> Copy View.
2. Once you identify ghost spam in your Google Analytics report, do not try to visit these sites. While some spam bots are harmless, many of them are looking to intentionally install malware on your computer.
3. Spam bots target weak and vulnerable sites more often than they do protected sites. You should therefore look at investing in quality hosting, to reduce the frequency of these attacks.
4. Check your GA report for new referrer spam on a monthly basis, and then update your filters accordingly. Depending on the size of your site, you can choose to ignore the spam bots sending negligible traffic, and block the ones that are sending large amounts of fake traffic to your site.
5. Use a firewall for your site; this acts as the first line of defense for bad spam bots.
6. In my previous post I talked about using custom alerts to monitor unusual shifts in traffic patterns. The purpose of custom alerts is to let you know when your site is experiencing a huge spike or drop in traffic. These alerts can also bring your attention to spikes of traffic caused by an influx of hits by bad spam bots. In this way you can take immediate steps to block these bots from damaging your site.
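To show the idea behind tip 6, here is a rough Python sketch of spotting a traffic spike. It is not how GA custom alerts work internally; it is only my own simplified stand-in, with made-up daily session counts and an arbitrary rule of "more than three times the median day".

```python
from statistics import median


def find_spikes(daily_sessions, factor=3.0):
    """Return the days whose sessions exceed `factor` times the median day."""
    baseline = median(daily_sessions.values())
    return [day for day, sessions in daily_sessions.items() if sessions > factor * baseline]


# Hypothetical week of sessions; Thursday contains a burst of spam hits.
daily_sessions = {
    "Mon": 40, "Tue": 38, "Wed": 45, "Thu": 210, "Fri": 42, "Sat": 36, "Sun": 39,
}

print(find_spikes(daily_sessions))  # ['Thu']
```

I used the median rather than the average so that the spike itself does not drag the baseline upwards.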
BLOCKING REFERRER SPAM
The 3 steps I have provided above will help you filter referrer spam from your Google Analytics. However, these methods do not stop the spam from crawling your site to begin with, and they do not block spam from hitting your web server. To do this you will need to block the crawler referral spam completely.
There are several methods of blocking crawler referral spam before it reaches your Google Analytics report, and these include adding code to your .htaccess file, deflecting spam traffic, changing your tracking I.D., adding a blacklist of referrers, and using WordPress plug-ins.
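To make the "blacklist of referrers" idea a little more concrete, here is a minimal sketch in Python. It assumes your site runs as a Python WSGI application, which will not be true for most WordPress blogs, so read it as an illustration of the logic rather than something to paste into your site. The names `block_spam_referrers` and `hello_app` are my own, and the blocked domains are just examples.

```python
from urllib.parse import urlparse

# Illustrative blacklist; replace with the crawler domains from your own reports.
BLOCKED_REFERRERS = {"buttons-for-website.com", "best-seo-offer.com"}


def block_spam_referrers(app, blocked=BLOCKED_REFERRERS):
    """Wrap a WSGI app and answer 403 when the Referer host is blacklisted."""
    def middleware(environ, start_response):
        referer = environ.get("HTTP_REFERER", "")
        host = (urlparse(referer).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host in blocked:
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        # Anything else is passed through to the real application.
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello"]


app = block_spam_referrers(hello_app)
```

Keep in mind that this kind of blocking can only affect crawler spam. Ghost spam never touches your web server, so it still has to be handled with the Valid Hostnames filter.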
Each of these methods has its benefits and drawbacks, and if you would like to know more about these methods, you can check out the helpful resources below:
The Valid Hostname Filter will include valid hostnames and exclude spammer hostnames, and the Crawler Spam Filter will exclude all traffic coming from spam bots that are crawling through your site. When you use these 2 filters with the bots and spiders exclusion setting, you will efficiently filter out referral spam from your Google Analytics.
No more struggling with inflated traffic amounts and corrupted data. By applying the simple filters above, you can ensure that you only receive a clear and true depiction of your traffic and audience behavior (bounce rate, duration, sessions).
Before you leave, drop a comment below and tell me if the filters are working for you. Also share this with your friends, so that they know how to keep their GA safe from those pain in the ass spammers. And don’t forget to subscribe to my spam-free newsletter for all things Business Broken Down.
It was good talking to you, and I hope to see you next week.
42 thoughts on “How to Filter Out Referrer Spam in Google Analytics”
Great post! Perfect for bloggers, and I will for sure be using these tips on my blog!
I’m glad you enjoyed the post. Let me know if you need any more help on the filters.
Wow. Great research and awesome information!
I’m glad you are happy with the information. I hope you can use it to sort out any spam that you find on your Google Analytics.
Bookmarked – this looks very useful, thanks! I was getting loads of referrer spam at first, and I’ve definitely seen ‘buttons-for-website.com’ and ‘best-seo-offer.com’ many times. I blocked the worst ones, including those, and it hasn’t been as bad lately (fingers crossed) but I’ll keep an eye out for these referrers now!
Happy that you found the information useful. I was also plagued with referrer spam, but now that I’ve put in the filters I haven’t gotten any. If you have the right filters then maintaining a spam-free GA will be easy.
Wow! Geesh… so much to learn. I had no idea referral spam was a thing. Thank you for teaching me not only what that is but also how to get rid of it
Hi Mama Munchkin,
I was also in the dark about referrer spam until just a few days ago. Once I figured out that this was happening under my nose, I realized that it was probably happening to others. Hopefully you can set up the filters I showed you to get rid of this headache.
What a great post. I’ve always wondered about referrer traffic but have never really looked into it that deeply. I will now; it’s on the list.
Thank you; I’m glad you got to know what referrer spam is. Setting up the filters shouldn’t take long, so good luck with that.
A very useful post – I had no idea about all of this! I did start glazing over because it’s a bit too technical for me, so I’ll pass this over to my VAs to get done instead. 😉
I’m glad you found the post useful. It does get technical in some places, but I hope the screen shots will help guide you/or your VAs if you get stuck.
Awesome write up indeed. Thank you for sharing your knowledge.
When I try to verify the Crawler Spam Filter it gives me this error; “This filter would not have changed your data. Either the filter configuration is incorrect, or the set of sampled data is too small.”
It would be great if you could help.
I’m glad you enjoyed the post. I got that notification as well. The verification filter works by analyzing the last 7 days of your GA data. On some days your crawler spam may be negligible, meaning that your data is barely affected- hence the statement ‘or the set of sampled data is too small’.
If you have created the filter correctly you have nothing to worry about. Just give it 2 days and check your GA referral traffic to see if there’s any more spam. If there isn’t then the filter is working, if there is, then contact me via the blog or social media and we’ll go over the filters together.
Thanks for the prompt feedback. I appreciate it, Davina. I shall monitor over next few days and get back to you on this one.
You’re welcome, Rushi. Will wait to hear from you 🙂
Thanks for this Davina. I’ll be passing the details to my tech person and preparing my ego for the bad news when the correction is applied to my stats.
A good tub of ice cream should make the traffic drop more manageable. P.S. If you are using a good hosting platform, then your crawler spam should be minimal. My ghost spam was distorting my results, but my crawler spam was negligible. Good luck.
It’s always nice to get more info related to anything Google! Thank you for sharing and for your help!!
I’m glad you found the post helpful. There’s another article on translating Google Analytics data into traffic and revenue, if you want more information on this subject.
Wow, this is incredibly helpful. I will be pinning this on my blogging board on pinterest to refer to later when I have the time (and patience) to cut out the referrer spam. I remember when I saw my google analytics once compared to my WordPress I was really confused about the disparity…now I understand.
I’m glad you found the article helpful, and thank you for sharing it on Pinterest. I also noticed the difference in my stats, and hopefully these filters will align your data as they have with mine.
Thanks for sharing! I never even knew this existed so I’ll be sure to look into it now 🙂
I’m happy to help. I hope the guide will make it easy for you to put in the filters.
This is SO helpful! I was looking into my analytics just last night and was wondering what the floating social icons was!
Hi Lee Anne,
I’m glad you found this helpful. If you need any other assistance with Google Analytics, feel free to contact me.
Wow, so much information! Thanks and now I will go look at ours
It’s definitely a lot to take in at first, but once you put in the filters maintenance becomes easy.
Thanks so much for providing some much needed and very helpful information!
You’re very welcome, thank you for taking time to read the post.
The worst part of referral spam (to me) is the inane and non relevant comments it leaves which then I have to go and delete. Thanks for the step-by-step on how to do this!
Hate those comments. I would advise you to use the Akismet plug-in, which effectively blocks/filters all spam so that you don’t have to go through the hassle of deleting them yourself. You’re welcome, thanks for stopping by 🙂
Thank you for this step-by-step post. I am bookmarking it so I can use it this weekend to clean up my GA
You’re welcome 🙂 If you need any extra help with cleaning out your GA, let me know.
Ahhh we’ve been looking for a tutorial like this for so long! Thanks for sharing; this is great!
Glad to help 🙂 If you need any extra help with the filters, let me know.
Very interesting post. I had no idea of this. I am not very technical but this is one thing I am going to work on. I also Stumbled it
I’m happy that you found the post interesting. I hope the pictures will help you along as you add the filters, and thank you for stumbling it.
Really Good article. You explained it well and this will help a lot of users who do not know how to block referral spam in google analytics. Also you can block these automatically by adding Cloudflare to your site. Cloudflare is free and it will eliminate 95% of referral spam and Ghost spam but the remaining you have to block manually.
Thank you Chaithanya, I’m glad to hear that the guide is helpful. I’ll definitely check out Cloudflare, thank you for pointing it out.
Very nice, neatly written and highly elaborated presentation.
So useful for beginners.
Thank you so much for such a great post.
Thank you Prof. I’m glad you enjoyed it 🙂