How to Filter Out Referrer Spam in Google Analytics

I get a healthy amount of referral traffic, but for a long time I did not know what that actually meant. Social traffic clearly comes from social media pages, Organic Search traffic comes from search engines like Google, and Direct traffic is usually a combination of unreported organic search, direct typing of your URL, and bookmarks.

Then there is Referral traffic which is defined as the following:

Google’s method of reporting visits that came to your site from sources outside of its search engine. When someone clicks on a hyperlink to go to a new page on a different website, Analytics tracks the click as a referral visit to the second site.

 

When I reviewed my Google Analytics Acquisition data (which provides valuable information for your blog/business), I found that a large share of my traffic came from referral channels. Once I dug deeper, I found that these sources included ‘floating share buttons’, ‘get free social traffic’, and other obscure sources.

So what does this mean exactly? Well, this is all referrer spam.

 

What is Referrer Spam?

Referrer spam is fake traffic from spam bots, which mimic a referral link. In this case a spammer makes multiple requests to your website, using a fake referrer URL to a site that they want to advertise. The aim of spammers is to promote their site, to index the contents of your website, and to build links to their site so that they can improve their search engine rankings.

Additionally, as spam bots crawl through your site, they eat up bandwidth and slow your site down. Furthermore, some spam bots also identify (and sometimes attack) vulnerabilities on your site.

Now, there are 2 main types of referrer spam: ghost referrer spam and non-ghost referrer spam (crawlers).

 

The Ugly Side of Referrer Spam

Someone somewhere directs spam to Google Analytics accounts, leaving laymen like me feeling like traffic royalty. This referrer spam distorts your Google Analytics data, making any research and analysis based on it unreliable. This can be especially crippling for small businesses that are not receiving a lot of traffic to begin with.

At the moment Google has left us in the dark, meaning that we have to stop this referral spam ourselves. There are many ways of doing this, such as editing your .htaccess file, but I don’t really understand this method, and I’m wary of making coding changes.

Thankfully, you can filter referrer spam right from your Google Analytics (GA) account. Granted, the drop in traffic will be a blow to your ego, but it should motivate you to work harder on boosting your traffic. Below I will describe each type of referrer spam, and then show you how to filter it effectively.

 

STEP 1: FILTERING GHOST REFERRER SPAM

What is ghost referral spam?
Ghost spam comes from spam bots that never interact with your site. These spammers send data directly to the GA servers, so that they appear as referral traffic in your reports.

Therefore you are getting referral traffic like ‘floating share buttons’ from spammers who have never accessed your site.

We won’t get into the technicalities of how these spammers achieve this, but in summary they generate GA tracking IDs at random (UA-XXXXXX-1), and when one of them matches yours, the fake referral traffic lands in your reports.

Because this type of spam does not pass through your website, you cannot block it through coding and plug-ins. The only way to stop it from messing with your data is to use the filter that I will provide below.

 

How to identify ghost spam
Ghost spam tends to be sporadic, so when you check your GA data you will find that there will be spikes of ghost spam, followed by several days of no ghost spam. Spotting ghost spam is not difficult, as the data provided is clearly false.

To identify ghost spam take the following actions:

1. Go to Reporting> Acquisition> All Traffic> Channels> Referral
2. Once you are here click Secondary Dimensions> Behavior> Hostnames
3. Your screen will split into Source and Hostnames, and you can easily see which sources and hostnames look out of place (a list of some of the most common ghost spam sources is provided below).

Additionally, you will find that most of the ghost spam sources give you a ‘not set’ value when it comes to the corresponding hostnames.

P.S. Most ghost spam has a bounce rate of 0% or 100%.
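
If you export that report to a spreadsheet, the two warning signs above can be checked in a few lines of Python. This is only a rough sketch: the row layout, the sample numbers, and the function name ghost_spam_signals are assumptions made up for illustration, and ‘(not set)’ is assumed to be how a missing hostname shows up in your export.

```python
# Hypothetical rows copied from the Referral report with Hostname as the
# secondary dimension: (source, hostname, bounce rate in percent).
# The bounce rates are invented sample values.
rows = [
    ("site1.floating-share-buttons.com", "(not set)", 100.0),
    ("get-free-social-traffic.com", "(not set)", 0.0),
    ("pinterest.com", "www.yourdomain.com", 42.5),
]


def ghost_spam_signals(hostname, bounce_rate):
    """Return the warning signs found for one row of the report."""
    signals = []
    # Sign 1: the hostname is missing, so the hit never touched your site.
    if hostname.strip().lower() in ("", "(not set)"):
        signals.append("hostname not set")
    # Sign 2: a bounce rate of exactly 0% or 100% (a weaker hint on its own).
    if bounce_rate in (0.0, 100.0):
        signals.append("extreme bounce rate")
    return signals


flagged = [
    (source, ghost_spam_signals(hostname, bounce))
    for source, hostname, bounce in rows
    if ghost_spam_signals(hostname, bounce)
]

for source, signals in flagged:
    print(source, "->", ", ".join(signals))
```

Treat anything this flags as a candidate to look at, not as proof. A genuine page with a handful of visits can also show a 100% bounce rate.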

 

How to identify valid hostnames

Before you create a filter for ghost spam, you need to identify your valid hostnames. You can do this easily by taking the following actions:

 

1. Go to Reporting> Audience> Technology> Network
2. Select Hostname as the Primary Dimension

3. Copy down all the valid hostnames that you see. The valid hostnames are where your GA tracking code appears, so all your valid hostnames should have a direct connection to your site. A hostname like amazon.com is not valid, as you have not put your tracking code on this site.

These are some valid hostnames that will appear:

www.yourdomain.com, yourdomain.com, blog.yourdomain.com, support.yourdomain.com, shoppingcart.com, translate.googleusercontent.com (this is used by visitors coming from other countries who need translation services).

 

Creating a custom filter for ghost spam
The following filter will take care of all your ghost spam, regardless of whether it appears in your referral, direct, or organic traffic channels. This filter is called the Valid Hostnames Filter, and it is the brainchild of Carlos Escalera at Ohow.

As I mentioned above, ghost spam uses invalid hostnames. Therefore, this filter works to include only valid hostnames in your GA report. To create the filter take the following actions:

1. Create an expression that includes all the valid hostnames that you have gathered. The expression will appear like this:
yourdomain.com|translateservice.com|yourshoppingcart.com|otherservice.net|otherdomain.com
2. Go to Admin> Filters> New Filter> Create New Filter
3. Enter ‘Valid Hostnames’ as the Filter Name
4. Filter Type> Custom> Include
Filter Field> Hostname

5. Copy the valid hostnames expression that you created in point 1, and then paste it in the box marked Filter Pattern.
6. Click ‘Verify this filter’ before you save it. You should see a table showing your data before and after the filter is applied.
7. If the table looks good, click Save.

P.S. If you add your GA tracking I.D. to video or ecommerce services like PayPal and YouTube, make sure that you add the relevant hostname to the filter above. This way that traffic can be recognized.
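
Before pasting the expression into GA, you can try it out on your own machine. The sketch below builds the expression from a list of hostnames and checks a few sample values against it. It assumes the filter matches the pattern anywhere inside the hostname (which is how Python's re.search behaves); the helper names and sample hostnames are made up for illustration. Escaping the dots is an optional tightening, and the unescaped expression shown in point 1 matches the same hostnames.

```python
import re


def build_include_pattern(hostnames):
    """Join hostnames with '|' and escape dots so '.' only matches a real dot."""
    return "|".join(h.replace(".", r"\.") for h in hostnames)


# Hypothetical list of valid hostnames gathered from the Network report.
valid_hostnames = [
    "yourdomain.com",
    "translate.googleusercontent.com",
    "yourshoppingcart.com",
]

pattern = build_include_pattern(valid_hostnames)
print(pattern)


def is_kept(hostname):
    """True when an Include filter using this pattern would keep the hit."""
    return re.search(pattern, hostname) is not None


for hostname in ["www.yourdomain.com", "blog.yourdomain.com", "(not set)", "amazon.com"]:
    print(hostname, "->", "kept" if is_kept(hostname) else "filtered out")
```

Because the match is partial, one entry for yourdomain.com also covers www.yourdomain.com and blog.yourdomain.com, so you do not need to list every subdomain separately.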

 

Ghost Spam Worst Offenders
video–production.com get-free-social-traffic.com wpsecuritycheck.co.uk
buttons-for-website.com chinese-amezon.com wpthemedetector.co.uk
how-to-earn-quick-money.com satellite.maps.ilovevitaly.com erot.co
hongfanji.com traffic2money.com webmonetizer.net
sexyali.com site#.floating-share-buttons.com howtostopreferralspam.eu
free-floating-buttons.com e-buyeasy.com trafficmonetizer.org
wpsecuritycheck.co.uk wpthemedetector.co.uk trafficmonetize.org
непереводимая.рф sanjosestartups.com depositfiles-porn.ga
непереводимая.рф site1.floating-share-buttons.com site2.floating-share-buttons.com
4webmasters.org site3.floating-share-buttons.com s.click.aliexpress.com
websites-reviews.com youporn-forum.ga event-tracking.com
webmaster-traffic.com buy-cheap-online.info simple-share-buttons.com
vitaly rules google Get-Free-Traffic-Now.com social-buttons.com
rapidgator-porn.ga addons.mozilla.org s.click.aliexpress.com
meendo-free-traffic.ga googlsucks.com o-o-8-o-o.com
humanorightswatch.org darodar.com resellerclub scam
o-o-6-o-o.com / referral hulfingtonpost.com forum20.smailik.org
bestwebsitesawards.com ilovevitaly.com torture.ml
resellerclub scam blackhatworth.com amanda-porn.ga

 

STEP 2: FILTERING NON-GHOST REFERRER SPAM

What is non-ghost referrer spam?
Non-ghost referrer spam is also known as crawler referrer spam. While good web crawlers like Googlebot crawl your site so that they can index your content for the search engines, crawler referrer spam browses your site with different intentions, e.g. getting your web property I.D. and sending traffic to their own site.

While ghost spam does not visit your site, crawlers visit your page and can do damage. Even worse is the fact that crawlers use valid domains e.g. apple.com, so that they don’t look out of place in your GA reports. But fear not, I’ll show you how to identify them in a few seconds.

 

How to identify non-ghost spam
Unlike ghost spam, crawlers use valid hostnames to send fake referral traffic to your site. Therefore you won’t be able to identify them using the method I gave above. Instead, here is how to spot these smart bots:

1. Go to Acquisition> All Traffic> Channels> Referrals
2. Select Secondary Dimensions> Behavior> Hostnames
3. Use the list provided below to identify which crawler spam is using a valid hostname
4. List the domains of all the crawler spam in a Word or Excel document

 

Creating a custom filter for non-ghost spam
Unlike ghost spam, crawler referrer spam visits your site. This means that you can block them through coding and plug-ins. However, below I will show you how to filter the spam out from your GA. Like the filter above, the Crawler Spam Filter comes courtesy of Carlos Escalera.

1. Go to Admin> Filters> New Filter
2. Enter ‘Crawler Spam’ as the Filter name
3. Filter Type> Custom> Exclude
Filter Field> Campaign Source
4. Create a regular expression, which includes the domains of the crawler spam that you have already identified. The format is the same, and it should look something like this:
buttons-for-website.com|success-seo.com|forum69.com
5. Insert this regular expression in the Filter Pattern box
6. Verify the filter, then click save.
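
As your list of crawler domains grows, building the expression by hand gets error prone. Here is a small sketch that removes duplicates, escapes the dots, and splits the result into more than one expression if it gets too long. The 255-character cap is an assumption based on commonly reported limits for the Filter Pattern box, so check it against your own account; the function names are made up for illustration.

```python
import re

MAX_PATTERN_LENGTH = 255  # assumed limit for the Filter Pattern box


def build_exclude_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Deduplicate domains and pack them into as few patterns as fit the limit."""
    patterns = []
    current = ""
    for domain in sorted(set(d.strip().lower() for d in domains)):
        escaped = domain.replace(".", r"\.")
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) > max_length and current:
            # The current pattern is full: store it and start a new one.
            patterns.append(current)
            current = escaped
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns


# Domains from the example expression above, plus one duplicate in another case.
crawler_spam = [
    "buttons-for-website.com",
    "success-seo.com",
    "forum69.com",
    "Success-SEO.com",
]

patterns = build_exclude_patterns(crawler_spam)
print(patterns)


def is_excluded(source):
    """True when an Exclude filter using these patterns would drop the source."""
    return any(re.search(p, source.lower()) for p in patterns)


print(is_excluded("buttons-for-website.com"))
print(is_excluded("pinterest.com"))
```

If the function returns more than one pattern, you would create one Exclude filter per pattern (for example ‘Crawler Spam 1’ and ‘Crawler Spam 2’).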

 

Crawler Spam Worst Offenders

 

7makemoneyonline.com best-seo-offer.com dailyrank.net
anticrawler.org 100dollars-seo.com sitevaluation.org
baixar-musicas-gratis.com forum69.info semalt.semalt.com
descargar-musica-gratis.net best-seo-solution.com semalt.com
buttons-for-website.com buttons-for-your-website.com semaltmedia.com

 

STEP 3: BLOCKING BOTS AND SPIDERS

Known bots and spiders, such as search engine crawlers, are not harmful, as they help with your search engine rankings. Instead of blocking them from your site, you can exclude their hits from your reports by following the easy step below:

Go to Admin> View Settings> Check box marked ‘Exclude all hits from known bots and spiders’> Save

EXTRA TIPS FOR FILTERING REFERRER SPAM

1. Filters permanently alter your data, so before you create any filters, create a new GA view. This way you keep one view with GA data that is completely unfiltered. You can do this by simply going to Admin> View Settings> Copy View.

2. Once you identify ghost spam in your Google Analytics report, do not try to visit these sites. While some spam bots are harmless, many of them are looking to intentionally install malware on your computer.

3. Spam bots target weak and vulnerable sites more often than they do protected sites. You should therefore look at investing in quality hosting, to reduce the frequency of these attacks.

4. Check your GA report for new referrer spam on a monthly basis, and then update your filters accordingly. Depending on the size of your site, you can choose to ignore the spam bots sending negligible traffic, and block the ones that are sending large amounts of fake traffic to your site.

5. Use a firewall for your site; this acts as the first line of defense against bad spam bots.

6. In my previous post I talked about using custom alerts to monitor unusual shifts in traffic patterns. The purpose of custom alerts is to let you know when your site is experiencing a huge spike or drop in traffic. These alerts can also bring your attention to spikes of traffic caused by an influx of hits by bad spam bots. In this way you can take immediate steps to block these bots from damaging your site.
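
The idea behind a traffic-spike alert in tip 6 can also be sketched outside of GA, for example on a column of daily sessions that you have exported. This is not how GA's custom alerts are implemented; the 7-day window, the 2x threshold, and the sample numbers are all assumptions chosen for illustration.

```python
def find_spikes(daily_sessions, window=7, factor=2.0):
    """Return the positions of days whose sessions exceed `factor` times
    the average of the previous `window` days."""
    spikes = []
    for i in range(window, len(daily_sessions)):
        baseline = sum(daily_sessions[i - window:i]) / window
        if baseline > 0 and daily_sessions[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Invented sample data: ten days of sessions with one suspicious jump.
sessions = [40, 42, 38, 45, 41, 39, 43, 44, 160, 41]
print(find_spikes(sessions))
```

A flagged day only tells you where to look. You still need to open the referral report for that date and check whether the extra sessions come from spam sources or from a genuine mention of your site.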

 

BLOCKING REFERRER SPAM

The 3 steps I have provided above will help you filter referrer spam from your Google Analytics. However, these methods do not stop the spam from crawling your site to begin with, and they do not block spam from hitting your web server. To do this you will need to block the crawler referral spam completely.

There are several methods of blocking crawler referral spam before it reaches your Google Analytics report, and these include adding code to your .htaccess file, deflecting spam traffic, changing your tracking I.D., adding a blacklist of referrers, and using WordPress plug-ins.
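
To give a feel for what a referrer blacklist does, here is a minimal sketch written as Python WSGI middleware. It is not taken from any of the resources below, and most blogs would do the same job in .htaccess or a plug-in instead. The wrapper name, the tiny sample site, and the choice to answer with 403 Forbidden are assumptions for illustration; the blacklisted domains come from the crawler list above.

```python
import re

# A few crawler spam domains from the list above, matched case-insensitively.
BLOCKED_REFERRERS = re.compile(
    r"semalt\.com|buttons-for-website\.com|best-seo-offer\.com",
    re.IGNORECASE,
)


def block_spam_referrers(app):
    """Wrap a WSGI app and refuse requests whose Referer header is blacklisted."""
    def middleware(environ, start_response):
        referrer = environ.get("HTTP_REFERER", "")
        if BLOCKED_REFERRERS.search(referrer):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        # Anything else is passed through to the real site untouched.
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Stand-in for your real website."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Welcome"]


protected_site = block_spam_referrers(site)
```

Keep in mind that this kind of block only helps against crawler spam. Ghost spam never reaches your server, so it still needs the Valid Hostnames filter.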

Each of these methods has its benefits and drawbacks, and if you would like to know more about these methods, you can check out the helpful resources below:

 

RESOURCES
1. https://moz.com/blog/how-to-stop-spam-bots-from-ruining-your-analytics-referral-data
2. http://blog.raventools.com/stop-referrer-spam/
3. https://en.wikipedia.org/wiki/Referer_spam
4. http://www.optimizesmart.com/geek-guide-removing-referrer-spam-google-analytics/
5. http://www.ohow.co/what-is-referrer-spam-how-stop-it-guide/
6. http://www.analyticsedge.com/2014/12/removing-referral-spam-google-analytics/

 

Final Verdict

The Valid Hostname Filter will include valid hostnames and exclude spammer hostnames, and the Crawler Spam Filter will exclude all traffic coming from spam bots that are crawling through your site. When you use these 2 filters with the bots and spiders exclusion setting, you will efficiently filter out referral spam from your Google Analytics.

No more struggling with inflated traffic amounts and corrupted data. By applying the simple filters above, you can ensure that you only receive a clear and true depiction of your traffic and audience behavior (bounce rate, duration, sessions).

Before you leave, drop a comment below and tell me if the filters are working for you. Also share this with your friends, so that they know how to keep their GA safe from those pain in the ass spammers. And don’t forget to subscribe to my spam-free newsletter for all things Business Broken Down.

It was good talking to you, and I hope to see you next week.

 

42 thoughts on “How to Filter Out Referrer Spam in Google Analytics”

  1. Bookmarked – this looks very useful, thanks! I was getting loads of referrer spam at first, and I’ve definitely seen ‘buttons-for-website.com’ and ‘best-seo-offer.com’ many times. I blocked the worst ones, including those, and it hasn’t been as bad lately (fingers crossed) but I’ll keep an eye out for these referrers now!

    1. Hi,

      Happy that you found the information useful. I was also plagued with referrer spam, but now that I’ve put in the filters I haven’t gotten any. If you have the right filters then maintaining a spam-free GA will be easy.

      Asante,
      Davina

    1. Hi Mama Munchkin,

      I was also in the dark about referrer spam until just a few days ago. Once I figured out that this was happening under my nose, I realized that it was probably happening to others. Hopefully you can set up the filters I showed you to get rid of this headache.

      Asante,
      Davina

  2. Hi Sophia

    A very useful post – I had no idea about all of this! I did start glazing over because it’s a bit too technical for me, so I’ll pass this over to my VAs to get done instead. 😉

    Thanks, Una

    1. Hi Una,

      I’m glad you found the post useful. It does get technical in some places, but I hope the screenshots will help guide you or your VAs if you get stuck.

      Asante,
      Davina

  3. Awesome write up indeed. Thank you for sharing your knowledge.
    When I try to verify the Crawler Spam Filter it gives me this error; “This filter would not have changed your data. Either the filter configuration is incorrect, or the set of sampled data is too small.”

    It would be great if you could help.

    1. Hi Rushi,

      I’m glad you enjoyed the post. I got that notification as well. The verification filter works by analyzing the last 7 days of your GA data. On some days your crawler spam may be negligible, meaning that your data is barely affected- hence the statement ‘or the set of sampled data is too small’.

      If you have created the filter correctly you have nothing to worry about. Just give it 2 days and check your GA referral traffic to see if there’s any more spam. If there isn’t then the filter is working, if there is, then contact me via the blog or social media and we’ll go over the filters together.

      Asante,
      Davina

      1. Thanks for the prompt feedback. I appreciate it, Davina. I shall monitor over next few days and get back to you on this one.
        Thanks again.

    1. Hi David,

      A good tub of ice cream should make the traffic drop more manageable. P.S. If you are using a good hosting platform, then your crawler spam should be minimal. My ghost spam was distorting my results, but my crawler spam was negligible. Good luck.

      Asante,
      Davina

  4. Wow, this is incredibly helpful. I will be pinning this on my blogging board on pinterest to refer to later when I have the time (and patience) to cut out the referrer spam. I remember when I saw my google analytics once compared to my WordPress I was really confused about the disparity…now I understand.

    1. Hi Sam,

      I’m glad you found the article helpful, and thank you for sharing it on Pinterest. I also noticed the difference in my stats, and hopefully these filters will align your data as they have with mine.

      Asante,
      Davina

    1. Hi Katy,

      Hate those comments. I would advise you to use the Akismet plug-in, which effectively blocks/filters all spam so that you don’t have to go through the hassle of deleting them yourself. You’re welcome, thanks for stopping by 🙂

      Asante,
      Davina

    1. Hi Maria,

      I’m happy that you found the post interesting. I hope the pictures will help you along as you add the filters, and thank you for stumbling it.

      Asante,
      Davina

  5. Really Good article. You explained it well and this will help a lot of users who do not know how to block referral spam in google analytics. Also you can block these automatically by adding Cloudflare to your site. Cloudflare is free and it will eliminate 95% of referral spam and Ghost spam but the remaining you have to block manually.
