Stop Ghost Spam in Google Analytics with One Filter

Posted by CarloSeo

Spam in Google Analytics (GA) is becoming a serious issue. Thanks to a deluge of referral spam from social buttons, adult sites, and many, many other sources, people are becoming overwhelmed by the number of filters they have to set up to manage the useless data they receive.

The good news is, there is no need to panic. In this post, I’m going to focus on the most common mistakes people make when fighting spam in GA, and explain an efficient way to prevent it.

But first, let’s make sure we understand how spam works. A couple of months ago, Jared Gardner wrote an excellent article explaining what referral spam is, including its intended purpose. He also pointed out some great examples of referral spam.

Types of spam

Spam in Google Analytics falls into two categories: ghosts and crawlers.

Ghosts

The vast majority of spam is this type. They are called ghosts because they never access your site. It is important to keep this in mind, as it’s key to creating a more efficient solution for managing spam.

As unusual as it sounds, this type of spam doesn’t have any interaction with your site at all. You may wonder how that is possible since one of the main purposes of GA is to track visits to our sites.

They do it by using the Measurement Protocol, which allows people to send data directly to Google Analytics’ servers. Using this method, and probably randomly generated tracking codes (UA-XXXXX-1) as well, the spammers leave a “visit” with fake data, without even knowing who they are hitting.
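To make the mechanics concrete, here is a minimal sketch (in Python, using the requests library) of the kind of hit a ghost spammer can generate. The tracking ID, client ID, page path, and referrer below are placeholders; the point is that the request goes straight to Google Analytics’ collection endpoint and never touches the target website.

import requests

# A single fake "pageview" sent directly to GA's Measurement Protocol endpoint.
# The target site itself is never requested.
payload = {
    "v": "1",                                  # Measurement Protocol version
    "tid": "UA-123456-1",                      # guessed / randomly generated tracking ID (placeholder)
    "cid": "555",                              # arbitrary client ID
    "t": "pageview",                           # hit type
    "dp": "/fake-page",                        # fake page path
    "dr": "http://spammy-referrer.example",    # fake referrer that later shows up in reports
}
requests.post("https://www.google-analytics.com/collect", data=payload)

Because nothing in that request is validated against a real website, the hostname can be anything at all — which, as we’ll see below, is exactly the weakness the solution in this post exploits.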

Crawlers

This type of spam, the opposite of ghost spam, does access your site. As the name implies, these spam bots crawl your pages, ignoring rules like those in robots.txt that are supposed to stop them from reading your site. When they exit your site, they leave a record in your reports that looks like a legitimate visit.

Crawlers are harder to identify because they know their targets and use real data. But it is also true that new ones seldom appear. So if you detect a referral in your analytics that looks suspicious, researching it on Google or checking it against this list might help you answer the question of whether or not it is spammy.

Most common mistakes made when dealing with spam in GA

I’ve been following this issue closely for the last few months. According to the comments people have made on my articles and conversations I’ve found in discussion forums, there are primarily three mistakes people make when dealing with spam in Google Analytics.

Mistake #1. Blocking ghost spam from the .htaccess file

One of the biggest mistakes people make is trying to block Ghost Spam from the .htaccess file.

For those who are not familiar with this file, one of its main functions is to allow/block access to your site. Now we know that ghosts never reach your site, so adding them here won’t have any effect and will only add useless lines to your .htaccess file.

Ghost spam usually shows up for a few days and then disappears. As a result, sometimes people think they successfully blocked it there, when really it’s just a coincidence of timing.

Then, when the spam later returns, they get worried because the solution is no longer working, and they think the spammer has somehow bypassed the barriers they set up.

The truth is, the .htaccess file can only effectively block crawlers such as buttons-for-website.com and a few others since these access your site. Most of the spam can’t be blocked using this method, so there is no other option than using filters to exclude them.

Mistake #2. Using the referral exclusion list to stop spam

Another error is trying to use the referral exclusion list to stop the spam. Despite its name, this list is not intended to exclude referrals the way we want to for spam; it has other purposes.

For example, when a customer buys something, they sometimes get redirected to a third-party page for payment. After making a payment, they’re redirected back to your website, and GA records that return as a new referral. The referral exclusion list is the appropriate tool to prevent this from happening.

If you try to use the referral exclusion list to manage spam, however, the referral part will simply be stripped, and a direct visit will be recorded instead. As a result, you will have a bigger problem than the one you started with: you will still have the spam, and direct visits are harder to track.

Mistake #3. Worrying that bounce rate changes will affect rankings

When people see that the bounce rate changes drastically because of the spam, they start worrying about the impact that it will have on their rankings in the SERPs.

[Image: bounce rate report affected by spam]

This is another common mistake. With or without spam, Google doesn’t use Google Analytics metrics as a ranking factor. Here is an explanation from Matt Cutts, the former head of Google’s web spam team.

And if you think about it, Cutts’ explanation makes sense; because although many people have GA, not everyone uses it.

Assuming your site has been hacked

Another common concern, when people see strange landing pages from spam in their reports, is that their site has been hacked.

[Image: fake landing pages shown in the reports]

The page that the spam shows on the reports doesn’t exist, and if you try to open it, you will get a 404 page. Your site hasn’t been compromised.

Still, you should make sure the page really doesn’t exist, because there are cases (unrelated to spam) where a site suffers a security breach and gets injected with pages full of bad keywords meant to damage the website.

What should you worry about?

Now that we’ve ruled out security issues and effects on rankings, the only thing left to worry about is your data. The fake trail that the spam leaves behind pollutes your reports.

It might have greater or lesser impact depending on your site traffic, but everyone is susceptible to the spam.

Small and midsize sites are the most easily impacted – not only because a big part of their traffic can be spam, but also because usually these sites are self-managed and sometimes don’t have the support of an analyst or a webmaster.

Big sites with a lot of traffic can also be impacted by spam, and although the impact can be insignificant, invalid traffic means inaccurate reports no matter the size of the website. As an analyst, you should be able to explain what’s going on even in the most granular reports.

You only need one filter to deal with ghost spam

Usually it is recommended to add the referral to an exclusion filter after it is spotted. Although this is useful for a quick action against the spam, it has three big disadvantages.

  • Making filters every week for every newly detected spam source is tedious and time-consuming, especially if you manage many sites.
  • By the time you apply a filter and it starts working, you already have some affected data.
  • Some spammers send direct visits along with the referrals. These direct hits won’t be stopped by a referral filter, so even if you exclude the referral you will still be receiving invalid traffic, which explains why some people see an unusual spike in direct traffic.

Luckily, there is a good way to prevent all these problems. Most spam (the ghost kind) works by hitting randomly generated GA tracking IDs, meaning the offender doesn’t really know who the target is; for that reason, the hostname is either not set at all or is fake. (See the report below.)

[Image: hostname report showing ghost spam with fake or unset hostnames]

You can see that they use weird names or don’t even bother to set one. Although there are some familiar names in the list, these can easily be faked by the spammer.

On the other hand, valid traffic will always use a real hostname. In most cases this will be your domain, but it can also come from paid services, translation services, or any other place where you’ve inserted your GA tracking code.

[Image: example of a valid referral with a real hostname]

Based on this, we can make a filter that will include only hits that use real hostnames. This will automatically exclude all hits from ghost spam, whether it shows up as a referral, keyword, or pageview; or even as a direct visit.

To create this filter, you will need to find the report of hostnames. Here’s how:

  1. Go to the Reporting tab in GA
  2. Click on Audience in the lefthand panel
  3. Expand Technology and select Network
  4. At the top of the report, click on Hostname

[Image: list of valid hostnames in the Network report]

You will see a list of all hostnames, including the ones that the spam uses. Make a list of all the valid hostnames you find, as follows:

  • yourmaindomain.com
  • blog.yourmaindomain.com
  • es.yourmaindomain.com
  • payingservice.com
  • translatetool.com
  • anotheruseddomain.com

For small to medium sites, this list of hostnames will likely consist of the main domain and a couple of subdomains. After you are sure you got all of them, create a regular expression similar to this one:

yourmaindomain\.com|anotheruseddomain\.com|payingservice\.com|translatetool\.com

You don’t need to put all of your subdomains in the regular expression. The main domain will match all of them. If you don’t have a view set up without filters, create one now.
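If you want to sanity-check the expression before pasting it into GA, here is a small sketch using Python’s re module. The hostnames are the placeholder examples from the list above, and GA applies filter patterns as partial matches, which is what re.search mimics here.

import re

# The same include pattern you would paste into the GA filter.
valid_hosts = re.compile(
    r"yourmaindomain\.com|anotheruseddomain\.com|payingservice\.com|translatetool\.com"
)

tests = [
    "yourmaindomain.com",        # main domain - kept
    "blog.yourmaindomain.com",   # subdomain - kept, because the pattern is unanchored
    "(not set)",                 # typical ghost hit with no hostname - filtered out
    "spammy-host.example",       # fake hostname - filtered out
]

for host in tests:
    print(host, "->", "keep" if valid_hosts.search(host) else "filter out")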

Then create a Custom Filter.

Make sure you select INCLUDE, then select “Hostname” on the filter field, and copy your expression into the Filter Pattern box.

[Image: custom include filter configured with the hostname expression]

You might want to verify the filter before saving to check that everything is okay. Once you’re ready, save it and apply the filter to all the views you want (except the unfiltered view).

This single filter will get rid of future occurrences of ghost spam that use invalid hostnames, and it doesn’t require much maintenance. But it’s important that every time you add your tracking code to any service, you add it to the end of the filter.

Now you should only need to take care of the crawler spam. Since crawlers access your site, you can block them by adding these lines to the .htaccess file:

## STOP REFERRER SPAM
# Requires Apache's mod_rewrite (RewriteEngine On must be set earlier in the file)
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]
# Return 403 Forbidden for any request carrying one of those referrers
RewriteRule .* - [F]

It is important to note that this file is very sensitive, and misplacing a single character in it can bring down your entire site. Therefore, make sure you create a backup copy of your .htaccess file prior to editing it.

If you don’t feel comfortable messing around with your .htaccess file, you can alternatively build an expression containing all the crawlers and add it to an exclude filter on Campaign Source.
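As a sketch, that exclude expression is just the crawler domains joined with pipes, with the dots escaped — the same style as the hostname expression above. The first two domains are the ones named in the .htaccess example; the third is a placeholder for any new crawler you spot.

crawlers = [
    "semalt.com",
    "buttons-for-website.com",
    "another-crawler.example",  # placeholder for future additions
]

# Escape the dots and join with "|" to get a pattern you can paste into a
# Campaign Source exclude filter (or reuse later in an advanced segment).
pattern = "|".join(domain.replace(".", r"\.") for domain in crawlers)
print(pattern)  # semalt\.com|buttons-for-website\.com|another-crawler\.example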

Implement these combined solutions, and you will worry much less about spam contaminating your analytics data. This will have the added benefit of freeing up more time for you to spend actually analyzing your valid data.

After stopping the spam going forward, you can also get clean reports from your historical data by using the same expressions in an Advanced Segment to exclude all the spam.

Bonus resources to help you manage spam

If you still need more information to help you understand and deal with the spam on your GA reports, you can read my main article on the subject here: http://www.ohow.co/what-is-referrer-spam-how-stop-it-guide/.

Additional information on how to stop spam can be found at these URLs:

In closing, I am eager to hear your ideas on this serious issue. Please share them in the comments below.

(Editor’s Note: All images featured in this post were created by the author.)


The 2015 Online Marketing Industry Survey

Posted by Dr-Pete

It’s been another wild year in search marketing. Mobilegeddon crushed our Twitter streams, but not our dreams, and Matt Cutts stepped out of the spotlight to make way for an uncertain Google future. Pandas and Penguins continue to torment us, but most days, like anyone else, we were just trying to get the job done and earn a living.

This year, over 3,600 brave souls, each one more intelligent and good-looking than the last, completed our survey. While the last survey was technically “2014”, we collected data for it in late 2013, so the 2015 survey reflects about 18 months of industry changes.

A few highlights

Let’s dig in. Almost half (49%) of our 2015 respondents involved in search marketing were in-house marketers. In-house teams still tend to be small – 71% of our in-house marketers reported only 1-3 people in their company being involved in search marketing at least quarter-time. These teams do have substantial influence, though, with 86% reporting that they were involved in purchasing decisions.

Agency search marketers reported larger teams and more diverse responsibilities. More than one-third (36%) of agency marketers in our survey reported working with more than 20 clients in the previous year. Agencies covered a wide range of services, with the top 5 being:

More than four-fifths (81%) of agency respondents reported providing both SEO and SEM services for clients. Please note that respondents could select more than one service/tool/etc., so the charts in this post will not add up to 100%.

The vast majority of respondents (85%) reported being directly involved with content marketing, which was on par with 2014. Nearly two-thirds (66%) of agency content marketers reported “Content for SEO purposes” as their top activity, although “Building Content Strategy” came in a solid second at 44% of respondents.

Top tools

Where do we get such wonderful toys? We marketers love our tools, so let’s take a look at the Top 10 tools across a range of categories. Please note that this survey was conducted here on Moz, and our audience certainly has a pro-Moz slant.

Up first, here are the Top 10 SEO tools in our survey:

Just like last time, Google Webmaster Tools (now “Search Console”) leads the way. Moz Pro and Majestic slipped a little bit, and Firebug fell out of the Top 10. The core players remained fairly stable.

Here are the Top 10 Content tools in our survey:

Even with its uncertain future, Google Alerts continues to be widely used. There are a lot of newcomers to the content tools world, so year-over-year comparisons are tricky. Expect even more players in this market in the coming year.

Following are our respondents’ Top 10 analytics tools:

For an industry that complains about Google so much, we sure do seem to love their stuff. Google Analytics dominates, crushing the enterprise players, at least in the mid-market. KISSmetrics gained solid ground (from the #10 spot last time), while home-brewed tools slipped a bit. CrazyEgg and WordPress Stats remain very popular since our last survey.

Finally, here are the Top 10 social tools used by our respondents:

Facebook Insights and Hootsuite retained the top spots from last year, but newcomer Twitter Analytics rocketed into the #3 position. LinkedIn Insights emerged as a strong contender, too. Overall usage of all social tools increased. Tweetdeck held the #6 spot in 2014 with 19% usage, but dropped to #10 this year even though its usage bumped up slightly to 20%.

Of course, digging into social tools naturally begs the question of which social networks are at the top of our lists.

The Top 6 are unchanged since our last survey, and it’s clear that the barriers to entry to compete with the big social networks are only getting higher. Instagram doubled its usage (from 11% of respondents last time), but this still wasn’t enough to overtake Pinterest. Reddit and Quora saw steady growth, and StumbleUpon slipped out of the Top 10.

Top activities

So, what exactly do we do with these tools and all of our time? Across all online marketers in our survey, the Top 5 activities were:

For in-house marketers, “Site Audits” dropped to the #6 position and “Brand Strategy” jumped up to the #3 spot. Naturally, in-house marketers have more resources to focus on strategy.

For agencies and consultants, “Site Audits” bumped up to #2, and “Managing People” pushed down social media to take the #5 position. Larger agency teams require more traditional people wrangling.

Here’s a much more detailed breakdown of how we spend our time in 2015:

In terms of overall demand for services, the Top 5 winners (calculated by % reporting increased demand minus % reporting decreased demand) were:

Demand for CRO is growing at a steady clip, but analytics still leads the way. Both “Content Creation” (#2) and “Content Curation” (#6) showed solid demand increases.

Some categories reported both gains and losses – 30% of respondents reported increased demand for “Link Building”, while 20% reported decreased demand. Similarly, 20% reported increased demand for “Link Removal”, while almost as many (17%) reported decreased demand. This may be a result of overall demand shifts, or it may represent more specialization by agencies and consultants.

What’s in store for 2016?

It’s clear that our job as online marketers is becoming more diverse, more challenging, and more strategic. We have to have a command of a wide array of tools and tactics, and that’s not going to slow down any time soon. On the bright side, companies are more aware of what we do, and they’re more willing to spend the money to have it done. Our evolution has barely begun as an industry, and you can expect more changes and growth in the coming year.

Raw data download

If you’d like to take a look through the raw results from this year’s survey (we’ve removed identifying information like email addresses from all responses), we’ve got that for you here:

Download the raw results


How Much Has Link Building Changed in Recent Years?

Posted by Paddy_Moogan

I get asked this question a lot. It’s mainly asked by people who are considering buying my link building book and want to know whether it’s still up to date. This is understandable given that the first edition was published in February 2013 and our industry has a deserved reputation for always changing.

I find myself giving the same answer, even though I’ve been asked it probably dozens of times in the last two years—”not that much”. I don’t think this is solely due to the book itself standing the test of time, although I’ll happily take a bit of credit for that 🙂 I think it’s more a sign of our industry as a whole not changing as much as we’d like to think.

I started to question whether I was right, and honestly, it’s one of the reasons it has taken me over two years to release the second edition of the book.

So I posed this question to a group of friends not so long ago, some via email and some via a Facebook group. I was expecting to be called out by many of them because my position was that in reality, it hasn’t actually changed that much. The thing is, many of them agreed and the conversations ended with a pretty long thread with lots of insights. In this post, I’d like to share some of them, share what my position is and talk about what actually has changed.

My personal view

Link building hasn’t changed as much as we think it has.

The core principles of link building haven’t changed. The signals around link building have changed, but mainly around new machine learning developments that have indirectly affected what we do. One thing that has definitely changed is the mindset of SEOs (and now clients) towards link building.

I think the last big change to link building came in April 2012 when Penguin rolled out. This genuinely did change our industry and put to bed a few techniques that should never have worked so well in the first place.

Since then, we’ve seen some things change, but the core principles haven’t changed if you want to build a business that will be around for years to come and not run the risk of being hit by a link related Google update. For me, these principles are quite simple:

  • You need to deserve links – either an asset you create or your product
  • You need to put this asset in front of a relevant audience who have the ability to share it
  • You need consistency – one new asset every year is unlikely to cut it
  • Anything that scales is at risk

For me, the move towards user data driving search results + machine learning has been the biggest change we’ve seen in recent years and it’s still going.

Let’s dive a bit deeper into all of this and I’ll talk about how this relates to link building.

The typical mindset for building links has changed

I think that most SEOs are coming round to the idea that you can’t get away with building low quality links any more, not if you want to build a sustainable, long-term business. Spammy link building still works in the short-term and I think it always will, but it’s much harder than it used to be to sustain websites that are built on spam. The approach is more “churn and burn” and spammers are happy to churn through lots of domains and just make a small profit on each one before moving onto another.

For everyone else, it’s all about the long-term and not putting client websites at risk.

This has led to many SEOs embracing different forms of link building and generally starting to use content as an asset when it comes to attracting links. A big part of me feels that it was actually Penguin in 2012 that drove the rise of content marketing amongst SEOs, but that’s a post for another day…! For today though, this goes some way towards explaining the trend we see below.

Slowly but surely, I’m seeing clients come to my company already knowing that low quality link building isn’t what they want. It’s taken a few years after Penguin for it to filter down to client / business owner level, but it’s definitely happening. This is a good thing but unfortunately, the main reason for this is that most of them have been burnt in the past by SEO companies who have built low quality links without giving thought to building good quality ones too.

I have no doubt that it’s this change in mindset which has led to trends like this:

The thing is, I don’t think this was by choice.

Let’s be honest. A lot of us used the kind of link building tactics that Google no longer like because they worked. I don’t think many SEOs were under the illusion that it was genuinely high quality stuff, but it worked and it was far less risky to do than it is today. Unless you were super-spammy, the low-quality links just worked.

Fast forward to a post-Penguin world, and things are far more risky. For me, it’s because of this that we see trends like the one above. As an industry, we had the easiest link building methods taken away from us and we’re left with fewer options. One of the main options is content marketing which, if you do it right, can lead to good quality links and, importantly, the types of links you won’t be removing in the future. Get it wrong and you’ll lose budget and lose the trust of your boss or client in the power of content when it comes to link building.

There are still plenty of other methods to build links and sometimes we can forget this. Just look at this epic list from Jon Cooper. Even with this many tactics still available to us, it’s hard work. Way harder than it used to be.

My summary here is that as an industry, our mindset has shifted but it certainly wasn’t a voluntary shift. If the tactics that Penguin targeted still worked today, we’d still be using them.

A few other opinions…

I definitely think too many people want the next easy win. As someone surfing the edge of what Google is bringing our way, here’s my general take—SEO, in broad strokes, is changing a lot, *but* any given change is more and more niche and impacts fewer people. What we’re seeing isn’t radical, sweeping changes that impact everyone, but a sort of modularization of SEO, where we each have to be aware of what impacts our given industries, verticals, etc.

Dr. Pete


I don’t feel that techniques for acquiring links have changed that much. You can either earn them through content and outreach or you can just buy them. What has changed is the awareness of “link building” outside of the SEO community. This makes link building / content marketing much harder when pitching to journalists and even more difficult when pitching to bloggers.

Link building has to be more integrated with other channels and struggles to work in its own environment unless supported by brand, PR and social. Having other channels supporting your link development efforts also creates greater search signals and more opportunity to reach a bigger audience which will drive a greater ROI.

Carl Hendy


SEO has grown up in terms of more mature staff and SEOs becoming more ingrained into businesses so there is a smarter (less pressure) approach. At the same time, SEO has become more integrated into marketing and has made marketing teams and decision makers more intelligent in strategies and not pushing for the quick win. I’m also seeing that companies who used to rely on SEO and building links have gone through IPOs and the need to build 1000s of links per quarter has rightly reduced.

Danny Denhard

Signals that surround link building have changed

There is no question about this one in my mind. I actually wrote about this last year in my previous blog post where I talked about signals such as anchor text and deep links changing over time.

Many of the people I asked felt the same, here are some quotes from them, split out by the types of signal.

Domain level link metrics

I think domain level links have become increasingly important compared with page level factors, i.e. you can get a whole site ranking well off the back of one insanely strong page, even with sub-optimal PageRank flow from that page to the rest of the site.

Phil Nottingham

I’d agree with Phil here and this is what I was getting at in my previous post on how I feel “deep links” will matter less over time. It’s not just about domain level links here, it’s just as much about the additional signals available for Google to use (more on that later).

Anchor text

I’ve never liked anchor text as a link signal. I mean, who actually uses exact match commercial keywords as anchor text on the web?

SEOs. 🙂

Sure there will be natural links like this, but honestly, I struggle with the idea that it took Google so long to start turning down the dial on commercial anchor text as a ranking signal. They are starting to turn it down though, slowly but surely. Don’t get me wrong, it still matters and it still works. But like pure link spam, the bar is a lot lower now in terms of what constitutes too much.

Rand feels that they matter more than we’d expect and I’d mostly agree with this statement:

Exact match anchor text links still have more power than you’d expect—I think Google still hasn’t perfectly sorted what is “brand” or “branded query” from generics (i.e. they want to start ranking a new startup like meldhome.com for “Meld” if the site/brand gets popular, but they can’t quite tell the difference between that and https://moz.com/learn/seo/redirection getting a few manipulative links that say “redirect”)

Rand Fishkin

What I do struggle with though, is that Google still haven’t figured this out and that short-term, commercial anchor text spam is still so effective. Even for a short burst of time.

I don’t think link building as a concept has changed loads—but I think links as a signal have, mainly because of filters and penalties but I don’t see anywhere near the same level of impact from coverage anymore, even against 18 months ago.

Paul Rogers

New signals have been introduced

It isn’t just about established signals changing though, there are new signals too and I personally feel that this is where we’ve seen the most change in Google algorithms in recent years—going all the way back to Panda in 2011.

With Panda, we saw a new level of machine learning where it almost felt like Google had found a way of incorporating human reaction / feelings into their algorithms. They could then run this against a website and answer questions like the ones included in this post. Things such as:

  • “Would you be comfortable giving your credit card information to this site?”
  • “Does this article contain insightful analysis or interesting information that is beyond obvious?”
  • “Are the pages produced with great care and attention to detail vs. less attention to detail?”

It is a touch scary that Google was able to run machine learning against answers to questions like this and write an algorithm to predict the answers for any given page on the web. They have though and this was four years ago now.

Since then, they’ve made various moves to utilize machine learning and AI to build out new products and improve their search results. For me, this was one of the biggest changes, and it went pretty unnoticed by our industry. Well, until Hummingbird came along. I feel pretty sure that we have Ray Kurzweil to thank for at least some of that.

There seems to be more weight on theme/topic related to sites, though it’s hard to tell if this is mostly link based or more user/usage data based. Google is doing a good job of ranking sites and pages that don’t earn the most links but do provide the most relevant/best answer. I have a feeling they use some combination of signals to say “people who perform searches like this seem to eventually wind up on this website—let’s rank it.” One of my favorite examples is the Audubon Society ranking for all sorts of birding-related searches with very poor keyword targeting, not great links, etc. I think user behavior patterns are stronger in the algo than they’ve ever been.

– Rand Fishkin

Leading on from what Rand has said, it’s becoming more and more common to see search results that just don’t make sense if you look at the link metrics—but are a good result.

For me, the move towards user data driving search results, plus advances in machine learning, has been the biggest change we’ve seen in recent years, and it’s still going.

Edit: since drafting this post, Tom Anthony released this excellent blog post on his views on the future of search and the shift to data-driven results. I’d recommend reading that as it approaches this whole area from a different perspective and I feel that an off-shoot of what Tom is talking about is the impact on link building.

You may be asking at this point, what does machine learning have to do with link building?

Everything. Because as strong as links are as a ranking signal, Google want more signals and user signals are far, far harder to manipulate than established link signals. Yes it can be done—I’ve seen it happen. There have even been a few public tests done. But it’s very hard to scale and I’d venture a guess that only the top 1% of spammers are capable of doing it, let alone maintaining it for a long period of time. When I think about the process for manipulation here, I actually think we go a step beyond spammers towards hackers and more cut and dry illegal activity.

For link building, this means that traditional methods of manipulating signals are going to become less and less effective as these user signals become stronger. For us as link builders, it means we can’t keep searching for that silver bullet or the next method of scaling link building just for an easy win. The fact is that scalable link building is always going to be at risk from penalization from Google—I don’t really want to live a life where I’m always worried about my clients being hit by the next update. Even if Google doesn’t catch up with a certain method, machine learning and user data mean that these methods may naturally become less effective and cost efficient over time.

There are of course other things such as social signals that have come into play. I certainly don’t feel like these are a strong ranking factor yet, but with deals like this one between Google and Twitter being signed, I wouldn’t be surprised if that ever-growing dataset is used at some point in organic results. The one advantage that Twitter has over Google is its breaking-news freshness. Twitter is still way quicker at breaking news than Google is—140 characters in a tweet is far quicker than Google News! Google know this, which is why I feel they’ve pulled this partnership back into existence after a couple of years apart.

There is another important point to remember here and it’s nicely summarised by Dr. Pete:

At the same time, as new signals are introduced, these are layers, not replacements. People hear social signals or user signals or authorship and want it to be the link-killer, because they already fucked up link-building, but these are just layers on top of on-page and links and all of the other layers. As each layer is added, it can verify the layers that came before it, and what you need isn’t the magic signal but a combination of signals that generally matches what Google expects to see from real, strong entities. So, links still matter, but they matter in concert with other things, which basically means it’s getting more complicated and, frankly, a bit harder. Of course, no one wants to hear that.

– Dr. Pete

The core principles have not changed

This is the crux of everything for me. With all the changes listed above, the key is that the core principles around link building haven’t changed. I could even argue that Penguin didn’t change the core principles because the techniques that Penguin targeted should never have worked in the first place. I won’t argue this too much though because even Google advised website owners to build directory links at one time.

You need an asset

You need to give someone a reason to link to you. Many won’t do it out of the goodness of their heart! One of the most effective ways to do this is to develop a content asset and use this as your reason to make people care. Once you’ve made someone care, they’re more likely to share the content or link to it from somewhere.

You need to promote that asset to the right audience

I really dislike the stance that some marketers take when it comes to content promotion—build great content and links will come.

No. Sorry but for the vast majority of us, that’s simply not true. The exceptions are people that sky dive from space or have huge existing audiences to leverage.

You simply have to spend time promoting your content or your asset for it to get shares and links. It is hard work, and sometimes you can spend a long time on it and get little return, but it’s important to keep working at it until you’re at a point where you have two things:

  • A big enough audience where you can almost guarantee at least some traffic to your new content along with some shares
  • Enough strong relationships with relevant websites who you can speak to when new content is published and stand a good chance of them linking to it

Getting to this point is hard—but that’s kind of the point. There are various hacks you can use along the way but it will take time to get right.

You need consistency

Leading on from the previous point. It takes time and hard work to get links to your content—the types of links that stand the test of time and you’re not going to be removing in 12 months time anyway! This means that you need to keep pushing content out and getting better each and every time. This isn’t to say you should just churn content out for the sake of it, far from it. I am saying that with each piece of content you create, you will learn to do at least one thing better the next time. Try to give yourself the leverage to do this.

Anything scalable is at risk

Scalable link building is exactly what Google has been trying to crack down on for the last few years. Penguin was the biggest move and hit some of the most scalable tactics we had at our disposal. When you scale something, you often lose some level of quality, which is exactly what Google doesn’t want when it comes to links. If you’re still relying on tactics that could fall into the scalable category, I think you need to be very careful and just look at the trend in the types of links Google has been penalizing to understand why.

The part Google plays in this

To finish up, I want to briefly talk about the part that Google plays in all of this and shaping the future they want for the web.

I’ve always tried to steer clear of arguments involving the idea that Google is actively pushing FUD into the community. I’ve preferred to concentrate more on things I can actually influence and change with my clients rather than what Google is telling us all to do.

However, for the purposes of this post, I want to talk about it.

General paranoia has increased. My bet is there are some companies out there carrying out zero specific linkbuilding activity through worry.

Dan Barker

Dan’s point is a very fair one and just a day or two after reading this in an email, I came across a page related to a client’s target audience that said:

“We are not publishing guest posts on SITE NAME any more. All previous guest posts are now deleted. For more information, see www.mattcutts.com/blog/guest-blogging/“.

I’ve reworded this as to not reveal the name of the site, but you get the point.

This is silly. Honestly, so silly. They are a good site, publish good content, and had good editorial standards. Yet they have ignored all of their own policies, hard work, and objectives to follow a blog post from Matt. I’m 100% confident that it wasn’t sites like this one that Matt was talking about in this blog post.

This is, of course, from the publishers’ angle rather than the link builders’ angle, but it does go to show the effect that statements from Google can have. Google know this so it does make sense for them to push out messages that make their jobs easier and suit their own objectives—why wouldn’t they? In a similar way, what did they do when they were struggling to classify at scale which links are bad vs. good and they didn’t have a big enough web spam team? They got us to do it for them 🙂

I’m mostly joking here, but you see the point.

The most recent infamous Mobilegeddon update, discussed here by Dr. Pete, is another example of Google pushing out messages that ultimately scared a lot of people into action. Although, to be fair, I think that despite the apparently small impact so far, the broad message from Google is a very serious one.

Because of this, I think we need to remember that Google does have their own agenda and many shareholders to keep happy. I’m not in the camp of believing everything that Google puts out is FUD, but I’m much more sensitive and questioning of the messages now than I’ve ever been.

What do you think? I’d love to hear your feedback and thoughts in the comments.


I Can’t Drive 155: Meta Descriptions in 2015

Posted by Dr-Pete

For years now, we (and many others) have been recommending keeping your Meta Descriptions shorter than about 155-160 characters. For months, people have been sending me examples of search snippets that clearly broke that rule, like this one (on a search for “hummingbird food”):

For the record, this one clocks in at 317 characters (counting spaces). So, I set out to discover if these long descriptions were exceptions to the rule, or if we need to change the rules. I collected the search snippets across the MozCast 10K, which resulted in 92,669 snippets. All of the data in this post was collected on April 13, 2015.

The Basic Data

The minimum snippet length was zero characters. There were 69 zero-length snippets, but most of these were the new generation of answer box, that appears organic but doesn’t have a snippet. To put it another way, these were misidentified as organic by my code. The other 0-length snippets were local one-boxes that appeared as organic but had no snippet, such as this one for “chichen itza”:

These zero-length snippets were removed from further analysis, but considering that they only accounted for 0.07% of the total data, they didn’t really impact the conclusions either way. The shortest legitimate, non-zero snippet was 7 characters long, on a search for “geek and sundry”, and appears to have come directly from the site’s meta description:

The maximum snippet length that day (this is a highly dynamic situation) was 372 characters. The winner appeared on a search for “benefits of apple cider vinegar”:

The average length of all of the snippets in our data set (not counting zero-length snippets) was 143.5 characters, and the median length was 152 characters. Of course, this can be misleading, since some snippets are shorter than the limit and others are being artificially truncated by Google. So, let’s dig a bit deeper.

The Bigger Picture

To get a better idea of the big picture, let’s take a look at the display length of all 92,600 snippets (with non-zero length), split into 20-character buckets (0-20, 21-40, etc.):

Most of the snippets (62.1%) cut off as expected, right in the 141-160 character bucket. Of course, some snippets were shorter than that, and didn’t need to be cut off, and some broke the rules. About 1% (1,010) of the snippets in our data set measured 200 or more characters. That’s not a huge number, but it’s enough to take seriously.

That 141-160 character bucket is dwarfing everything else, so let’s zoom in a bit on the cut-off range, and just look at snippets in the 120-200 character range (in this case, by 5-character bins):

Zooming in, the bulk of the snippets are displaying at lengths between about 146-165 characters. There are plenty of exceptions to the 155-160 character guideline, but for the most part, they do seem to be exceptions.

Finally, let’s zoom in on the rule-breakers. This is the distribution of snippets displaying 191+ characters, bucketed in 10-character bins (191-200, 201-210, etc.):

Please note that the Y-axis scale is much smaller than in the previous 2 graphs, but there is a pretty solid spread, with a decent chunk of snippets displaying more than 300 characters.

Without looking at every original meta description tag, it’s very difficult to tell exactly how many snippets have been truncated by Google, but we do have a proxy. Snippets that have been truncated end in an ellipsis (…), which rarely appears at the end of a natural description. In this data set, more than half of all snippets (52.8%) ended in an ellipsis, so we’re still seeing a lot of meta descriptions being cut off.
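To give a flavor of how this kind of tally works, here is a rough sketch in Python. The snippets list is a stand-in for whatever SERP snippets you have collected yourself; it is not the MozCast data set.

from collections import Counter

snippets = [
    "A short snippet.",
    "A much longer snippet that Google decided to truncate somewhere around here ...",
    "Another snippet that fits comfortably within the usual display length.",
]  # placeholder data - substitute your own collected snippets

lengths = [len(s) for s in snippets if len(s) > 0]             # drop zero-length snippets
buckets = Counter((n - 1) // 20 * 20 + 1 for n in lengths)     # 1-20, 21-40, 41-60, ...
truncated = sum(1 for s in snippets if s.rstrip().endswith(("...", "…")))

print("bucket counts:", dict(sorted(buckets.items())))
print("likely truncated: %.1f%%" % (100.0 * truncated / len(lengths)))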

I should add that, unlike titles/headlines, it isn’t clear whether Google is cutting off snippets by pixel width or character count, since that cut-off is done on the server-side. In most cases, Google will cut before the end of the second line, but sometimes they cut well before this, which could suggest a character-based limit. They also cut off at whole words, which can make the numbers a bit tougher to interpret.

The Cutting Room Floor

There’s another difficulty with telling exactly how many meta descriptions Google has modified – some edits are minor, and some are major. One minor edit is when Google adds some additional information to a snippet, such as a date at the beginning. Here’s an example (from a search for “chicken pox”):

With the date (and minus the ellipsis), this snippet is 164 characters long, which suggests Google isn’t counting the added text against the length limit. What’s interesting is that the rest comes directly from the meta description on the site, except that the site’s description starts with “Chickenpox.” and Google has removed that keyword. As a human, I’d say this matches the meta description, but a bot has a very hard time telling a minor edit from a complete rewrite.

Another minor rewrite occurs in snippets that start with search result counts:

Here, we’re at 172 characters (with spaces and minus the ellipsis), and Google has even let this snippet roll over to a third line. So, again, it seems like the added information at the beginning isn’t counting against the length limit.

All told, 11.6% of the snippets in our data set had some kind of Google-generated data, so this type of minor rewrite is pretty common. Even if Google honors most of your meta description, you may see small edits.

Let’s look at our big winner, the 372-character description. Here’s what we saw in the snippet:

Jan 26, 2015 – Health• Diabetes Prevention: Multiple studies have shown a correlation between apple cider vinegar and lower blood sugar levels. … • Weight Loss: Consuming apple cider vinegar can help you feel more full, which can help you eat less. … • Lower Cholesterol: … • Detox: … • Digestive Aid: … • Itchy or Sunburned Skin: … • Energy Boost:1 more items

So, what about the meta description? Here’s what we actually see in the tag:

Were you aware of all the uses of apple cider vinegar? From cleansing to healing, to preventing diabetes, ACV is a pantry staple you need in your home.

That’s a bit more than just a couple of edits. So, what’s happening here? Well, there’s a clue on that same page, where we see yet another rule-breaking snippet:

You might be wondering why this snippet is any more interesting than the other one. If you could see the top of the SERP, you’d know why, because it looks something like this:

Google is automatically extracting list-style data from these pages to fuel the expansion of the Knowledge Graph. In one case, that data is replacing a snippet and going directly into an answer box, but they’re performing the same translation even for some other snippets on the page.

So, does every 2nd-generation answer box yield long snippets? After 3 hours of inadvisable mySQL queries, I can tell you that the answer is a resounding “probably not”. You can have 2nd-gen answer boxes without long snippets and you can have long snippets without 2nd-gen answer boxes, but there does appear to be a connection between long snippets and the Knowledge Graph in some cases.

One interesting connection is that Google has begun bolding keywords that seem like answers to the query (and not just synonyms for the query). Below is an example from a search for “mono symptoms”. There’s an answer box for this query, but the snippet below is not from the site in the answer box:

Notice the bolded words – “fatigue”, “sore throat”, “fever”, “headache”, “rash”. These aren’t synonyms for the search phrase; these are actual symptoms of mono. This data isn’t coming from the meta description, but from a bulleted list on the target page. Again, it appears that Google is trying to use the snippet to answer a question, and has gone well beyond just matching keywords.

Just for fun, let’s look at one more, where there’s no clear connection to the Knowledge Graph. Here’s a snippet from a search for “sons of anarchy season 4”:

This page has no answer box, and the information extracted is odd at best. The snippet bears little or no resemblance to the site’s meta description. The number string at the beginning comes out of a rating widget, and some of the text isn’t even clearly available on the page. This seems to be an example of Google acknowledging IMDb as a high-authority site and desperately trying to match any text they can to the query, resulting in a Frankenstein’s snippet.

The Final Verdict

If all of this seems confusing, that’s probably because it is. Google is taking a lot more liberties with snippets these days, whether to better match queries, add details they feel are important, or help build and support the Knowledge Graph.

So, let’s get back to the original question – is it time to revise the 155(ish) character guideline? My gut feeling is: not yet. To begin with, the vast majority of snippets are still falling in that 145-165 character range. In addition, the exceptions to the rule are not only atypical situations, but in most cases those long snippets don’t seem to represent the original meta description. In other words, even if Google does grant you extra characters, they probably won’t be the extra characters you asked for in the first place.

Many people have asked: “How do I make sure that Google shows my meta description as is?” I’m afraid the answer is: “You don’t.” If this is very important to you, I would recommend keeping your description below the 155-character limit, and making sure that it’s a good match to your target keyword concepts. I suspect Google is going to take more liberties with snippets over time, and we’re going to have to let go of our obsession with having total control over the SERPs.
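If you want to audit your own pages against that guideline, here is a minimal sketch. The URL list is a placeholder, the 155-character threshold comes from the recommendation above, and it assumes the requests and beautifulsoup4 packages are installed.

import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

LIMIT = 155  # character guideline discussed above
urls = ["https://www.example.com/"]  # placeholder - your own pages

for url in urls:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("meta", attrs={"name": "description"})
    description = tag.get("content", "").strip() if tag else ""
    status = "OK" if len(description) <= LIMIT else "over %d characters" % LIMIT
    print("%s: %d chars (%s)" % (url, len(description), status))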


Headline Writing and Title Tag SEO in a Clickbait World – Whiteboard Friday

Posted by randfish

When writing headlines and title tags, we’re often conflicted in what we’re trying to say and (more to the point) how we’re trying to say it. Do we want it to help the page rank in SERPs? Do we want people to be intrigued enough to click through? Or are we trying to best satisfy the searcher’s intent? We’d like all three, but a headline that achieves them all is incredibly difficult to write.

In today’s Whiteboard Friday, Rand illustrates just how small the intersection of those goals is, and offers a process you can use to find the best way forward.

For reference, here’s a still of this week’s whiteboard!

Video transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re going to chat about writing titles and headlines, both for SEO and in this new click-bait, Facebook social world. This is kind of a challenge, because I think many folks are seeing and observing that a lot of the ranking signals that can help a page perform well are often preceded or well correlated with social activity, which would kind of bias us towards saying, “Hey, how can I do these click-baity, link-baity sorts of social viral pieces,” versus we’re also challenged with, “Gosh, those things aren’t as traditionally well performing in search results from perhaps a click-through rate and certainly from a search conversion perspective. So how do we balance out these two and make them work together for us based on our marketing goals?” So I want to try and help with that.

Let’s look at a search query for Viking battles, in Google. These are the top two results. One is from Wikipedia. It’s a category page — Battles Involving the Vikings. That’s pretty darn straightforward. But then our second result — actually this might be a third result, I think there’s an indented second Wikipedia result — is the seven most bad ass last stands in the history of battles. It turns out that there happen to be a number of Viking related battles in there, and you can see that in the meta description that Google pulls. This one’s from Cracked.com.

These are pretty representative of the two different kinds of results or of content pieces that I’m talking about. One is very, very viral, very social focused, clearly designed to sort of do well in the Facebook world. One is much more classic search focused, clearly designed to help answer the user query — here’s a list of Viking battles and their prominence and importance in history, and structure, and all those kinds of things.

Okay. Here’s another query — Viking jewelry. Going to stick with my Viking theme, because why not? We can see a website for Viking jewelry. This one’s on JellDragon.com. It’s an eCommerce site. They’re selling sterling silver and bronze Viking jewelry. They’ve actually taken a very classic SEO focus. Not only do they have Viking jewelry mentioned twice; in the second instance of Viking jewelry, I think they’ve intentionally — I hope it was intentionally — misspelled the word “jewelry” to hopefully catch misspellings. That’s some old-school SEO. I would actually not recommend this for any purpose.

But I thought it was interesting to highlight versus in this search result it takes until page three until I could really find a viral, social, targeted, more link-baity, click-baity type of article, this one from io9 — 1,000 Year-old Viking Jewelry Found On Danish Farm. You know what the interesting part is? In this case, both of these are on powerful domains. They both have quite a few links to them from many external sources. They’re pretty well SEO’d pages.

In this case, the first two pages of results are all kind of small jewelry website stores and a few results from like Etsy and Amazon, more powerful authoritative domains. But it really takes a long time before you get these, what I’d consider, very powerful, very strong attempts at ranking for Viking jewelry from more of your click-bait, social, headline, viral sites. io9 certainly, I would kind of expect them to perform higher, except that this doesn’t serve the searcher intent.

I think Google knows that when people look for Viking jewelry, they’re not looking for the history of Viking jewelry or where recent archeological finds of Viking jewelry happened. They’re looking specifically for eCommerce sites. They’re trying to transact and buy, or at least view and see what Viking jewelry looks like. So they’re looking for photo heavy, visual heavy, potentially places where they might buy stuff. Maybe it’s some people looking for artifacts as well, to view the images of those, but less of the click-bait focus kind of stuff.

This one I think it’s very likely that this does indeed perform well for this search query, and lots of people do click on that as a positive result for what they’re looking for from Viking battles, because they’d like to see, “Okay, what were the coolest, most amazing Viking battles that happened in history?”

You can kind of see what’s happened here with two things. One is with Hummingbird and Google’s focus on topic modeling, and the other with searcher intent and how Google has gotten so incredibly good at pattern matching to serve user intent. This is really important from an SEO perspective to understand as well, and I like how these two examples highlight it. One is saying, “Hey, just because you have the most links, the strongest domain, the best keyword targeting, doesn’t necessarily mean you’ll rank if you’re not serving searcher intent.”

Now, when we think about doing this for ourselves, that click-bait versus searched optimized experience for our content, what is it about? It’s really about choosing. It’s about choosing searcher intent, our website and marketing goals, or click-bait types of goals. I’ve visualized the intersection here with a Venn diagram. So these in pink here, the click-bait pieces that are going to resonate in social media — Facebook, Twitter, etc. Blue is the intent of searchers, and purple is your marketing goals, what you want to achieve when visitors get to your site, the reason you’re trying to attract this traffic in the first place.

This intersection, as you will notice, is super, uber tiny. It is miniscule. It is molecule sized, and it’s a very, very hard intersection to hit. In fact, for the vast majority of content pieces, I’m going to say that it’s going to be close to, not always, but close to impossible to get that perfect mix of click-bait, intent of searchers, and your marketing goals. The times when it works best is really when you’re trying to educate your audience or provide them with informational value, and that’s also something that’s going to resonate in the social web and something searchers are going to be looking for. It works pretty well in B2B types of things, particularly in spaces where there’s lots of influencers and amplifiers who also care about educating their followers. It doesn’t work so well when you’re trying to target Viking battles or Viking jewelry. What can I say, the historians of the Viking world simply aren’t that huge on Twitter yet. I hope they will be one day.

This is kind of the process that I would use to think about the structure of these and how to choose between them. First off, I think you need to ask, “Should I create a single piece of content to target all of these, or should I instead be thinking about individual pieces that hit one or two at a time?”

So it could be the case that maybe you’ve got an intersection of intent for searchers and your marketing goals. This happens quite a bit, and oftentimes for these folks, for the Jell Dragon Viking Jewelry, the intent of searchers and what they’re trying to accomplish on their site, perfectly in harmony, but definitely not with click-bait pieces that are going to resonate on the web. More challenging for io9 with this kind of a thing, because searchers just aren’t looking for that around Viking jewelry. They might instead be thinking about, “Hey, we’re trying to target the specific news item. We want anyone who looks for Viking jewelry in Danish farm, or Viking jewelry found, or those kind of things to be finding our site.”

Then, I would ask, “How can I best serve my own marketing goals, the marketing goals of my website, through the pages that are targeted at search or social?” Sometimes that’s going to be very direct, like it is over here with JellDagon.com trying to convert folks who are looking for Viking jewelry to buy.

Sometimes it’s going to be indirect. A Moz Whiteboard Friday, for example, is a very indirect example. We’re trying to serve the intent of searchers, and in the long term, eventually, some folks who watch this video might be interested in Moz’s tools or going to MozCon or signing up for an email list, or whatever it is. But our marketing goals are secondary and they’re further in the future. You could also think about that happening at the very end of a funnel: coming in if someone searches for, say, Moz versus Searchmetrics, and maybe Searchmetrics has a great page comparing what’s better about their service versus Moz’s service and those types of things, and getting right in at the end of the funnel. So that should be a consideration as well. Same thing with social.

Then lastly, where are you going to focus your keyword targeting and your content efforts? What kind of content are you going to build? How are you going to keyword-target it best to achieve this, and how much will you interlink between those pages?

I’ll give you a quick example over here, but this can be expanded upon. So for my conversion page, I may try to target the same keywords or a slightly more commercial variation on the search terms I’m targeting with my more informational-style content versus my entertainment/social-style content. Then the conversion page might be separate, depending on how I’m structuring things and what the intent of searchers is. My click-bait piece may not be very keyword-focused at all. I might write that headline and say, “I don’t care about the keywords at all. I don’t need to rank here. I’m trying to go viral on social media. I’m trying to achieve my click-bait goals. My goal is to drive traffic, get some links, get some topical authority around this subject matter, and later hopefully rank with this page or maybe even this page in search engines.” That’s a viable goal as well.

When you do that, what you want to do then is have a link structure that optimizes around this. So with your click-bait piece: a lot of times click-bait pieces are going to perform worse if you go over and try to link directly to your conversion page, because it looks like you’re trying to sell people something. That’s not what plays on Facebook, on Twitter, on social media in general. What plays is, “Hey, this is just entertainment, and I can just visit this piece and it’s fun and funny and interesting.”

What plays well in search, however, is something that lets someone accomplish their task. So it’s fine to have information and then a call to action, and that call to action can point to the conversion page. The click-bait piece’s content can do a great job of helping to send link equity, ranking signals, and maybe some visitors who are interested in truly learning more, over to the informational page that you want ranking in search. This is kind of a beautiful way to think about the interaction between the three of these when you have these different areas of focus, when you have these different searcher versus click-bait intents, and how to bring them all together.

All right everyone, hope to see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com



12 Common Reasons Reconsideration Requests Fail

Posted by Modestos

There are several reasons a reconsideration request might fail. But some of the most common mistakes site owners and inexperienced SEOs make when trying to lift a link-related Google penalty are entirely avoidable. 

Here’s a list of the top 12 most common mistakes made when submitting reconsideration requests, and how you can prevent them.

1. Insufficient link data

This is one of the most common reasons why reconsideration requests fail. This mistake is readily evident each time a reconsideration request gets rejected and the example URLs provided by Google are unknown to the webmaster. Relying only on Webmaster Tools data isn’t enough, as Google has repeatedly said. You need to combine data from as many different sources as possible. 

A good starting point is to collate backlink data, at the very least:

  • Google Webmaster Tools (both latest and sample links)
  • Bing Webmaster Tools
  • Majestic SEO (Fresh Index)
  • Ahrefs
  • Open Site Explorer

If you use any toxic link-detection services (e.g., Linkrisk and Link Detox), then you need to take a few precautions to ensure the following:

  • They are 100% transparent about their backlink data sources
  • They have imported all backlink data
  • You can upload your own backlink data (e.g., Webmaster Tools) without any limitations

If you work on large websites that have tons of backlinks, most of these automated services will very likely process just a fraction of the links, unless you pay for one of their premium packages. If you have direct access to the above data sources, it’s worthwhile to download all the backlink data and then manually upload it into your tool of choice for processing. This is the only way to have full visibility over the backlink data that has to be analyzed and reviewed later. Starting with an incomplete data set at this early (yet crucial) stage could seriously hinder the outcome of your reconsideration request.
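As a rough illustration, here is a minimal Python sketch of that collation step. It assumes each tool’s export is a CSV file; the file names and column headers shown are placeholders and will vary by tool and export version, so adjust them to match your actual downloads.

```python
import csv
from urllib.parse import urlparse

# Hypothetical export file names and URL columns -- adjust to your actual exports.
EXPORTS = {
    "gwt_links.csv": "URL",           # Google Webmaster Tools
    "bing_links.csv": "Source Url",   # Bing Webmaster Tools
    "majestic_fresh.csv": "SourceURL",
    "ahrefs_backlinks.csv": "Referring Page URL",
    "ose_links.csv": "URL",           # Open Site Explorer
}

def load_links(path, column):
    """Read one export and return the set of linking URLs it contains."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row[column].strip() for row in csv.DictReader(f) if row.get(column)}

all_links = set()
for path, column in EXPORTS.items():
    try:
        all_links |= load_links(path, column)
    except FileNotFoundError:
        print(f"Skipping missing export: {path}")

# Deduplicate everything into a single master list for manual review.
with open("master_backlinks.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["linking_url", "linking_domain"])
    for url in sorted(all_links):
        writer.writerow([url, urlparse(url).netloc])

print(f"{len(all_links)} unique linking URLs collected.")
```

The point of the dedupe is simply that every later step, from manual review to outreach, works from one complete list rather than five overlapping ones.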

2. Missing vital legacy information

The more you know about a site’s history and past activities, the better. You need to find out (a) which pages were targeted in the past as part of link building campaigns, (b) which keywords were the primary focus and (c) the link building tactics that were scaled (or abused) most frequently. Knowing enough about a site’s past activities, before it was penalized, can help you home in on the actual causes of the penalty. Also, collect as much information as possible from the site owners.

3. Misjudgement

Misreading your current situation can lead to wrong decisions. One common mistake is to treat the example URLs provided by Google as gospel and try to identify only links with the same patterns. Google provides a very small number of examples of unnatural links. Often, these examples are the most obvious and straightforward ones. However, you should look beyond these examples to fully address the issues and take the necessary actions against all types of unnatural links. 

Google is very clear on the matter: “Please correct or remove all inorganic links, not limited to the samples provided above.”

Another common area of bad judgement is the inability to correctly identify unnatural links. This is a skill that requires years of experience in link auditing, as well as link building. Removing the wrong links won’t lift the penalty, and may also result in further ranking drops and loss of traffic. You must remove the right links.


4. Blind reliance on tools

There are numerous unnatural link-detection tools available on the market, and over the years I’ve had the chance to try out most (if not all) of them. Because (and without any exception) I’ve found them all very ineffective and inaccurate, I do not rely on any such tools for my day-to-day work. In some cases, a lot of the reported “high risk” links were 100% natural links, and in others, numerous toxic links were completely missed. If you have to manually review all the links to discover the unnatural ones, ensuring you don’t accidentally remove any natural ones, it makes no sense to pay for tools. 

If you solely rely on automated tools to identify the unnatural links, you will need a miracle for your reconsideration request to be successful. The only tool you really need is a powerful backlink crawler that can accurately report the current link status of each URL you have collected. You should then manually review all currently active links and decide which ones to remove. 
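If you want to script that status check yourself, here is a minimal sketch using the requests and BeautifulSoup libraries. The target domain is a placeholder, and a real audit would also need politeness delays, retries, and handling for pages that render links with JavaScript.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse

TARGET_DOMAIN = "example.com"  # placeholder for the penalized site

def check_link_status(page_url, target_domain=TARGET_DOMAIN):
    """Return 'live', 'nofollow', 'removed', or 'unreachable' for one linking page."""
    try:
        resp = requests.get(page_url, timeout=10,
                            headers={"User-Agent": "link-audit-sketch/0.1"})
    except requests.RequestException:
        return "unreachable"
    if resp.status_code != 200:
        return "unreachable"

    soup = BeautifulSoup(resp.text, "html.parser")
    found, followed = False, False
    for a in soup.find_all("a", href=True):
        if target_domain in urlparse(a["href"]).netloc:
            found = True
            rels = [r.lower() for r in a.get("rel", [])]
            if "nofollow" not in rels:
                followed = True
    if not found:
        return "removed"
    return "live" if followed else "nofollow"

# Example: status = check_link_status("http://some-linking-page.example/post")
```

Running this over the master list tells you which links are already gone, which have been nofollowed, and which still need outreach or disavowal.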

I could write an entire book on the numerous flaws and bugs I have come across each time I’ve tried some of the most popular link auditing tools. A lot of these issues can be detrimental to the outcome of the reconsideration request, and I have seen many reconsideration requests fail because of them. If Google cannot algorithmically identify all unnatural links and must operate entire teams of humans to review sites (and their links), you shouldn’t trust a $99/month service to identify the unnatural links.

If you have an in-depth understanding of Google’s link schemes, you can build your own process to prioritize which links are more likely to be unnatural, as I described in this post (see sections 7 & 8). In an ideal world, you should manually review every single link pointing to your site. Where this isn’t possible (e.g., when dealing with an enormous number of links, or when resources are unavailable), you should at least focus on the links with the most “unnatural” signals and manually review them.
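To make that prioritization concrete, here is an illustrative sketch. The fields, signals, weights, and anchor-text list are my own assumptions for demonstration, not Google’s criteria; the only goal is to order the manual review queue so the riskiest-looking links are inspected first.

```python
# Illustrative-only risk signals; thresholds and weights are assumptions.
COMMERCIAL_ANCHORS = {"cheap viking jewelry", "buy jewelry online"}  # hypothetical money terms

def unnatural_score(link):
    """Score one backlink record (a dict) so the riskiest links are reviewed first."""
    score = 0
    anchor = link.get("anchor_text", "").lower()
    if anchor in COMMERCIAL_ANCHORS:
        score += 3                              # exact-match commercial anchor text
    if link.get("sitewide"):
        score += 2                              # sitewide/footer placements
    if link.get("links_from_domain", 1) > 50:
        score += 2                              # one domain linking hundreds of times
    if link.get("is_directory") or link.get("is_article_network"):
        score += 2                              # known low-quality link sources
    return score

def prioritize(links):
    """Return the collected links sorted from highest to lowest risk."""
    return sorted(links, key=unnatural_score, reverse=True)
```

The scoring never replaces the manual review; it only decides which links a human looks at first.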

5. Not looking beyond direct links

When trying to lift a link-related penalty, you need to look into all the links that may be pointing to your site directly or indirectly. Such checks include reviewing all links pointing to other sites that have been redirected to your site, legacy URLs with external inbound links that have been internally redirected, and third-party sites that include cross-domain canonicals to your site. For sites that used to buy and redirect domains in order to increase their rankings, the quickest solution is to get rid of the redirects. Both Majestic SEO and Ahrefs report redirects, but some manual digging usually reveals a lot more.
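A quick way to spot-check the redirect side of this is to resolve each legacy or purchased domain and see where it ends up. This sketch assumes you already have a list of candidate domains, and example.com stands in for your site.

```python
import requests
from urllib.parse import urlparse

MY_DOMAIN = "example.com"  # placeholder for the penalized site

def redirects_to_me(domain):
    """Follow redirects from a legacy/purchased domain and report where it lands."""
    try:
        resp = requests.get(f"http://{domain}", timeout=10, allow_redirects=True)
    except requests.RequestException:
        return None
    final_host = urlparse(resp.url).netloc
    return final_host.endswith(MY_DOMAIN), resp.url

# Example usage with hypothetical domains:
# for d in ["old-microsite.example", "bought-domain.example"]:
#     print(d, redirects_to_me(d))
```

Any domain that resolves to your site is inheriting that domain’s backlinks, so its link profile needs to be audited too, or the redirect removed.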


6. Not looking beyond the first link

All major link intelligence tools, including Majestic SEO, Ahrefs and Open Site Explorer, report only the first link pointing to a given site when crawling a page. This means that if you rely heavily on automated tools to identify links with commercial keywords, they will only take into consideration the first link they discover on each page. If a page on the web links to your site just once, this is not a big deal. But if there are multiple links, the tools will miss all but the first one.

For example, if a page has five different links pointing to your site, and the first one includes branded anchor text, these tools will report just that first link. Most of the link-auditing tools will in turn evaluate the link as “natural” and completely miss the other four links, some of which may contain manipulative anchor text. The more links that get missed this way, the more likely it is that your reconsideration request will fail.
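Building on the earlier status-check idea, a short sketch like this reports every anchor on a page that points at your domain, along with its anchor text and nofollow status, so the second, third and fourth links aren’t silently skipped. As before, requests and BeautifulSoup are assumed and the URLs are placeholders.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse

def all_anchors_to(page_url, target_domain):
    """Return (anchor_text, href, is_nofollow) for EVERY link on the page pointing
    at target_domain, not just the first one a backlink tool would report."""
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for a in soup.find_all("a", href=True):
        if target_domain in urlparse(a["href"]).netloc:
            rels = [r.lower() for r in a.get("rel", [])]
            results.append((a.get_text(strip=True), a["href"], "nofollow" in rels))
    return results

# A page with five links to your site returns five tuples here, so manipulative
# anchor text hiding behind a branded first link is no longer missed.
# Example: all_anchors_to("http://some-linking-page.example/post", "example.com")
```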

7. Going too thin

Many SEOs and webmasters (still) feel uncomfortable with the idea of losing links. They cannot accept that links which once helped their rankings are now being devalued and must be removed. There is no point trying to save “authoritative” but unnatural links out of fear of losing rankings. If the main objective is to lift the penalty, then all unnatural links need to be removed.

Often, in the first reconsideration request, SEOs and site owners tend to go too thin, and in the subsequent attempts start cutting deeper. If you are already aware of the unnatural links pointing to your site, try to get rid of them from the very beginning. I have seen examples of unnatural links provided by Google on PR 9/DA 98 sites. Metrics do not matter when it comes to lifting a penalty. If a link is manipulative, it has to go.

In any case, Google’s decision won’t be based only on the number of links that have been removed. Most important in the search giant’s eyes is the quality of the links still pointing to your site. If the remaining links are largely of low quality, the reconsideration request will almost certainly fail.

8. Insufficient effort to remove links

Google wants to see a “good faith” effort to get as many links removed as possible. The higher the percentage of unnatural links removed, the better. Some agencies and SEO consultants tend to rely too much on the disavow tool. However, it isn’t a panacea, and it should be used as a last resort, reserved for links that are impossible to remove after you have exhausted all possibilities of physically removing them via the time-consuming (yet necessary) outreach route.

Google is very clear on this:

[Screenshot of Google’s guidance on link removal]

Even if you’re unable to remove all of the links that need to be removed, you must be able to demonstrate that you’ve made several attempts to have them removed, which can have a favorable impact on the outcome of the reconsideration request. Yes, in some cases it might be possible to have a penalty lifted simply by disavowing the links instead of removing them, but these cases are rare and the strategy may backfire in the future. When I reached out to ex-Googler Fili Wiese for advice on the value of removing toxic links (instead of just disavowing them), his response was very straightforward:

[Screenshot of Fili Wiese’s response]

9. Ineffective outreach

Simply identifying the unnatural links won’t get the penalty lifted unless a decent percentage of the links have been successfully removed. The more communication channels you try, the more likely it is that you reach the webmaster and get the links removed. Sending the same email hundreds or thousands of times is highly unlikely to result in a decent response rate. Trying to remove a link from a directory is very different from trying to get rid of a link appearing in a press release, so you should take a more targeted approach with a well-crafted, personalized email. Link removal request emails must be honest and to the point, or else they’ll be ignored.

Tracking the emails will also help you figure out which messages have been read, which webmasters might be worth contacting again, and when you need to try an alternative means of contacting a webmaster.

Creativity, too, can play a big part in the link removal process. For example, it might be necessary to use social media to reach the right contact. Again, don’t trust automated emails or contact form harvesters. In some cases, these applications will pull in any email address they find on the crawled page (without any guarantee of who the information belongs to). In others, they will completely miss masked email addresses or those appearing in images. If you really want to see that the links are removed, outreach should be carried out by experienced outreach specialists. Unfortunately, there aren’t any shortcuts to effective outreach.

10. Quality issues and human errors

All sorts of human errors can occur when filing a reconsideration request. The most common errors include submitting files that do not exist, files that do not open, files that contain incomplete data, and files that take too long to load. You need to triple-check that the files you are including in your reconsideration request are read-only, and that anyone with the URL can fully access them. 
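One way to sanity-check this before submitting is a rough accessibility probe: fetch each evidence URL with a plain, cookie-less GET and flag anything that fails to load or bounces to a sign-in page. The URL below is a placeholder, and the sign-in heuristic is an assumption based on how restricted Google Docs links typically behave, not a guarantee.

```python
import requests

EVIDENCE_URLS = [
    # Hypothetical shared docs/spreadsheets referenced in the reconsideration request.
    "https://docs.google.com/spreadsheets/d/EXAMPLE_ID/edit",
]

def publicly_accessible(url):
    """Rough heuristic: the URL loads with a plain GET (no cookies, no login)
    and doesn't redirect to a sign-in page."""
    try:
        resp = requests.get(url, timeout=15, allow_redirects=True)
    except requests.RequestException:
        return False
    return resp.status_code == 200 and "accounts.google.com" not in resp.url

for url in EVIDENCE_URLS:
    print(url, "OK" if publicly_accessible(url) else "NOT accessible -- fix sharing settings")
```

Even with a check like this, open the files yourself in a private browsing window before you hit submit.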

Poor grammar and language is also bad practice, as it may be interpreted as “poor effort.” You should definitely get the reconsideration request proofread by a couple of people to be sure it is flawless. A poorly written reconsideration request can significantly hinder your overall efforts.

Quality issues can also occur with the disavow file submission. Disavowing at the URL level isn’t recommended because the link(s) you want to get rid of are often accessible to search engines via several URLs you may be unaware of. Therefore, it is strongly recommended that you disavow at the domain or sub-domain level.
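If your link review produced a list of URLs, a small script can collapse them into domain-level entries in the format the disavow tool expects (full URLs or “domain:” lines, with “#” comments). The example URLs below are hypothetical.

```python
from urllib.parse import urlparse

# URLs flagged for disavowal during the manual review (hypothetical examples).
urls_to_disavow = [
    "http://spammy-directory.example/category/page-1.html",
    "http://spammy-directory.example/category/page-2.html",
    "http://article-network.example/post/123",
]

domains = set()
for u in urls_to_disavow:
    host = urlparse(u).netloc
    # Strip a leading "www." so the domain: entry covers the whole domain, not one subdomain.
    domains.add(host[4:] if host.startswith("www.") else host)

# The disavow file accepts full URLs or "domain:" entries; lines starting with "#" are comments.
with open("disavow.txt", "w", encoding="utf-8") as f:
    f.write("# Domain-level disavow entries generated from the manual link review\n")
    for d in sorted(domains):
        f.write(f"domain:{d}\n")
```

Collapsing to the domain level means every URL variation on that host is covered, including the ones your backlink data never surfaced.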

11. Insufficient evidence

How does Google know you have done everything you claim in your reconsideration request? Because you have to prove each claim is valid, you need to document every single action you take, from sent emails and submitted forms, to social media nudges and phone calls. The more information you share with Google in your reconsideration request, the better. This is the exact wording from Google:

“ …we will also need to see good-faith efforts to remove a large portion of inorganic links from the web wherever possible.”

12. Bad communication

How you communicate your link cleanup efforts is as essential as the work you are expected to carry out. Not only do you need to explain the steps you’ve taken to address the issues, but you also need to share supportive information and detailed evidence. The reconsideration request is the only chance you have to communicate to Google which issues you have identified, and what you’ve done to address them. Being honest and transparent is vital for the success of the reconsideration request.

There is absolutely no point using the space in a reconsideration request to argue with Google. Some of the unnatural link examples they share may not always be useful (e.g., URLs that include nofollow links, removed links, or even no links at all), but taking an argumentative approach virtually guarantees that your request will be denied.

[Photo cropped from an image by Keith Allison, licensed under Creative Commons]

Conclusion

Getting a Google penalty lifted requires a good understanding of why you have been penalized, a flawless process and a great deal of hands-on work. Performing link audits for the purpose of lifting a penalty can be very challenging, and should only be carried out by experienced consultants. If you are not 100% sure you can take all the required actions, seek out expert help rather than looking for inexpensive (and ineffective) automated solutions. Otherwise, you will almost certainly end up wasting weeks or months of your precious time, and in the end, see your request denied.

