Big Data, Big Problems: 4 Major Link Indexes Compared

Posted by russangular

Given this blog’s readership, chances are good you will spend some time this week looking at backlinks in one of the growing number of link data tools. We know backlinks continue to be one of the most important parts of Google’s ranking algorithm, if not the most important. We tend to take these link data sets at face value, though, in part because they are all we have. But when your rankings are on the line, is there a better way to determine which data set is best? How should we go about assessing different link indexes like Moz, Majestic, Ahrefs, and SEMrush for quality? Historically, there have been four common approaches to this question of index quality…

  • Breadth: We might choose to look at the number of linking root domains any given service reports. We know
    that referring domains correlates strongly with search rankings, so it makes sense to judge a link index by how many unique domains it has
    discovered and indexed.
  • Depth: We also might choose to look at how deep the web has been crawled, looking more at the total number of URLs
    in the index, rather than the diversity of referring domains.
  • Link Overlap: A more sophisticated approach might count the number of links an index has in common with Google Webmaster
    Tools.
  • Freshness: Finally, we might choose to look at the freshness of the index. What percentage of links in the index are
    still live?

There are a number of really good studies (some newer than others) using these techniques that are worth checking out when you get a chance:

  • BuiltVisible analysis of Moz, Majestic, GWT, Ahrefs and Search Metrics
  • SEOBook comparison of Moz, Majestic, Ahrefs, and Ayima
  • MatthewWoodward
    study of Ahrefs, Majestic, Moz, Raven and SEO Spyglass
  • Marketing Signals analysis of Moz, Majestic, Ahrefs, and GWT
  • RankAbove comparison of Moz, Majestic, Ahrefs and Link Research Tools
  • StoneTemple study of Moz and Majestic

While these are all excellent at addressing the methodologies above, there is a particular limitation with all of them. They miss one of the most important metrics we need to determine the value of a link index: proportional representation to Google’s link graph. So here at Angular Marketing, we decided to take a closer look.

Proportional representation to Google Search Console data

So, why is it important to determine proportional representation? Many of the most important and valued metrics we use are built on proportional
models. PageRank, MozRank, CitationFlow and Ahrefs Rank are proportional in nature. The score of any one URL in the data set is relative to the
other URLs in the data set. If the data set is biased, the results are biased.

A Visualization

Link graphs are biased by their crawl prioritization. Because there is no full representation of the Internet, every link graph, even Google’s,
is a biased sample of the web. Imagine for a second that the picture below is of the web. Each dot represents a page on the Internet,
and the dots surrounded by green represent a fictitious index by Google of certain sections of the web.

Of course, Google isn’t the only organization that crawls the web. Other organizations like Moz,
Majestic, Ahrefs, and SEMrush
have their own crawl prioritizations which result in different link indexes.

In the example above, you can see different link providers trying to index the web like Google. Link data provider 1 (purple) does a good job
of building a model that is similar to Google. It isn’t very big, but it is proportional. Link data provider 2 (blue) has a much larger index,
and likely has more links in common with Google than link data provider 1, but it is highly disproportional. So, how would we go about measuring
this proportionality? And which data set is the most proportional to Google?

Methodology

The first step is to determine a relative measurement for the analysis. Google doesn’t give us very much information about its link graph. All we have is what is in Google Search Console. The best source we can use is referring domain counts. In particular, we want to look at what we call referring domain link pairs. A referring domain link pair would be something like ask.com->mlb.com: 9,444, which means that ask.com links to mlb.com 9,444 times.

Steps

  1. Determine the referring domain link pairs and their counts for 100+ sites in Google Search Console
  2. Determine the same for Ahrefs, Moz, Majestic Fresh, Majestic Historic, and SEMrush
  3. Compare the referring domain link pairs of each data set to Google’s, assuming a Poisson distribution (see the sketch after this list)
  4. Run simulations of each data set’s performance against the others (e.g., Moz vs. Majestic, Ahrefs vs. SEMrush, Moz vs. SEMrush, and so on)
  5. Analyze the results
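
To make step 3 concrete, here is a minimal sketch in Python of the kind of comparison involved. The scaling step and the averaged Poisson log-likelihood are my own illustration of how such a comparison could work, not the study’s exact statistical procedure:

    import math

    def poisson_logpmf(k, lam):
        # log P(K = k) for a Poisson(lam) variable, computed directly to avoid overflow
        return k * math.log(lam) - lam - math.lgamma(k + 1)

    def proportionality_score(google_pairs, provider_pairs):
        # Both arguments map (referring domain, target domain) -> link count,
        # e.g. {("ask.com", "mlb.com"): 9444}. Higher (less negative) scores mean
        # the provider's counts look more like a scaled copy of Google's.
        shared = set(google_pairs) & set(provider_pairs)
        if not shared:
            return float("-inf")
        # Scale factor accounts for the provider's index simply being larger or smaller
        scale = sum(provider_pairs[p] for p in shared) / sum(google_pairs[p] for p in shared)
        log_likelihood = 0.0
        for pair in shared:
            expected = max(google_pairs[pair] * scale, 1e-9)
            log_likelihood += poisson_logpmf(provider_pairs[pair], expected)
        return log_likelihood / len(shared)  # average, so index size alone doesn't dominate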

Results

When placed head-to-head, there seem to be some clear winners at first glance. In direct comparison, Moz edges out Ahrefs, but across the board the two fare quite evenly. Moz, Ahrefs, and SEMrush all seem to be far better than Majestic Fresh and Majestic Historic. Is that really the case? And why?

It turns out there is an inversely proportional relationship between index size and proportional relevancy. This might seem counterintuitive: shouldn’t the bigger indexes be closer to Google? Not exactly.

What does this mean?

Each organization has to create a crawl prioritization strategy. When you discover millions of links, you have to prioritize which ones you might crawl next. Google has a crawl prioritization, and so do Moz, Majestic, Ahrefs, and SEMrush. There are lots of different things you might choose to prioritize…

  • You might prioritize link discovery. If you want to build a very large index, you could prioritize crawling pages on sites that
    have historically provided new links.
  • You might prioritize content uniqueness. If you want to build a search engine, you might prioritize finding pages that are unlike
    any you have seen before. You could choose to crawl domains that historically provide unique data and little duplicate content.
  • You might prioritize content freshness. If you want to keep your search engine recent, you might prioritize crawling pages that
    change frequently.
  • You might prioritize content value, crawling the most important URLs first based on the number of inbound links to that page.

Chances are, an organization’s crawl priority will blend some of these features, but it’s difficult to design one exactly like Google. Imagine
for a moment that instead of crawling the web, you want to climb a tree. You have to come up with a tree climbing strategy.

  • You decide to climb the longest branch you see at each intersection.
  • One friend of yours decides to climb the first new branch he reaches, regardless of how long it is.
  • Your other friend decides to climb the first new branch she reaches only if she sees another branch coming off of it.

Despite having different climb strategies, everyone chooses the same first branch, and everyone chooses the same second branch. There are only
so many different options early on.

But as the climbers go further and further along, their choices eventually produce differing results. This is exactly the same for web crawlers
like Google, Moz, Majestic, Ahrefs and SEMrush. The bigger the crawl, the more the crawl prioritization will cause disparities. This is not a
deficiency; this is just the nature of the beast. However, we aren’t completely lost. Once we know how index size is related to disparity, we
can make some inferences about how similar a crawl priority may be to Google.

Unfortunately, we have to be careful in our conclusions. We only have a few data points to work with, so it is very difficult to be certain about this part of the analysis. In particular, it seems strange that Majestic would get better relative to its index size as it grows, unless Google holds on to old data (which might be an important discovery in and of itself). Most likely, we simply can’t draw conclusions at that level of detail yet.

So what do we do?

Let’s say you have a list of domains or URLs for which you would like to know their relative values. Your process might look something like
this…

  • Check Open Site Explorer to see if all URLs are in their index. If so, you are looking at metrics most likely to be proportional to Google’s link graph.
  • If any of the links do not occur in the index, move to Ahrefs and use their Ahrefs ranking if all you need is a single PageRank-like metric.
  • If any of the links are missing from Ahrefs’s index, or you need something related to trust, move on to Majestic Fresh.
  • Finally, use Majestic Historic for (by leaps and bounds) the largest coverage available.

It is important to point out that the likelihood that all the URLs you want to check are in a single index increases as the accuracy of the metric decreases. Considering the size of Majestic’s data, you can’t ignore them: you are less likely to get null-value answers from their data than from the others. If anything rings true, it is that once again it makes sense to get data
from as many sources as possible. You won’t
get the most proportional data without Moz, the broadest data without Majestic, or everything in-between without Ahrefs.

What about SEMrush? They are making progress, but they don’t publish any relative statistics that would be useful in this particular
case. Maybe we can hope to see more from them soon given their already promising index!

Recommendations for the link graphing industry

All we hear about these days is big data; we almost never hear about good data. I know that the teams at Moz,
Majestic, Ahrefs, SEMrush and others are interested in mimicking Google, but I would love to see some organization stand up against the
allure of
more data in favor of better data—data more like Google’s. It could begin with testing various crawl strategies to see if they produce
a result more similar to that of data shared in Google Search Console. Having the most Google-like data is certainly a crown worth winning.

Credits

Thanks to Diana Carter at Angular for assistance with data acquisition and Andrew Cron with statistical analysis. Thanks also to the representatives from Moz, Majestic, Ahrefs, and SEMrush for answering questions about their indices.



Should I Use Relative or Absolute URLs? – Whiteboard Friday

Posted by RuthBurrReedy

It was once commonplace for developers to code relative URLs into a site. There are a number of reasons why that might not be the best idea for SEO, and in today’s Whiteboard Friday, Ruth Burr Reedy is here to tell you all about why.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

Let’s discuss some non-philosophical absolutes and relatives

Howdy, Moz fans. My name is Ruth Burr Reedy. You may recognize me from such projects as when I used to be the Head of SEO at Moz. I’m now the Senior SEO Manager at BigWing Interactive in Oklahoma City. Today we’re going to talk about relative versus absolute URLs and why they are important.

At any given time, your website can have several different configurations that might be causing duplicate content issues. You could have just a standard http://www.example.com. That’s a pretty standard format for a website.

But the main sources that we see of domain level duplicate content are when the non-www.example.com does not redirect to the www or vice-versa, and when the HTTPS versions of your URLs are not forced to resolve to HTTP versions or, again, vice-versa. What this can mean is if all of these scenarios are true, if all four of these URLs resolve without being forced to resolve to a canonical version, you can, in essence, have four versions of your website out on the Internet. This may or may not be a problem.
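
Spelled out, the four possible versions are simply the combinations of http/https and www/non-www:

    http://www.example.com
    http://example.com
    https://www.example.com
    https://example.com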

It’s not ideal for a couple of reasons. Number one, duplicate content is a problem because some people think that duplicate content is going to give you a penalty. Duplicate content is not going to get your website penalized in the same way that you might see a spammy link penalty from Penguin. There’s no actual penalty involved. You won’t be punished for having duplicate content.

The problem with duplicate content is that you’re basically relying on Google to figure out what the real version of your website is. Google is seeing the URL from all four versions of your website. They’re going to try to figure out which URL is the real URL and just rank that one. The problem with that is you’re basically leaving that decision up to Google when it’s something that you could take control of for yourself.

There are a couple of other reasons that we’ll go into a little bit later for why duplicate content can be a problem. But in short, duplicate content is no good.

However, just having these URLs not resolve to each other may or may not be a huge problem. When it really becomes a serious issue is when that problem is combined with injudicious use of relative URLs in internal links. So let’s talk a little bit about the difference between a relative URL and an absolute URL when it comes to internal linking.

With an absolute URL, you are putting the entire web address of the page that you are linking to in the link. You’re putting your full domain, everything in the link, including /page. That’s an absolute URL.

However, when coding a website, it’s a fairly common web development practice to instead code internal links with what’s called a relative URL. A relative URL is just /page. Basically what that does is it relies on your browser to understand, “Okay, this link is pointing to a page that’s on the same domain that we’re already on. I’m just going to assume that that is the case and go there.”
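
To make the difference concrete, here is what the two look like in the HTML of an internal link (example.com is a placeholder):

    <!-- Absolute URL: the protocol and domain are spelled out in full -->
    <a href="https://www.example.com/page">Our page</a>

    <!-- Relative URL: the browser assumes the domain it is already on -->
    <a href="/page">Our page</a>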

There are a couple of really good reasons to code relative URLs

1) It is much easier and faster to code.

When you are a web developer and you’re building a site and there are thousands of pages, coding relative versus absolute URLs is a way to be more efficient. You’ll see it happen a lot.

2) Staging environments

Another reason why you might see relative versus absolute URLs is some content management systems — and SharePoint is a great example of this — have a staging environment that’s on its own domain. Instead of being example.com, it will be examplestaging.com. The entire website will basically be replicated on that staging domain. Having relative versus absolute URLs means that the same website can exist on staging and on production, or the live accessible version of your website, without having to go back in and recode all of those URLs. Again, it’s more efficient for your web development team. Those are really perfectly valid reasons to do those things. So don’t yell at your web dev team if they’ve coded relative URLs, because from their perspective it is a better solution.

Relative URLs will also cause your page to load slightly faster. However, in my experience, the SEO benefits of having absolute versus relative URLs in your website far outweigh the teeny-tiny bit longer that it will take the page to load. It’s very negligible. If you have a really, really long page load time, there’s going to be a whole boatload of things that you can change that will make a bigger difference than coding your URLs as relative versus absolute.

Page load time, in my opinion, not a concern here. However, it is something that your web dev team may bring up with you when you try to address with them the fact that, from an SEO perspective, coding your website with relative versus absolute URLs, especially in the nav, is not a good solution.

There are even better reasons to use absolute URLs

1) Scrapers

If you have all of your internal links as relative URLs, it would be very, very, very easy for a scraper to simply scrape your whole website and put it up on a new domain, and the whole website would just work. That sucks for you, and it’s great for that scraper. But unless you are out there doing public services for scrapers, for some reason, that’s probably not something that you want happening with your beautiful, hardworking, handcrafted website. That’s one reason. There is a scraper risk.

2) Preventing duplicate content issues

But the other reason why it’s very important to have absolute versus relative URLs is that it really mitigates the duplicate content risk that can be presented when you don’t have all of these versions of your website resolving to one version. Google could potentially enter your site on any one of these four pages, which they’re the same page to you. They’re four different pages to Google. They’re the same domain to you. They are four different domains to Google.

But they could enter your site, and if all of your URLs are relative, they can then crawl and index your entire domain using whatever format these are. Whereas if you have absolute links coded, even if Google enters your site on www. and that resolves, once they crawl to another page, that you’ve got coded without the www., all of that other internal link juice and all of the other pages on your website, Google is not going to assume that those live at the www. version. That really cuts down on different versions of each page of your website. If you have relative URLs throughout, you basically have four different websites if you haven’t fixed this problem.

Again, it’s not always a huge issue. Duplicate content, it’s not ideal. However, Google has gotten pretty good at figuring out what the real version of your website is.

You do want to think about internal linking, when you’re thinking about this. If you have basically four different versions of any URL that anybody could just copy and paste when they want to link to you or when they want to share something that you’ve built, you’re diluting your internal links by four, which is not great. You basically would have to build four times as many links in order to get the same authority. So that’s one reason.

3) Crawl Budget

The other reason why it’s pretty important not to do this is because of crawl budget. I’m going to point it out like this instead.

When we talk about crawl budget, basically what that is, is every time Google crawls your website, there is a finite depth that they will go. There’s a finite number of URLs that they will crawl, and then they decide, “Okay, I’m done.” That’s based on a few different things. Your site authority is one of them. Your actual PageRank, not toolbar PageRank, but how good Google actually thinks your website is, is a big part of that. But also how complex your site is, how often it’s updated, things like that are also going to contribute to how often and how deep Google is going to crawl your site.

It’s important to remember when we think about crawl budget that, for Google, crawl budget costs actual dollars. One of Google’s biggest expenditures as a company is the money and the bandwidth it takes to crawl and index the Web. All of that energy that’s going into crawling and indexing the Web, that lives on servers. That bandwidth comes from servers, and that means that using bandwidth costs Google actual real dollars.

So Google is incentivized to crawl as efficiently as possible, because when they crawl inefficiently, it costs them money. If your site is not efficient to crawl, Google is going to save itself some money by crawling it less frequently and crawling fewer pages per crawl. That can mean that if you have a site that’s updated frequently, your site may not be updating in the index as frequently as you’re updating it. It may also mean that Google, while it’s crawling and indexing, may be crawling and indexing a version of your website that isn’t the version that you really want it to crawl and index.

So having four different versions of your website, all of which are completely crawlable to the last page, because you’ve got relative URLs and you haven’t fixed this duplicate content problem, means that Google has to spend four times as much money in order to really crawl and understand your website. Over time they’re going to do that less and less frequently, especially if you don’t have a really high authority website. If you’re a small website, if you’re just starting out, if you’ve only got a medium number of inbound links, over time you’re going to see your crawl rate and frequency impacted, and that’s bad. We don’t want that. We want Google to come back all the time, see all our pages. They’re beautiful. Put them up in the index. Rank them well. That’s what we want. So that’s what we should do.

There are a couple of ways to fix your relative versus absolute URLs problem

1) Fix what is happening on the server side of your website

You have to make sure that you are forcing all of these different versions of your domain to resolve to one version of your domain. For me, I’m pretty agnostic as to which version you pick. You should probably already have a pretty good idea of which version of your website is the real version, whether that’s www, non-www, HTTPS, or HTTP. From my view, what’s most important is that all four of these versions resolve to one version.

From an SEO standpoint, there is evidence to suggest and Google has certainly said that HTTPS is a little bit better than HTTP. From a URL length perspective, I like to not have the www. in there because it doesn’t really do anything. It just makes your URLs four characters longer. If you don’t know which one to pick, I would pick this one: HTTPS, no W’s. But whichever one you pick, what’s really most important is that all of them resolve to one version. You can do that on the server side, and that’s usually pretty easy for your dev team to fix once you tell them that it needs to happen.
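
As one illustration of the server-side fix, on an Apache server a rewrite rule along these lines forces the http and www variations over to a single https, non-www version (a sketch only; the exact rules depend on your server and hosting setup):

    # .htaccess sketch: send http://, http://www., and https://www. to https://example.com
    RewriteEngine On
    RewriteCond %{HTTPS} off [OR]
    RewriteCond %{HTTP_HOST} ^www\. [NC]
    RewriteRule ^(.*)$ https://example.com/$1 [L,R=301]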

2) Fix your internal links

Great. So you fixed it on your server side. Now you need to fix your internal links, and you need to recode them from being relative to being absolute. This is something that your dev team is not going to want to do because it is time consuming and, from a web dev perspective, not that important. However, you should use resources like this Whiteboard Friday to explain to them, from an SEO perspective, both from the scraper risk and from a duplicate content standpoint, having those absolute URLs is a high priority and something that should get done.

You’ll need to fix those, especially in your navigational elements. But once you’ve got your nav fixed, also pull out your database or run a Screaming Frog crawl or however you want to discover internal links that aren’t part of your nav, and make sure you’re updating those to be absolute as well.

Then you’ll do some education with everybody who touches your website saying, “Hey, when you link internally, make sure you’re using the absolute URL and make sure it’s in our preferred format,” because that’s really going to give you the most bang for your buck per internal link. So do some education. Fix your internal links.

Sometimes your dev team is going to say, “No, we can’t do that. We’re not going to recode the whole nav. It’s not a good use of our time,” and sometimes they are right. The dev team has more important things to do. That’s okay.

3) Canonicalize it!

If you can’t get your internal links fixed or if they’re not going to get fixed anytime in the near future, a stopgap or a Band-Aid that you can kind of put on this problem is to canonicalize all of your pages. As you’re changing your server to force all of these different versions of your domain to resolve to one, at the same time you should be implementing the canonical tag on all of the pages of your website to self-canonicalize. On every page, you have a canonical page tag saying, “This page right here that they were already on is the canonical version of this page.” Or if there’s another page that’s the canonical version, then obviously you point to that instead.

But having each page self-canonicalize will mitigate both the risk of duplicate content internally and some of the risk posed by scrapers, because when they scrape, if they are scraping your website and slapping it up somewhere else, those canonical tags will often stay in place, and that lets Google know this is not the real version of the website.
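
For reference, a self-referencing canonical tag is a single line in the <head> of each page, pointing at your preferred version of that page’s URL (example.com is a placeholder):

    <link rel="canonical" href="https://example.com/page" />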

In conclusion, relative links, not as good. Absolute links, those are the way to go. Make sure that you’re fixing these very common domain level duplicate content problems. If your dev team tries to tell you that they don’t want to do this, just tell them I sent you. Thanks guys.

Video transcription by Speechpad.com



Moving 5 Domains to 1: An SEO Case Study

Posted by Dr-Pete

People often ask me if they should change domain names, and I always shudder just a little. Changing domains is a huge, risky undertaking, and too many people rush into it seeing only the imaginary upside. The success of the change also depends wildly on the details, and it’s not the kind of question anyone should be asking casually on social media.

Recently, I decided that it was time to find a new permanent home for my personal and professional blogs, which had gradually spread out over 5 domains. I also felt my main domain was no longer relevant to my current situation, and it was time for a change. So, ultimately I ended up with a scenario that looked like this:

The top three sites were active, with UserEffect.com being my former consulting site and blog (and relatively well-trafficked). The bottom two sites were both inactive and were both essentially gag sites. My one-pager, AreYouARealDoctor.com, did previously rank well for “are you a real doctor”, so I wanted to try to recapture that.

I started migrating the 5 sites in mid-January, and I’ve been tracking the results. I thought it would be useful to see how this kind of change plays out, in all of the gory details. As it turns out, nothing is ever quite “textbook” when it comes to technical SEO.

Why Change Domains at All?

The rationale for picking a new domain could fill a month’s worth of posts, but I want to make one critical point – changing domains should be about your business goals first, and SEO second. I did not change domains to try to rank better for “Dr. Pete” – that’s a crap shoot at best. I changed domains because my old consulting brand (“User Effect”) no longer represented the kind of work I do and I’m much more known by my personal brand.

That business case was strong enough that I was willing to accept some losses. We went through a similar transition here
from SEOmoz.org to Moz.com. That was a difficult transition that cost us some SEO ground, especially short-term, but our core rationale was grounded in the business and where it’s headed. Don’t let an SEO pipe dream lead you into a risky decision.

Why did I pick a .co domain? I did it for the usual reason – the .com was taken. For a project of this type, where revenue wasn’t on the line, I didn’t have any particular concerns about .co. The evidence on how top-level domains (TLDs) impact ranking is tough to tease apart (so many other factors correlate with .com’s), and Google’s attitude tends to change over time, especially if new TLDs are abused. Anecdotally, though, I’ve seen plenty of .co’s rank, and I wasn’t concerned.

Step 1 – The Boring Stuff

It is absolutely shocking how many people build a new site, slap up some 301s, pull the switch, and hope for the best. It’s less shocking how many of those people end up in Q&A a week later, desperate and bleeding money.


Planning is hard work, and it’s boring – get over it.

You need to be intimately familiar with every page on your existing site(s), and, ideally, you should make a list. Not only do you have to plan for what will happen to each of these pages, but you’ll need that list to make sure everything works smoothly later.

In my case, I decided it might be time to do some housekeeping – the User Effect blog had hundreds of posts, many outdated and quite a few just not very good. So, I started with the easy data – recent traffic. I’m sure you’ve seen this Google Analytics report (Behavior > Site Content > All Pages):

Since I wanted to focus on recent activity, and none of the sites had much new content, I restricted myself to a 3-month window (Q4 of 2014). Of course, I looked much deeper than the top 10, but the principle was simple – I wanted to make sure the data matched my intuition and that I wasn’t cutting off anything important. This helped me prioritize the list.

Of course, from an SEO standpoint, I also didn’t want to lose content that had limited traffic but solid inbound links. So, I checked my “Top Pages” report in
Open Site Explorer:

Since the bulk of my main site was a blog, the top trafficked and top linked-to pages fortunately correlated pretty well. Again, this is only a way to prioritize. If you’re dealing with sites with thousands of pages, you need to work methodically through the site architecture.

I’m going to say something that makes some SEOs itchy – it’s ok not to move some pages to the new site. It’s even ok to let some pages 404. In Q4, UserEffect.com had traffic to 237 URLs. The top 10 pages accounted for 91.9% of that traffic. I strongly believe that moving domains is a good time to refocus a site and concentrate your visitors and link equity on your best content. More is not better in 2015.

Letting go of some pages also means that you’re not 301-redirecting a massive number of old URLs to a new home-page. This can look like a low-quality attempt to consolidate link-equity, and at large scale it can raise red flags with Google. Content worth keeping should exist on the new site, and your 301s should have well-matched targets.

In one case, I had a blog post that had a decent trickle of traffic due to ranking for “50,000 push-ups,” but the post itself was weak and the bounce rate was very high:

The post was basically just a placeholder announcing that I’d be attempting this challenge, but I never recapped anything after finishing it. So, in this case,
I rewrote the post.

Of course, this process was repeated across the 3 active sites. The 2 inactive sites only constituted a handful of total pages. In the case of AreYouARealDoctor.com, I decided to turn the previous one-pager
into a new page on the new site. That way, I had a very well-matched target for the 301-redirect, instead of simply mapping the old site to my new home-page.

I’m trying to prove a point – this is the amount of work I did for a handful of sites that were mostly inactive and producing no current business value. I don’t need consulting gigs and these sites produce no direct revenue, and yet I still considered this process worth the effort.

Step 2 – The Big Day

Eventually, you’re going to have to make the move, and in most cases, I prefer ripping off the bandage. Of course, doing something all at once doesn’t mean you shouldn’t be careful.

The biggest problem I see with domain switches (even if they’re 1-to-1) is that people rely on data that can take weeks to evaluate, like rankings and traffic, or directly checking Google’s index. By then, a lot of damage is already done. Here are some ways to find out quickly if you’ve got problems…

(1) Manually Check Pages

Remember that list you were supposed to make? It’s time to check it, or at least spot-check it. Someone needs to physically go to a browser and make sure that each major section of the site and each important individual page is resolving properly. It doesn’t matter how confident your IT department/guy/gal is – things go wrong.

(2) Manually Check Headers

Just because a page resolves, it doesn’t mean that your 301-redirects are working properly, or that you’re not firing some kind of 17-step redirect chain. Check your headers. There are tons of free tools, but lately I’m fond of
URI Valet. Guess what – I screwed up my primary 301-redirects. One of my registrar transfers wasn’t working, so I had to have a setting changed by customer service, and I inadvertently ended up with 302s (Pro tip: Don’t change registrars and domains in one step):

Don’t think that because you’re an “expert”, your plan is foolproof. Mistakes happen, and because I caught this one I was able to correct it fairly quickly.
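
If you’d rather script this spot-check than paste URLs into a tool one at a time, a small sketch using Python’s requests library works too (the URL list is a placeholder; swap in the old URLs from the list you made in Step 1):

    import requests

    old_urls = [
        "http://www.usereffect.com/",
        "http://www.usereffect.com/blog",
    ]

    for url in old_urls:
        # allow_redirects=False shows the first hop's status code rather than the final page
        response = requests.get(url, allow_redirects=False, timeout=10)
        location = response.headers.get("Location", "-")
        print(f"{url} -> {response.status_code} {location}")
        if response.status_code != 301:
            print("  WARNING: expected a 301 here (302s may not pass full link equity)")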

(3) Submit Your New Site

You don’t need to submit your site to Google in 2015, but now that Google Webmaster Tools allows it, why not do it? The primary argument I hear is “well, it’s not necessary.” True, but direct submission has one advantage – it’s fast.

To be precise, Google Webmaster Tools separates the process into “Fetch” and “Submit to index” (you’ll find this under “Crawl” > “Fetch as Google”). Fetching will quickly tell you if Google can resolve a URL and retrieve the page contents, which alone is pretty useful. Once a page is fetched, you can submit it, and you should see something like this:

This isn’t really about getting indexed – it’s about getting nearly instantaneous feedback. If Google has any major problems with crawling your site, you’ll know quickly, at least at the macro level.

(4) Submit New XML Sitemaps

Finally, submit a new set of XML sitemaps in Google Webmaster Tools, and preferably tiered sitemaps. While it’s a few years old now, Rob Ousbey has a great post on the subject of
XML sitemap structure. The basic idea is that, if you divide your sitemap into logical sections, it’s going to be much easier to diagnose what kinds of pages Google is indexing and where you’re running into trouble.
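
As a rough illustration, a tiered setup is just a sitemap index file pointing at one child sitemap per logical section of the site (the filenames here are made up):

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap><loc>https://example.com/sitemap-blog-posts.xml</loc></sitemap>
      <sitemap><loc>https://example.com/sitemap-static-pages.xml</loc></sitemap>
      <sitemap><loc>https://example.com/sitemap-downloads.xml</loc></sitemap>
    </sitemapindex>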

A couple of pro tips on sitemaps – first, keep your old sitemaps active temporarily. This is counterintuitive to some people, but unless Google can crawl your old URLs, they won’t see and process the 301-redirects and other signals. Let the old accounts stay open for a couple of months, and don’t cut off access to the domains you’re moving.

Second (I learned this one the hard way), make sure that your Google Webmaster Tools site verification still works. If you use file uploads or meta tags and don’t move those files/tags to the new site, GWT verification will fail and you won’t have access to your old accounts. I’d recommend using a more domain-independent solution, like verifying with Google Analytics. If you lose verification, don’t panic – your data won’t be instantly lost.

Step 3 – The Waiting Game

Once you’ve made the switch, the waiting begins, and this is where many people start to panic. Even executed perfectly, it can take Google weeks or even months to process all of your 301-redirects and reevaluate a new domain’s capacity to rank. You have to expect short term fluctuations in ranking and traffic.

During this period, you’ll want to watch a few things – your traffic, your rankings, your indexed pages (via GWT and the site: operator), and your errors (such as unexpected 404s). Traffic will recover the fastest, since direct traffic is immediately carried through redirects, but ranking and indexation will lag, and errors may take time to appear.

(1) Monitor Traffic

I’m hoping you know how to check your traffic, but actually trying to determine what your new levels should be and comparing any two days can be easier said than done. If you launch on a Friday, and then Saturday your traffic goes down on the new site, that’s hardly cause for panic – your traffic probably
always goes down on Saturday.

In this case, I redirected the individual sites over about a week, but I’m going to focus on UserEffect.com, as that was the major traffic generator. That site was redirected in full on January 21st, and the Google Analytics data for January for the old site looked like this:

So far, so good – traffic bottomed out almost immediately. Of course, losing traffic is easy – the real question is what’s going on with the new domain. Here’s the graph for January for DrPete.co:

This one’s a bit trickier – the first spike, on January 16th, is when I redirected the first domain. The second spike, on January 22nd, is when I redirected UserEffect.com. Both spikes are meaningless – I announced these re-launches on social media and got a short-term traffic burst. What we really want to know is where traffic is leveling out.

Of course, there isn’t a lot of history here, but a typical day for UserEffect.com in January was about 1,000 pageviews. The traffic to DrPete.co after it leveled out was about half that (500 pageviews). It’s not a complete crisis, but we’re definitely looking at a short-term loss.

Obviously, I’m simplifying the process here – for a large, ecommerce site you’d want to track a wide range of metrics, including conversion metrics. Hopefully, though, this illustrates the core approach. So, what am I missing out on? In this day of [not provided], tracking down a loss can be tricky. Let’s look for clues in our other three areas…

(2) Monitor Indexation

You can get a broad sense of your indexed pages from Google Webmaster Tools, but this data often lags real-time and isn’t very granular. Despite its shortcomings, I still prefer
the site: operator. Generally, I monitor a domain daily – any one measurement has a lot of noise, but what you’re looking for is the trend over time. Here’s the indexed page count for DrPete.co:

The first set of pages was indexed fairly quickly, and then the second set started being indexed soon after UserEffect.com was redirected. All in all, we’re seeing a fairly steady upward trend, and that’s what we’re hoping to see. The number is also in the ballpark of sanity (compared to the actual page count) and roughly matched GWT data once it started being reported.

So, what happened to UserEffect.com’s index after the switch?

The timeframe here is shorter, since UserEffect.com was redirected last, but we see a gradual decline in indexation, as expected. Note that the index size plateaus around 60 pages – about 1/4 of the original size. This isn’t abnormal – low-traffic and unlinked pages (or those with deep links) are going to take a while to clear out. This is a long-term process. Don’t panic over the absolute numbers – what you want here is a downward trend on the old domain accompanied by a roughly equal upward trend on the new domain.

The fact that UserEffect.com didn’t bottom out is definitely worth monitoring, but this timespan is too short for the plateau to be a major concern. The next step would be to dig into these specific pages and look for a pattern.

(3) Monitor Rankings

The old domain is dropping out of the index, and the new domain is taking its place, but we still don’t know why the new site is taking a traffic hit. It’s time to dig into our core keyword rankings.

Historically, UserEffect.com had ranked well for keywords related to “split test calculator” (near #1) and “usability checklist” (in the top 3). While [not provided] makes keyword-level traffic analysis tricky, we also know that the split-test calculator is one of the top trafficked pages on the site, so let’s dig into that one. Here’s the ranking data from Moz Analytics for “split test calculator”:

The new site took over the #1 position from the old site at first, but then quickly dropped down to the #3/#4 ranking. That may not sound like a lot, but given this general keyword category was one of the site’s top traffic drivers, the CTR drop from #1 to #3/#4 could definitely be causing problems.

When you have a specific keyword you can diagnose, it’s worth taking a look at the live SERP, just to get some context. The day after relaunch, I captured this result for “dr. pete”:

Here, the new domain is ranking, but it’s showing the old title tag. This may not be cause for alarm – weird things often happen in the very short term – but in this case we know that I accidentally set up a 302-redirect. There’s some reason to believe that Google didn’t pass full link equity during that period when 301s weren’t implemented.

Let’s look at a domain where the 301s behaved properly. Before the site was inactive, AreYouARealDoctor.com ranked #1 for “are you a real doctor”. Since there was an inactive period, and I dropped the exact-match domain, it wouldn’t be surprising to see a corresponding ranking drop.

In reality, the new site was ranking #1 for “are you a real doctor” within 2 weeks of 301-redirecting the old domain. The graph is just a horizontal line at #1, so I’m not going to bother you with it, but here’s a current screenshot (incognito):

Early on, I also spot-checked this result, and it wasn’t showing the strange title tag crossover that UserEffect.com pages exhibited. So, it’s very likely that the 302-redirects caused some problems.

Of course, these are just a couple of keywords, but I hope it provides a starting point for you to understand how to methodically approach this problem. There’s no use crying over spilled milk, and I’m not going to fire myself, so let’s move on to checking any other errors that I might have missed.

(4) Check Errors (404s, etc.)

A good first stop for unexpected errors is the “Crawl Errors” report in Google Webmaster Tools (Crawl > Crawl Errors). This is going to take some digging, especially if you’ve deliberately 404’ed some content. Over the couple of weeks after re-launch, I spotted the following problems:

The old site had a “/blog” directory, but the new site put the blog right on the home-page and had no corresponding directory. Doh. Hey, do as I say, not as I do, ok? Obviously, this was a big blunder, as the old blog home-page was well-trafficked.

The other two errors here are smaller but easy to correct. MinimalTalent.com had a “/free” directory that housed downloads (mostly PDFs). I missed it, since my other sites used a different format. Luckily, this was easy to remap.

The last error is a weird looking URL, and there are other similar URLs in the 404 list. This is where site knowledge is critical. I custom-designed a URL shortener for UserEffect.com and, in some cases, people linked to those URLs. Since those URLs didn’t exist in the site architecture, I missed them. This is where digging deep into historical traffic reports and your top-linked pages is critical. In this case, the fix isn’t easy, and I have to decide whether the loss is worth the time.

What About the New EMD?

My goal here wasn’t to rank better for “Dr. Pete,” and finally unseat Dr. Pete’s Marinades, Dr. Pete the Sodastream flavor (yes, it’s hilarious – you can stop sending me your grocery store photos), and 172 dentists. Ok, it mostly wasn’t my goal. Of course, you might be wondering how switching to an EMD worked out.

In the short term, I’m afraid the answer is “not very well.” I didn’t track ranking for “Dr. Pete” and related phrases very often before the switch, but it appears that ranking actually fell in the short-term. Current estimates have me sitting around page 4, even though my combined link profile suggests a much stronger position. Here’s a look at the ranking history for “dr pete” since relaunch (from Moz Analytics):

There was an initial drop, after which the site evened out a bit. This less-than-impressive plateau could be due to the bad 302s during transition. It could be Google evaluating a new EMD and multiple redirects to that EMD. It could be that the prevalence of natural anchor text with “Dr. Pete” pointing to my site suddenly looked unnatural when my domain name switched to DrPete.co. It could just be that this is going to take time to shake out.

If there’s a lesson here (and, admittedly, it’s too soon to tell), it’s that you shouldn’t rush to buy an EMD in 2015 in the wild hope of instantly ranking for that target phrase. There are so many factors involved in ranking for even a moderately competitive term, and your domain is just one small part of the mix.

So, What Did We Learn?

I hope you learned that I should’ve taken my own advice and planned a bit more carefully. I admit that this was a side project and it didn’t get the attention it deserved. The problem is that, even when real money is at stake, people rush these things and hope for the best. There’s a real cheerleading mentality when it comes to change – people want to take action and only see the upside.

Ultimately, in a corporate or agency environment, you can’t be the one sour note among the cheering. You’ll be ignored, and possibly even fired. That’s not fair, but it’s reality. What you need to do is make sure the work gets done right and people go into the process with eyes wide open. There’s no room for shortcuts when you’re moving to a new domain.

That said, a domain change isn’t a death sentence, either. Done right, and with sensible goals in mind – balancing not just SEO but broader marketing and business objectives – a domain migration can be successful, even across multiple sites.

To sum up: Plan, plan, plan, monitor, monitor, monitor, and try not to panic.



Firefox Extension Upgraded to include Topical Trust Flow

If you are (like me) a Firefox fan, then you’ll be delighted to know that our Firefox extension has now got an upgrade to show Topical Trust Flow® for both inbound links and Anchor Text tabs. This follows our Chrome upgrade announced on the 2nd of March. How to get the Upgrade: If you already…

The post Firefox Extension Upgraded to include Topical Trust Flow appeared first on Majestic Blog.


In-App Social & Contact Data – New in Open Site Explorer

Posted by randfish

Today I’m excited to announce the launch of a new feature inside 
Open Site Explorer—In-App Social & Contact Data. 

With this launch, you’ll be able to see the
social or email accounts we’ve discovered associated with a given website, and have one-click access to those pages.


Initially, the feature offers:

  1. Availability today on the inbound links tab and in Link Intersect on the “pages -> subdomains” view. In the future, if y’all find it useful, we hope to expand its presence to other areas of the tool as well.
  2. Email accounts will only be shown if they match the domain name (e.g. rand@moz.com would be shown next to moz.com, randfishkin@yahoo.com would not) and if they appear in standard format on the page (we don’t try to grab emails that are in JavaScript or that use alternate formats to obfuscate them).
  3. We show Facebook, Twitter, Google+, and email addresses we’ve found on multiple pages of the site (we take a small random set and analyze whether these social/contact data pieces are uniform). If we find multiple accounts, you’ll see this:

Use cases

There are three major use cases for this feature (at least for me; you might have more!):

1) Link/Outreach prospecting

It can be a pain to visit sites, find social accounts/emails, and copy them into a spreadsheet or send messages (and recall which ones you have/haven’t done yet). By including social/contact data in the same interface where you’re doing link analysis, we hope to save you time and clicks.

2) Link/site trust and audience reach analysis

We’re actually using this data on the back end at Moz for our upcoming Spam Score feature (coming very soon), but you can use it manually to help with a quick mental filter for trustworthy/authoritative/non-spammy sites, and to get a sense for the size and reach of a site’s social audience.

3) At-a-glance analysis of social networks among a group

If you’re in a given space (e.g. travel blogs), it’s a process to determine which social networks are/aren’t being used by industry participants and influencers. Social/contact data in OSE can help with that by showing which social networks various sites are using and linking to from their pages:

We need your feedback

This first implementation is relatively light in the app—we haven’t yet placed this data anywhere/everywhere it might be useful. Before we do, we want to hear what you think: Is this useful and valuable to your work? Does it help save you time? Would you want to see the feature expanded and if so, in what sections would it provide the greatest value to you? Please let us know in the comments, and by getting back in touch with us after you’ve had a chance to try it out for yourself.

Thanks for giving social/contact data a spin, and look for more upgrades to Open Site Explorer in the very near future!



The Most Important Link Penalty Removal Tool: Your Mindset

Posted by Eric Enge

Let’s face it. Getting slapped by a manual link penalty, or by the Penguin algorithm, really stinks. Once this has happened to you, your business is in a world of hurt. Worse still is the fact that you can’t get clear information from Google on which of your links are the bad ones. In today’s post, I am going to focus on the number one reason why people fail to get out from under these types of problems, and how to improve your chances of success.

The mindset

Success begins, continues, and ends with the right mindset. A large percentage of people I see who go through a link cleanup process are not aggressive enough about cleaning up their links. They worry about preserving some of that hard-won link juice they obtained over the years.

You have to start by understanding what a link cleanup process looks like, and just how long it can take. Some of the people I have spoken with have gone through a process like this one:

link removal timeline

In this fictitious timeline example, we see someone who spends four months working on trying to recover, and at the end of it all, they have not been successful.
A lot of time and money have been spent, and they have nothing to show for it. Then, the people at Google get frustrated and send them a message that basically tells them they are not getting it. At this point, they have no idea when they will be able to recover. The result is that the complete process might end up taking six months or more.

In contrast, imagine someone who is far more aggressive in removing and disavowing links. They are so aggressive that 20 percent of the links they cut out are actually ones that Google has not currently judged as being bad. They also start on March 9, and by April 30, the penalty has been lifted on their site.

Now they can begin rebuilding their business, five or more months sooner than the person who does not take as aggressive an approach. Yes, they cut out some links that Google was not currently penalizing, but this is a small price to pay for getting your penalty cleared five months sooner. In addition, with our mindset-based approach, the 20 percent of links we cut out were probably not links that were helping much anyway, and Google might well take action on them in the future.

Now that you understand the approach, it’s time to make the commitment. You have to make the decision that you are going to do whatever it takes to get this done, and that getting it done means cutting hard and deep, because that’s what will get you through it the fastest. Once you’ve got your head on straight about what it will take and have summoned the courage to go through with it, then and only then, you’re ready to do the work. Now let’s look at what that work entails.

Obtaining link data

We use four sources of data for links:

  1. Google Webmaster Tools
  2. Open Site Explorer
  3. Majestic SEO
  4. ahrefs

You will want to pull in data from all four of these sources, get them into one list, and then dedupe them to create a master list. Focus only on followed links as well, as nofollowed links are not an issue. The overall process is shown here:

pulling a link set
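
As a rough sketch of that merge-and-dedupe step in Python (the filenames and column names are placeholders; each tool’s CSV export uses its own headers):

    import csv

    def load_followed_links(path, source_col, target_col, nofollow_col):
        # Column names differ between the GWT, OSE, Majestic, and Ahrefs exports,
        # so the *_col arguments are placeholders you map to each file's headers.
        links = set()
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                if row.get(nofollow_col, "").strip().lower() in ("true", "yes", "1"):
                    continue  # skip nofollowed links; they are not an issue
                links.add((row[source_col].strip(), row[target_col].strip()))
        return links

    # Pull in all four sources and dedupe into one master list
    master_links = set()
    for export in ["gwt.csv", "ose.csv", "majestic.csv", "ahrefs.csv"]:
        master_links |= load_followed_links(export, "Source URL", "Target URL", "NoFollow")

    print(f"{len(master_links)} unique followed links to work with")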

One other simplification is also possible at this stage. Once you have obtained a list of the followed links, there is another thing you can do to dramatically simplify your life.
You don’t need to look at every single link.

You do need to look at a small sampling of links from every domain that links to you. Chances are that this is a significantly smaller quantity of links to look at than all links. If a domain has 12 links to you, and you look at three of them, and any of those are bad, you will need to disavow the entire domain anyway.

I take the time to emphasize this because I’ve seen people with more than 1 million inbound links from 10,000 linking domains. Evaluating 1 million individual links could take a lifetime. Looking at 10,000 domains is not small, but it’s 100 times smaller than 1 million. But here is where the mindset comes in.
Do examine every domain.

This may be a grinding and brutal process, but there is no shortcut available here. What you don’t look at will hurt you. The sooner you start on the entire list, the sooner you will get the job done.
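
Grouping that master list by linking domain and pulling a small sample from each is easy to script; here is a sketch, with a tiny placeholder data set standing in for the master list built above:

    import random
    from collections import defaultdict
    from urllib.parse import urlparse

    # Placeholder for the deduped (source URL, target URL) master list from earlier
    master_links = {
        ("http://blog.example-one.com/post-1", "http://yoursite.com/"),
        ("http://blog.example-one.com/post-2", "http://yoursite.com/widgets"),
        ("http://directory.example-two.net/listing", "http://yoursite.com/"),
    }

    by_domain = defaultdict(list)
    for source_url, target_url in master_links:
        # hostname is a simplification; a real run would reduce subdomains
        # to their registrable root domain
        by_domain[urlparse(source_url).hostname].append(source_url)

    # Spot-check up to three links per domain; if any are bad, disavow the whole domain
    samples = {domain: random.sample(urls, min(3, len(urls)))
               for domain, urls in by_domain.items()}
    print(f"{len(by_domain)} domains to review instead of {len(master_links)} individual links")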

How to evaluate links

Now that you have a list, you can get to work. This is a key part where having the right mindset is critical. The first part of the process is really quite simple. You need to eliminate each and every one of these types of links:

  1. Article directory links
  2. Links in forum comments, or their related profiles
  3. Links in blog comments, or their related profiles
  4. Links from countries where you don’t operate/sell your products
  5. Links from link sharing schemes such as Link Wheels
  6. Any links you know were paid for

Here is an example of a foreign language link that looks somewhat out of place:

foreign language link

For the most part, you should also remove any links you have from web directories. Sure, if you have a link from DMOZ, Business.com, or BestofTheWeb.com, and the most important one or two directories dedicated to your market space, you can probably keep those.

For a decade I have offered people a rule for these types of directories, which is “no more than seven links from directories.” Even the good ones carry little to no value, and the bad ones can definitely hurt you. So there is absolutely no win to be had running around getting links from a bunch of directories, and there is no win in trying to keep them during a link cleanup process.

Note that I am NOT talking about local business directories such as Yelp, CityPages, YellowPages, SuperPages, etc. Those are a different class of directory that you don’t need to worry about. But general purpose web directories are, generally speaking, a poison.

Rich anchor text

Rich anchor text has been the downfall of many a publisher. Here is one of my favorite examples ever of rich anchor text:

The author wanted the link to say “buy cars,” but was too lazy to fit the two words into the same sentence! Of course, you may have many guest posts that you have written that are not nearly as obvious as this one. One great way to deal with that is to take your list of links that you built and sort them by URL and look at the overall mix of anchor text. You know it’s a problem if it looks anything like this:

overly optimized anchor text

The problem with the distribution in the above image is that the percentage of links that are non “rich” in nature is way too small. In the real world, most people don’t conveniently link to you using one of your key money phrases. Some do, but it’s normally a small percentage.
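
If your link export includes anchor text, tallying the mix takes only a few lines; here is a sketch (the column name and the money phrases are placeholders for your own data):

    from collections import Counter

    def anchor_text_mix(rows, anchor_col="Anchor Text", money_phrases=("buy cars",)):
        # rows: list of dicts read from a link export that includes anchor text
        counts = Counter(row.get(anchor_col, "").strip().lower() for row in rows)
        total = sum(counts.values()) or 1
        rich = sum(n for text, n in counts.items()
                   if any(phrase in text for phrase in money_phrases))
        print(f"{rich / total:.1%} of links use money-phrase anchor text")
        for text, n in counts.most_common(10):
            print(f"{n:6d}  {text or '(empty or image anchor)'}")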

Other types of bad links

There is no way for me to cover every type of bad link in this post, but here are other types of links, or link scenarios, to be concerned about:

  1. If a large percentage of your links are coming from over on the right rail of sites, or in the footers of sites
  2. If there are sites that give you a site-wide link, or a very large number of links from one domain
  3. Links that come from sites whose IP address is identical in the A block, B block, and C block (read more about what these are here)
  4. Links from crappy sites

The definition of a crappy site may seem subjective, but if a site has not been updated in a while, or its information is of poor quality, or it just seems to have no one who cares about it, you can probably consider it a crappy site. Remember our discussion on mindset. Your objective is to be harsh in cleaning up your links.

In fact, the most important principle in evaluating links is this:
If you can argue that it’s a good link, it’s NOT. You don’t have to argue for good quality links. To put it another way, if they are not obviously good, then out they go!

Quick case study anecdote: I know of someone who really took a major knife to their backlinks. They removed and/or disavowed every link they had that was below a Moz Domain Authority of 70. They did not even try to justify or keep any links with lower DA than that. It worked like a champ. The penalty was lifted. If you are willing to try a hyper-aggressive approach like this one, you can avoid all the work evaluating links I just outlined above. Just get the Domain Authority data for all the links pointing to your site and bring out the hatchet.

No doubt they ended up cutting out a large number of links that were perfectly fine, but their approach was far faster than doing the complete domain-by-domain analysis.
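If you want to experiment with that kind of hyper-aggressive cutoff, the mechanics are simple once you have Domain Authority data for your linking domains. Here is a minimal sketch, assuming you have exported a CSV named linking_domains.csv with domain and domain_authority columns; the file name and column names are illustrative, and the threshold of 70 is simply the one from the anecdote above.

```python
# Minimal sketch of the "hatchet" approach: keep only linking domains at or
# above a Domain Authority threshold. Assumes a CSV export named
# "linking_domains.csv" with "domain" and "domain_authority" columns;
# file and column names are illustrative.
import csv

THRESHOLD = 70
keep, cut = [], []
with open("linking_domains.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        try:
            da = float(row["domain_authority"])
        except (KeyError, ValueError):
            cut.append(row.get("domain", ""))  # no DA data: treat it as a cut
            continue
        (keep if da >= THRESHOLD else cut).append(row["domain"])

print(f"Keeping {len(keep)} domains, cutting {len(cut)}")
with open("domains_to_remove_or_disavow.txt", "w", encoding="utf-8") as out:
    out.write("\n".join(sorted(set(d for d in cut if d))) + "\n")
```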

Requesting link removals

Why do we request link removals at all? Can’t we just build a disavow file and submit that to Google? In my experience, for manual link penalties, the answer is no, you can’t. (Note: if you have been hit by Penguin, and not a manual link penalty, you may not need to request link removals.)

Yes, disavowing a link is supposed to tell Google that you don’t want to receive any PageRank, or benefit, from it. However, there is a human element at play here.
Google likes to see that you put some effort into cleaning up the bad links that led to your penalty. The more bad links you have, the more important this becomes.

This does make the process a lot more expensive to get through, but if you approach it with the “whatever it takes” mindset, you dive into the link removal request process and get it done.

I usually have people go through three rounds of emails asking site owners to remove links. This can be a very annoying process for those receiving your requests, so you need to be aware of that. Don’t start your email with a line like “Your site is causing mine to be penalized …”, as that’s just plain offensive.

I’d be honest, and tell them “Hey, we’ve been hit by a penalty, and as part of our effort to recover we are trying to get many of the links we have gotten to our site removed. We don’t know which sites are causing the problem, but we’d appreciate your help …”

Note that some people will come back to you and ask for money to remove the link. Just ignore them, and put their domains in your disavow file.

Once you are done with the removal requests, and have had whatever success you are going to have, take the rest of the domains and disavow them. There is a complete guide to creating a disavow file here. The one incremental tip I would add is that you should nearly always disavow entire domains, not just the individual links you see.

This is important because even with the four tools we used to get information on as many links as we could, we still only have a subset of the total links. For example, the tools may have only seen one link from a domain, but in fact you have five. If you disavow only the one link, you still have four problem links, and that will torpedo your reconsideration request.

Disavowing the domain is a better-safe-than-sorry step you should take almost every time. As I illustrated at the beginning of this post, adding extra cleanup/reconsideration request loops is very expensive for your business.
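Mechanically, building a domain-level disavow file from your remaining list of bad links is straightforward. Here is a minimal sketch, assuming a text file of bad link URLs, one per line; the file names are illustrative. Note that it disavows the full hostname of each link, so if some links sit on subdomains you may prefer to map them to the registrable domain first (a library such as tldextract can help with that).

```python
# Minimal sketch: turn a list of bad link URLs into domain-level disavow entries.
# Assumes "links_to_disavow.txt" contains one URL per line (illustrative name).
# Google's disavow format accepts "#" comment lines and "domain:" entries.
from urllib.parse import urlparse

domains = set()
with open("links_to_disavow.txt", encoding="utf-8") as f:
    for line in f:
        url = line.strip()
        if not url:
            continue
        host = urlparse(url).netloc.lower().split(":")[0]  # drop any port
        if host.startswith("www."):
            host = host[4:]
        if host:
            domains.add(host)

with open("disavow.txt", "w", encoding="utf-8") as out:
    out.write("# Disavow file generated after link removal outreach\n")
    for d in sorted(domains):
        out.write(f"domain:{d}\n")
```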

The overall process

When all is said and done, the process looks something like this:

[Image: flowchart of the overall link removal process]

If you run this process efficiently, and you don’t try to cut corners, you might be able to get out from under your penalty in a single pass through the process. If so, congratulations!

What about tools?

There are some fairly well-known tools that are designed to help you with the link cleanup process. These include
Link Detox and Remove’em. In addition, at STC we have developed our own internal tool that we use with our clients.

These tools can be useful in flagging some of your links, but they are not comprehensive: they will help identify some really obvious offenders, but the great majority of the links you need to remove or disavow will not be flagged. Plan on investing substantial manual time and effort into a comprehensive review of all your links. Remember the “mindset.”

Summary

As I write this post, I have this sense of being heartless because I outline an approach that is often grueling to execute. But consider it tough love. Recovering from link penalties is indeed brutal.
In my experience, the winners are the ones who come with meat cleaver in hand, don’t try to cut corners, and take on the full task from the very start, no matter how extensive an effort it may be.

Does this type of process succeed? You bet. Here is an example of a traffic chart from a successful recovery:

[Image: traffic graph showing recovery after the manual penalty was lifted]



12 Common Reasons Reconsideration Requests Fail

Posted by Modestos

There are several reasons a reconsideration request might fail. But some of the most common mistakes site owners and inexperienced SEOs make when trying to lift a link-related Google penalty are entirely avoidable. 

Here’s a list of the top 12 most common mistakes made when submitting reconsideration requests, and how you can prevent them.

1. Insufficient link data

This is one of the most common reasons why reconsideration requests fail. This mistake is readily evident each time a reconsideration request gets rejected and the example URLs provided by Google are unknown to the webmaster. Relying only on Webmaster Tools data isn’t enough, as Google has repeatedly said. You need to combine data from as many different sources as possible. 

A good starting point is to collate backlink data, at the very least:

  • Google Webmaster Tools (both latest and sample links)
  • Bing Webmaster Tools
  • Majestic SEO (Fresh Index)
  • Ahrefs
  • Open Site Explorer

If you use any toxic link-detection services (e.g., Linkrisk and Link Detox), then you need to take a few precautions to ensure the following:

  • They are 100% transparent about their backlink data sources
  • They have imported all backlink data
  • You can upload your own backlink data (e.g., Webmaster Tools) without any limitations

If you work on large websites that have tons of backlinks, most of these automated services will very likely process just a fraction of the links unless you pay for one of their premium packages. If you have direct access to the above data sources, it’s worthwhile to download all backlink data, then manually upload it into your tool of choice for processing. This is the only way to have full visibility over the backlink data that has to be analyzed and reviewed later. Starting with an incomplete data set at this early (yet crucial) stage could seriously hinder the outcome of your reconsideration request.
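Merging those exports into one de-duplicated master list is a simple scripting job. Here is a minimal sketch, assuming each tool’s export has been saved as a CSV and normalized to a source_url column; the file names and column name are illustrative, since every tool labels its columns differently.

```python
# Minimal sketch: merge backlink exports from several tools into one
# de-duplicated master list. Assumes each export has been normalized to a CSV
# with a "source_url" column; file names and the column name are illustrative.
import csv

EXPORTS = [
    "gwt_links.csv",
    "bing_links.csv",
    "majestic_fresh.csv",
    "ahrefs_links.csv",
    "ose_links.csv",
]

seen = set()
with open("all_backlinks.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out)
    writer.writerow(["source_url"])
    for path in EXPORTS:
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                url = (row.get("source_url") or "").strip()
                if url and url not in seen:
                    seen.add(url)
                    writer.writerow([url])

print(f"{len(seen)} unique linking URLs across {len(EXPORTS)} exports")
```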

2. Missing vital legacy information

The more you know about a site’s history and past activities, the better. You need to find out (a) which pages were targeted in the past as part of link building campaigns, (b) which keywords were the primary focus and (c) the link building tactics that were scaled (or abused) most frequently. Knowing enough about a site’s past activities, before it was penalized, can help you home in on the actual causes of the penalty. Also, collect as much information as possible from the site owners.

3. Misjudgement

Misreading your current situation can lead to wrong decisions. One common mistake is to treat the example URLs provided by Google as gospel and try to identify only links with the same patterns. Google provides a very small number of examples of unnatural links. Often, these examples are the most obvious and straightforward ones. However, you should look beyond these examples to fully address the issues and take the necessary actions against all types of unnatural links. 

Google is very clear on the matter: “Please correct or remove all inorganic links, not limited to the samples provided above.”

Another common area of bad judgement is the inability to correctly identify unnatural links. This is a skill that requires years of experience in link auditing, as well as link building. Removing the wrong links won’t lift the penalty, and may also result in further ranking drops and loss of traffic. You must remove the right links.


4. Blind reliance on tools

There are numerous unnatural link-detection tools available on the market, and over the years I’ve had the chance to try out most (if not all) of them. Because (and without any exception) I’ve found them all very ineffective and inaccurate, I do not rely on any such tools for my day-to-day work. In some cases, a lot of the reported “high risk” links were 100% natural links, and in others, numerous toxic links were completely missed. If you have to manually review all the links to discover the unnatural ones, ensuring you don’t accidentally remove any natural ones, it makes no sense to pay for tools. 

If you solely rely on automated tools to identify the unnatural links, you will need a miracle for your reconsideration request to be successful. The only tool you really need is a powerful backlink crawler that can accurately report the current link status of each URL you have collected. You should then manually review all currently active links and decide which ones to remove. 
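A basic version of that live-link check can be scripted as well. Here is a minimal sketch, assuming the requests and beautifulsoup4 packages are installed and a merged backlink list in all_backlinks.csv; the domain, file names, and column name are illustrative, and a production crawler would also need politeness delays, user-agent handling, and redirect checks.

```python
# Minimal sketch of a "live link" check: fetch each linking page and see whether
# it still links to your domain. Assumes requests and beautifulsoup4 are
# installed; "yourdomain.com" and the file names are illustrative.
import csv
import requests
from bs4 import BeautifulSoup

YOUR_DOMAIN = "yourdomain.com"

with open("all_backlinks.csv", newline="", encoding="utf-8") as f, \
     open("link_status.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out)
    writer.writerow(["source_url", "status"])
    for row in csv.DictReader(f):
        url = row["source_url"]
        try:
            resp = requests.get(url, timeout=15)
        except requests.RequestException:
            writer.writerow([url, "unreachable"])
            continue
        if resp.status_code != 200:
            writer.writerow([url, f"http_{resp.status_code}"])
            continue
        soup = BeautifulSoup(resp.text, "html.parser")
        live = any(YOUR_DOMAIN in a["href"] for a in soup.find_all("a", href=True))
        writer.writerow([url, "live" if live else "removed"])
```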

I could write an entire book on the numerous flaws and bugs I have come across each time I’ve tried some of the most popular link auditing tools. A lot of these issues can be detrimental to the outcome of the reconsideration request. I have seen many reconsideration requests fail because of this. If Google cannot algorithmically identify all unnatural links and must operate entire teams of humans to review sites (and their links), you shouldn’t trust a $99/month service to identify the unnatural links.

If you have an in-depth understanding of Google’s link schemes, you can build your own process to prioritize which links are more likely to be unnatural, as I described in this post (see sections 7 & 8). In an ideal world, you should manually review every single link pointing to your site. Where this isn’t possible (e.g., when dealing with an enormous number of links, or when resources are unavailable), you should at least focus on the links that show the most “unnatural” signals and manually review them.

5. Not looking beyond direct links

When trying to lift a link-related penalty, you need to look into all the links that may be pointing to your site directly or indirectly. Such checks include reviewing all links pointing to other sites that have been redirected to your site, legacy URLs with external inbound links that have been internally redirected, and third-party sites that include cross-domain canonicals to your site. For sites that used to buy and redirect domains in order to increase their rankings, the quickest solution is to get rid of the redirects. Both Majestic SEO and Ahrefs report redirects, but some manual digging usually reveals a lot more.


6. Not looking beyond the first link

All major link intelligence tools, including Majestic SEO, Ahrefs and Open Site Explorer, report only the first link pointing to a given site when crawling a page. This means that if you rely heavily on automated tools to identify links with commercial keywords, the vast majority of them will only take into consideration the first link they discover on a page. If a page on the web links to your site just once, this is not a big deal. But if there are multiple links, the tools will miss all but the first one.

For example, if a page has five different links pointing to your site, and the first one includes branded anchor text, these tools will report only that first link. Most link-auditing tools will in turn evaluate the link as “natural” and completely miss the other four links, some of which may contain manipulative anchor text. The more links that get missed this way, the more likely your reconsideration request will fail.
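To avoid that blind spot, you can pull every link to your domain from a page yourself, along with its anchor text and rel attribute. Here is a minimal sketch using requests and beautifulsoup4; the page URL and domain below are illustrative.

```python
# Minimal sketch: list every link to your domain on a page, not just the first,
# along with its anchor text and rel attribute. Assumes requests and
# beautifulsoup4 are installed; the URL and domain are illustrative.
import requests
from bs4 import BeautifulSoup

YOUR_DOMAIN = "yourdomain.com"
page = "https://example.com/some-article/"

soup = BeautifulSoup(requests.get(page, timeout=15).text, "html.parser")
for a in soup.find_all("a", href=True):
    if YOUR_DOMAIN in a["href"]:
        rel = " ".join(a.get("rel", [])) or "follow"  # "follow" shown when no rel attribute
        print(f'{a["href"]}  anchor="{a.get_text(strip=True)}"  rel={rel}')
```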

7. Going too thin

Many SEOs and webmasters (still) feel uncomfortable with the idea of losing links. They cannot accept that links which once helped their rankings are now being devalued and must be removed. There is no point trying to save “authoritative” unnatural links out of fear of losing rankings. If the main objective is to lift the penalty, then all unnatural links need to be removed.

Often, in the first reconsideration request, SEOs and site owners tend to go too thin, and in the subsequent attempts start cutting deeper. If you are already aware of the unnatural links pointing to your site, try to get rid of them from the very beginning. I have seen examples of unnatural links provided by Google on PR 9/DA 98 sites. Metrics do not matter when it comes to lifting a penalty. If a link is manipulative, it has to go.

In any case, Google’s decision won’t be based only on the number of links that have been removed. What matters most in the search giant’s eyes is the quality of the links still pointing to your site. If the remaining links are largely of low quality, the reconsideration request will almost certainly fail.

8. Insufficient effort to remove links

Google wants to see a “good faith” effort to get as many links removed as possible. The higher the percentage of unnatural links removed, the better. Some agencies and SEO consultants tend to rely too much on the disavow tool. However, it isn’t a panacea, and should be used as a last resort, for those links that are impossible to remove after you have exhausted all possibilities via the time-consuming (yet necessary) outreach route.

Google is very clear on this:

[Image: Google’s guidance on making a good-faith effort to remove links]

Even if you’re unable to remove all of the links that need to be removed, you must be able to demonstrate that you’ve made several attempts to have them removed, which can have a favorable impact on the outcome of the reconsideration request. Yes, in some cases it might be possible to have a penalty lifted simply by disavowing instead of removing the links, but these cases are rare and this strategy may backfire in the future. When I reached out to ex-Googler Fili Wiese for some advice on the value of removing toxic links (instead of just disavowing them), his response was very straightforward:

[Image: Fili Wiese’s response]

9. Ineffective outreach

Simply identifying the unnatural links won’t get the penalty lifted unless a decent percentage of the links have been successfully removed. The more communication channels you try, the more likely it is that you reach the webmaster and get the links removed. Sending the same email hundreds or thousands of times is highly unlikely to result in a decent response rate. Trying to remove a link from a directory is very different from trying to get rid of a link appearing in a press release, so you should take a more targeted approach with a well-crafted, personalized email. Link removal request emails must be honest and to the point, or else they’ll be ignored.

Tracking the emails will also help you figure out which messages have been read, which webmasters might be worth contacting again, or whether you need to try an alternative means of contacting a webmaster.

Creativity, too, can play a big part in the link removal process. For example, it might be necessary to use social media to reach the right contact. Again, don’t trust automated emails or contact form harvesters. In some cases, these applications will pull in any email address they find on the crawled page (without any guarantee of who the information belongs to). In others, they will completely miss masked email addresses or those appearing in images. If you really want to see that the links are removed, outreach should be carried out by experienced outreach specialists. Unfortunately, there aren’t any shortcuts to effective outreach.

10. Quality issues and human errors

All sorts of human errors can occur when filing a reconsideration request. The most common errors include submitting files that do not exist, files that do not open, files that contain incomplete data, and files that take too long to load. You need to triple-check that the files you are including in your reconsideration request are read-only, and that anyone with the URL can fully access them. 

Poor grammar and sloppy language are also bad practice, as they may be interpreted as “poor effort.” You should definitely have the reconsideration request proofread by a couple of people to be sure it is flawless. A poorly written reconsideration request can significantly hinder your overall efforts.

Quality issues can also occur with the disavow file submission. Disavowing at the URL level isn’t recommended because the link(s) you want to get rid of are often accessible to search engines via several URLs you may be unaware of. Therefore, it is strongly recommended that you disavow at the domain or sub-domain level.

11. Insufficient evidence

How does Google know you have done everything you claim in your reconsideration request? Because you have to prove each claim is valid, you need to document every single action you take, from sent emails and submitted forms, to social media nudges and phone calls. The more information you share with Google in your reconsideration request, the better. This is the exact wording from Google:

“ …we will also need to see good-faith efforts to remove a large portion of inorganic links from the web wherever possible.”

12. Bad communication

How you communicate your link cleanup efforts is as essential as the work you are expected to carry out. Not only do you need to explain the steps you’ve taken to address the issues, but you also need to share supportive information and detailed evidence. The reconsideration request is the only chance you have to communicate to Google which issues you have identified, and what you’ve done to address them. Being honest and transparent is vital for the success of the reconsideration request.

There is absolutely no point using the space in a reconsideration request to argue with Google. Some of the unnatural link examples they share may not always be useful (e.g., URLs that include nofollow links, removed links, or even no links at all). But taking the argumentative approach virtually guarantees your request will be denied.

Cropped from photo by Keith Allison, licensed under Creative Commons.

Conclusion

Getting a Google penalty lifted requires a good understanding of why you have been penalized, a flawless process and a great deal of hands-on work. Performing link audits for the purpose of lifting a penalty can be very challenging, and should only be carried out by experienced consultants. If you are not 100% sure you can take all the required actions, seek out expert help rather than looking for inexpensive (and ineffective) automated solutions. Otherwise, you will almost certainly end up wasting weeks or months of your precious time, and in the end, see your request denied.

