5 Spreadsheet Tips for Manual Link Audits

Posted by MarieHaynes

Link auditing is the part of my job that I love the most. I have audited a LOT of links over the last few years. While there are some programs out there that can be quite helpful to the avid link auditor, I still prefer to create a spreadsheet of my links in Excel and then to audit those links one-by-one from within Google Spreadsheets. Over the years I have learned a few tricks and formulas that have helped me in this process. In this article, I will share several of these with you.

Please know that while I am quite comfortable being labelled a link auditing expert, I am not an Excel wizard. I am betting that some of the things that I am doing could be improved upon if you’re an advanced user. As such, if you have any suggestions or tips of your own I’d love to hear them in the comments section!

1. Extract the domain or subdomain from a URL

OK. You’ve downloaded links from as many sources as possible and now you want to manually visit and evaluate one link from every domain. But, holy moly, some of these domains can have THOUSANDS of links pointing to the site. So, let’s break these down so that you are just seeing one link from each domain. The first step is to extract the domain or subdomain from each url.

I am going to show you examples from a Google spreadsheet as I find that these display nicer for demonstration purposes. However, if you’ve got a fairly large site, you’ll find that the spreadsheets are easier to create in Excel. If you’re confused about any of these steps, check out the animated gif at the end of each step to see the process in action.

Here is how you extract a domain or subdomain from a url:

  • Create a new column to the left of your url column.
  • Use this formula:

    =LEFT(B1,FIND("/",B1,9)-1)

    What this does is strip the URL back to everything before the first slash that follows the domain name. http://www.example.com/article.html will now become http://www.example.com, and http://www.subdomain.example.com/article.html will now become http://www.subdomain.example.com.

  • Copy our new column A and paste it right back where it was using the “paste as values” function. If you don’t do this, you won’t be able to use the Find and Replace feature.
  • Use Find and Replace to replace each of the following with a blank (i.e. nothing):
    http://
    https://
    www.

And BOOM! We are left with a column that contains just domain names and subdomain names. This animated gif shows each of the steps we just outlined:
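
If you’d rather script this step than do it with formulas, the same extraction takes only a few lines of Python. This is just a sketch, assuming your links sit in a CSV export with a column named "URL" (the file and column names are placeholders, so swap in whatever yours are actually called):

    # extract_domains.py - a scripted version of step 1 (domain extraction)
    import csv
    from urllib.parse import urlparse

    with open("links.csv", newline="", encoding="utf-8") as infile, \
         open("links_with_domains.csv", "w", newline="", encoding="utf-8") as outfile:
        reader = csv.DictReader(infile)
        writer = csv.writer(outfile)
        writer.writerow(["domain"] + reader.fieldnames)
        for row in reader:
            # urlparse().netloc returns the host, e.g. www.subdomain.example.com
            host = urlparse(row["URL"]).netloc
            # strip a leading "www." to mirror the Find and Replace step above
            if host.startswith("www."):
                host = host[4:]
            writer.writerow([host] + [row[field] for field in reader.fieldnames])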

2. Just show one link from each domain

The next step is to filter this list so that we are just seeing one link from each domain. If you are manually reviewing links, there’s usually no point in reviewing every single link from every domain. I will throw in a word of caution here though. Sometimes a domain can have both a good link and a bad link pointing to you. Or in some cases, you may find that links from one page are followed and from another page on the same site they are nofollowed. You can miss some of these by just looking at one link from each domain. Personally, I have some checks built in to my process where I use Scrapebox and some internal tools that I have created to make sure that I’m not missing the odd link by just looking at one link from each domain. For most link audits, however, you are not going to miss very much by assessing one link from each domain.

Here’s how we do it:

  • Highlight our domains column and sort the column in alphabetical order.
  • Create a column to the left of our domains, so that the domains are in column B.
  • Use this formula:

    =IF(B1=B2,"duplicate","unique")

  • Copy that formula down the column.
  • Use the filter function so that you are just seeing the duplicates.
  • Delete those rows. Note: If you have tens of thousands of rows to delete, the spreadsheet may crash. A workaround here is to use “Clear Rows” instead of “Delete Rows” and then sort your domains column from A-Z once you are finished.

We’ve now got a list of one link from every domain linking to us.

Here’s the gif that shows each of these steps:

You may wonder why I didn’t use Excel’s dedupe function to simply deduplicate these entries. I have found that it doesn’t take much deduplication to crash Excel, which is why I do this step manually.
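
If crashing is a concern, this is also a step where a short script shines, since pandas will happily de-duplicate hundreds of thousands of rows. A minimal sketch, assuming the CSV from the previous step with its "domain" column (again, the file and column names are just placeholders):

    # one_per_domain.py - keep one link from each linking domain (step 2)
    import pandas as pd

    links = pd.read_csv("links_with_domains.csv")
    # sort first so the row that gets kept for each domain is predictable
    one_per_domain = (links.sort_values("domain")
                           .drop_duplicates(subset="domain", keep="first"))
    one_per_domain.to_csv("one_link_per_domain.csv", index=False)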

3. Finding patterns FTW!

Sometimes when you are auditing links, you’ll find that unnatural links have patterns. I LOVE when I see these, because sometimes I can quickly go through hundreds of links without having to check each one manually. Here is an example. Let’s say that your website has a bunch of spammy directory links. As you’re auditing you notice patterns such as one of these:

  • All of these directory links come from a url that contains …/computers/internet/item40682/
  • A whole bunch of spammy links that all come from a particular free subdomain like blogspot, wordpress, weebly, etc.
  • A lot of links that all contain a particular keyword for anchor text (this is assuming you’ve included anchor text in your spreadsheet when making it.)

You can quickly find all of these links and mark them as “disavow” or “keep” by doing the following:

  • Create a new column. In my example, I am going to create a new column in Column C and look for patterns in urls that are in Column B.
  • Use this formula:

    =FIND("/item40682",B1)
    (You would replace “item40682” with the phrase that you are looking for.)

  • Copy this formula down the column.
  • Filter your new column so that you are only seeing the rows that have a number in this column. If the phrase doesn’t exist in that url, the formula returns an error value instead of a number, and we can ignore those rows.
  • Now you can mark all of these as “disavow.”
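
If you are comfortable scripting, the same footprint check is a one-liner on top of the sheet from step 2. A rough sketch, where the "/item40682" pattern and the "URL" and "Action" column names are only examples:

    # flag_patterns.py - mark every URL containing a known footprint (step 3)
    import pandas as pd

    links = pd.read_csv("one_link_per_domain.csv")
    pattern = "/item40682"  # replace with the footprint you have spotted
    # regex=False treats the pattern literally; na=False skips blank cells
    matches = links["URL"].str.contains(pattern, case=False, regex=False, na=False)
    links.loc[matches, "Action"] = "disavow"
    links.to_csv("links_flagged.csv", index=False)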

4. Check your disavow file

This next tip is one that you can use to check your disavow file across your list of domains that you want to audit. The goal here is to see which links you have disavowed so that you don’t waste time reassessing them. This particular tip only works for checking links that you have disavowed on the domain level.

The first thing you’ll want to do is download your current disavow file from Google. For some strange reason, Google gives you the disavow file in CSV format. I have never understood this because they want you to upload the file in .txt. Still, I guess this is what works best for Google. All of your entries will be in column A of the CSV:

What we are going to do now is add these to a new sheet on our current spreadsheet and use a VLOOKUP function to mark which of our domains we have disavowed.

Here are the steps:

  • Create a new sheet on your current spreadsheet workbook.
  • Copy and paste column A from your disavow spreadsheet onto this new sheet. Or, alternatively, use the import function to import the entire CSV onto this sheet.
  • In B1, write “previously disavowed” and copy this down the entire column.
  • Remove the “domain:” from each of the entries by doing a Find and Replace to replace domain: with a blank.
  • Now go back to your link audit spreadsheet. If your domains are in column A and if you had, say, 1500 domains in your disavow file, your formula would look like this:

    =VLOOKUP(A1,Sheet2!$A$1:$B$1500,2,FALSE)

When you copy this formula down the spreadsheet, it will check each of your domains, and if it finds the domain in Sheet 2, it will write “previously disavowed” on our link audit spreadsheet.
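
For very large audits, the same lookup can also be done outside the spreadsheet. Here is a minimal sketch that assumes a disavow file saved as plain text with "domain:" lines, plus the flagged CSV from earlier; if you are working from the CSV Google returns, the idea is identical, you would just read column A instead:

    # check_disavow.py - flag domains that are already in the disavow file (step 4)
    import pandas as pd

    # collect the domain-level entries from the disavow file
    with open("disavow.txt", encoding="utf-8") as f:
        disavowed = {line.strip().replace("domain:", "", 1)
                     for line in f if line.strip().startswith("domain:")}

    links = pd.read_csv("links_flagged.csv")
    links["Previously disavowed"] = links["domain"].isin(disavowed)
    links.to_csv("links_checked.csv", index=False)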

Here is a gif that shows the process:

5. Make monthly or quarterly disavow work easier

That same formula described above is a great one to use if you are doing regular, repeated link audits. In this case, the second sheet of your spreadsheet would contain the domains that you have previously audited, and column B of that sheet would say “previously audited” rather than “previously disavowed.”

Your tips?

These are just a few of the formulas that you can use to help make link auditing work easier. But there are lots of other things you can do with Excel or Google Sheets to help speed up the process as well. If you have some tips to add, leave a comment below. Also, if you need clarification on any of these tips, I’m happy to answer questions in the comments section.



How to Combine Screaming Frog Data with Google Analytics Data

Posted by Iamoldskool

I love Screaming Frog. It is without doubt the best SEO tool I use on a daily basis (no offense, Moz). The sheer amount of data you can get about your website, or someone else’s website, is incredible. You can find broken links, you can check for your Google Analytics (or any other) code on all pages through the custom search, and you can even go so far as to follow all the redirects and find out the redirect paths in a website.

In this quick guide, I’m going to show how Screaming Frog data can be used to help perform a content audit.

The data in Screaming Frog is incredible, but one thing it can’t do (yet…give it time) is tell you how popular your pages are. For that, you need an analytics package. We’re going to be working with Google Analytics on this one, as it’s probably the most well known (and well used) of the analytics services out there, and we’re going to combine the two data streams into one to give you a full overview of your content and just how popular it is. As this data is from a website I work with (rather than my own), I’m going to hide the URLs in the screenshots for obvious reasons.

Why would you want to do this?

Combining Google Analytics data with your Screaming Frog data has a myriad of advantages. You can get an overall picture of your site and identify any issues that are occurring on popular pages. You can see which pages within your site have no page views at all, or the ones that have very few page views. Maybe there are issues on these pages that become immediately apparent when you combine the two datasets.

Getting your data

Step 1: Screaming Frog

Spider the website you’re working with in Screaming Frog. Just type the URL in the box and click go, and off it goes getting all the data from your website.

Filter the list to just include HTML and hit export:

Step 2: Google Analytics

Head over to Google Analytics and go to the “All Pages” tab:

Set a date range of a couple of months so you get some decent data (especially if it’s a low-traffic site), and set “Show rows” at the bottom to 5,000 so you get as much data as possible.

“Hang on a minute, Jim,” you’re saying. “I have a lot more than 5,000 pages in my list. How do I get the rest?” Well, that’s a simple hack. Go to the URL in your browser’s address bar and look for the 5000 at the end of it. It will look something like this:

Now just up that figure to cover all of your pages, and you’ll have a huge long list. I have 9,347 on my list, so I’m going to up it to 10,000.

Great. Now export that data to an Excel file:

Now you have the two sets of data in Microsoft Excel format. Next, we’re going to combine these two data sources into one.

First step: open them up, put them both into a single Excel file on different worksheets, and then label them so you know which is which:

Now, make a third empty worksheet for your compiled data. Here’s a view of the worksheets you should have at this point:

To make this work, we’ll need the URL (the page name column) to be the same on both sheets. The Screaming Frog data contains the full domain, whereas the GA data doesn’t, so use Find and Replace on the Screaming Frog data to remove the protocol and domain, leaving just the path from the first slash onwards. The two data sources should now have URLs that match.
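
If you would rather not do that Find and Replace by hand, a quick sketch like this does the same trimming (the file and column names are assumptions, and pandas needs an Excel reader such as openpyxl installed to open .xlsx files):

    # normalise_addresses.py - make Screaming Frog "Address" values match GA "Page" paths
    import pandas as pd

    sf = pd.read_excel("screaming_frog_internal_html.xlsx")
    # "http://www.example.com/blog/post/" becomes "/blog/post/"
    sf["Address"] = sf["Address"].str.replace(r"^https?://[^/]+", "", regex=True)
    sf["Address"] = sf["Address"].replace("", "/")  # the homepage becomes "/" as it is in GA
    sf.to_excel("screaming_frog_paths.xlsx", index=False)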

With me so far? Great. Now it’s time to link the data sets together and get that lovely combined data in your third worksheet.

Linking the data

OK. Go to your Screaming Frog worksheet, select all the data, and on the Formulas tab, click Define Name. Give it an easily identifiable name (I would name it the same as your worksheet).

Then do the same with the GA data: select it > Formulas tab > Define Name > name it the same as the worksheet.

Got both of them defined? Groovy, time to put this data together.

Save your file.

Go to your third worksheet, named “compiled data.”

Then, on the Data tab, select “From Other Sources,” then “From Microsoft Query.”

It will then ask you to choose your data source; choose Excel Files from the options and click OK. Then, find your saved Excel file and select it; you’ll be given the option to include your two named data sources.

Select both and add them to the columns in your query. Click Next, and you’ll then be presented with what looks like an error message (but isn’t really).

Click OK.

Then drag “Page” on the GA data onto “Address” on the Screaming Frog data, like this:

You’ll notice that all the data from the two data sources below will reorganise itself.

Then click File > “Return Data to Microsoft Excel.”

On the next dialog, just click OK, and that’s it. You should now have a single worksheet with the combined data from Screaming Frog and Google Analytics to play with and do what you want.
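
If Microsoft Query feels clunky, the same join can be done in a few lines of Python instead. A sketch under the same assumptions as before (both exports saved as Excel files, with "Address" holding the Screaming Frog path and "Page" holding the GA path):

    # combine_sf_ga.py - join the Screaming Frog and GA exports on the URL path
    import pandas as pd

    sf = pd.read_excel("screaming_frog_paths.xlsx")
    ga = pd.read_excel("ga_all_pages.xlsx")
    # a left join keeps every crawled page; pages with no GA row had no recorded pageviews
    combined = sf.merge(ga, left_on="Address", right_on="Page", how="left")
    combined.to_excel("compiled_data.xlsx", index=False)

The left join is deliberate: the rows that end up with empty GA columns are exactly the pages with little or no traffic that combining the two datasets is meant to surface.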

I hope my little tutorial made sense and that people find it useful. I’d love to hear what other people use this tutorial to accomplish in the comments 🙂

Thanks all!



Calculating Estimated ROI for a Specific Site & Body of Keywords

Posted by shannonskinner

One of the biggest challenges for SEO is proving its worth. We all know it’s valuable, but it’s important to convey its value in terms that key stakeholders (up to and including CEOs) understand. To do that, I put together a process to calculate an estimate of ROI for implementing changes to keyword targeting.

In this post, I will walk through that process, so hopefully you can do the same for your clients (or as an in-house SEO to get buy-in), too!

Overview

  1. Gather your data
    1. Keyword Data
    2. Strength of your Preferred URLs
    3. Competition URLs by Keyword
    4. Strength of Competition URLs
  2. Analyze the Data by Keyword
  3. Calculate your potential opportunity

What you need

There are quite a few parts to this recipe, and while the calculation part is pretty easy, gathering the data to throw in the mix is the challenging part. I’ll list each section here, including the components of each, and then we can go through how to retrieve each of them. 

  • Keyword data

    • list of keywords
    • search volumes for each keyword
    • preferred URLs on the site you’re estimating ROI
    • current rank
    • current ranking URL
  • Strength of your preferred URLs

    • De-duplicated list of preferred URLs
    • Page Authorities for each preferred URL
    • BONUS: External & Internal Links for each URL. You can include any measure you like here, as long as it’s something that can be compared (i.e. a number).
  • Where the competition sits

    • For each keyword, the sites that are ranking 1-10 in search currently
  • Strength of the competition

    • De-duplicated list of competing URLs
    • Page Authorities and Domain Authorities for each competing URL
    • BONUS: External & Internal Links, for each competing URL. Include any measure you’ve included on the Strength of Your Preferred URLs list.


How to get what you need


There has been quite a lot written about keyword research, so I won’t go into too much detail here. For the Keyword data list, the important thing is to get whatever keywords you’d like to assess into a spreadsheet, and include all the information listed above. You’ll have to select the preferred URLs based on what you think the strongest-competing and most appropriate URL would be for each keyword. 


For the Preferred URLs list, you’ll want to use the data that’s in your keyword data under the preferred URL.

  1. Copy the preferred URL data from your Keyword Data into a new tab. 
  2. Use the Remove Duplicates tool (Data > Data Tools in Excel) to remove any duplicated URLs.

Once you have the list of de-duplicated preferred URLs, you’ll need to pull the data from Open Site Explorer for these URLs. I prefer using the Moz API with SEOTools. You’ll have to install SEOTools for Excel to use it there, or, if you’d like to take a stab at using it in Google Docs, there are some resources available for that. Unfortunately, with the most recent update to Google Spreadsheets, I’ve had some difficulty with that method, so I’ve gone with Excel for now.

Once you’ve got SEOTools installed, you can make the call “=MOZ_URLMetrics_toFit([enter your cells])”. This should give you a list of URL titles, canonical URLs, External & Internal links, as well as a few other metrics and DA/PA. 


For the Where the competition sits list, you’ll first need to perform a search for each of your keywords. Obviously, you could do this manually, or, if you have exportable data from a keyword ranking tool that has been tracking the keywords you’d like to look at, you could use that. If you don’t have either of those options, you can use the hacky method that I did: use the ImportXML command in Google Spreadsheets to grab the top-ranking URLs for each query.

I’ve put a sample version of this together, which you can access here. A few caveats: you should be able to run MANY searches in a row (I had about 850 for my data, and they ran fine), but Google will block your IP address if you run too many. I also found that I needed to copy my results out as values into a different spreadsheet once I’d gotten them, because the formulas timed out relatively quickly; you can just put them into the Excel spreadsheet you’re building for the ROI calculations (you’ll need them there anyway!).


From this list, you can pull each URL into a single list and de-duplicate it as explained for the preferred URLs to generate the Strength of the Competition list, and then run the same analysis with SEOTools for Excel to generate the same data for these URLs as you did for the preferred URLs.


Making your data work for you

Once you’ve got these lists, you can use some VLOOKUP magic to pull in the information you need. I used the Where the competition sits list as the foundation of my work.

From there, I pulled in the corresponding preferred URL and its Page Authority, as well as the PAs and DAs for each URL currently ranking 1-10. I was then able to calculate an average PA and DA for each query and compare the page I want to rank against them. From that, I estimated the chances that the page I wanted to rank (given that I’d already determined these were relevant pages) could rank with better keyword targeting.
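
If your competition data ends up in a long format (one row per keyword per ranking URL, with the PA and DA pulled in by VLOOKUP), those per-query averages can also be computed in one pass. A sketch, with every file and column name an assumption about how you laid out your data:

    # competition_strength.py - average PA/DA of the current top 10 for each keyword
    import pandas as pd

    serps = pd.read_csv("competition_urls.csv")  # assumed columns: Keyword, Ranking URL, PA, DA
    avg_strength = (serps.groupby("Keyword")[["PA", "DA"]]
                         .mean()
                         .rename(columns={"PA": "Avg PA", "DA": "Avg DA"})
                         .reset_index())
    avg_strength.to_csv("keyword_competition_strength.csv", index=False)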

Here’s where things get interesting. You can be rather conservative and only sum the search volumes of keywords you’re fairly confident your site can rank for, which is my preferred method. That’s because I use this method primarily to determine whether I’m on the right track, and whether making these recommendations is really worth the time it takes to get them implemented. So I’m going to move forward counting only the search volumes of terms I think I’m quite competitive for AND that I’m not yet ranking for on page 1.


Now, you want to move to your analytics data in order to calculate a few things: 

  • Conversion Rate
  • Average order value
  • Previous year’s revenue (for the section you’re looking at)

I’ve set up my sample data in this spreadsheet that you can refer to or use to make your own calculations. 

Each of the assumptions can be adjusted depending on the actual site data, or using estimates. I’m using very generic overall CTR estimates, but you can select whichever you’d like and get as granular as you want! The main point for me is getting to two numbers that I can stand by as pretty good estimates:

  • Annual Impact (Revenue $$)
  • Increase in Revenue ($$) from last year

This is because, for higher-up folks, money talks. Obviously, this won’t be something you can promise, but it gives them a metric they understand, so they can really wrap their heads around the value you’re potentially bringing to the table if the changes you’re recommending can be made.
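
To make the arithmetic behind those two numbers concrete, here is the general shape of the estimate. Every figure below is a made-up placeholder rather than data from my sample spreadsheet, and the CTR in particular is exactly the kind of generic estimate you should replace with your own assumptions:

    # roi_estimate.py - rough annual revenue estimate for the keywords you expect to win
    monthly_search_volume = 12000      # summed volume of keywords you believe you can rank for
    estimated_ctr = 0.20               # rough CTR for the positions you expect to reach
    conversion_rate = 0.02             # from your analytics
    average_order_value = 85.00        # from your analytics
    previous_year_revenue = 400000.00  # last year's revenue for the section you're looking at

    monthly_visits = monthly_search_volume * estimated_ctr
    annual_impact = monthly_visits * conversion_rate * average_order_value * 12
    lift_vs_last_year = annual_impact / previous_year_revenue

    print(f"Estimated annual impact: ${annual_impact:,.2f}")
    print(f"Roughly a {lift_vs_last_year:.0%} increase on last year's revenue")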

There are some great tools for estimating this kind of stuff on a smaller scale, but for a massive body of keyword data, hopefully you will find this process useful as well. Let me know what you think, and I’d love to see what parts anyone else can streamline or make even more efficient. 

