Videos: How-To use the URL Submitter, Search Explorer and more

We have produced FOUR more “how-to” videos so that you can better understand the Majestic tools and Topical Trust Flow theme. These videos are: How To Use The Majestic URL Submitter – a tool that allows you to submit websites for Majestic to crawl and extract link data. How To Use The Search Explorer Tool…

The post Videos: How-To use the URL Submitter, Search Explorer and more appeared first on Majestic Blog.

[ccw-atrib-link]

How to Rid Your Website of Six Common Google Analytics Headaches

Posted by amandaecking

I’ve been in and out of Google Analytics (GA) for the past five or so years agency-side. I’ve seen three different code libraries, dozens of new different features and reports roll out, IP addresses stop being reported, and keywords not-so-subtly phased out of the free platform.

Analytics has been a focus of mine for the past year or so—mainly, making sure clients get their data right. Right now, our new focus is closed loop tracking, but that’s a topic for another day. If you’re using Google Analytics, and only Google Analytics for the majority of your website stats, or it’s your primary vehicle for analysis, you need to make sure it’s accurate.

Not having data pulling in or reporting properly is like building a house on a shaky foundation: It doesn’t end well. Usually there are tears.

For some reason, a lot of people, including many of my clients, assume everything is tracking properly in Google Analytics… because Google. But it’s not Google who sets up your analytics. People do that. And people are prone to make mistakes.

I’m going to go through six scenarios where issues are commonly encountered with Google Analytics.

I’ll outline the remedy for each issue, and in the process, show you how to move forward with a diagnosis or resolution.

1. Self-referrals

This is probably one of the areas we’re all familiar with. If you’re seeing a lot of traffic from your own domain, there’s likely a problem somewhere—or you need to extend the default session length in Google Analytics. (For example, if you have a lot of long videos or music clips and don’t use event tracking; a website like TEDx or SoundCloud would be a good equivalent.)

Typically one of the first things I’ll do to help diagnose the problem is include an advanced filter to show the full referrer string. You do this by creating a filter, as shown below:

Filter Type: Custom filter > Advanced
Field A: Hostname
Extract A: (.*)
Field B: Request URI
Extract B: (.*)
Output To: Request URI
Constructor: $A1$B1

You’ll then start seeing the subdomains pulling in. Experience has shown me that if you have a separate subdomain hosted in another location (say, if you work with a separate company and they host and run your mobile site or your shopping cart), it gets treated by Google Analytics as a separate domain. Thus, you ‘ll need to implement cross domain tracking. This way, you can narrow down whether or not it’s one particular subdomain that’s creating the self-referrals.

In this example below, we can see all the revenue is being reported to the booking engine (which ended up being cross domain issues) and their own site is the fourth largest traffic source:

I’ll also a good idea to check the browser and device reports to start narrowing down whether the issue is specific to a particular element. If it’s not, keep digging. Look at pages pulling the self-referrals and go through the code with a fine-tooth comb, drilling down as much as you can.

2. Unusually low bounce rate

If you have a crazy-low bounce rate, it could be too good to be true. Unfortunately. An unusually low bounce rate could (and probably does) mean that at least on some pages of your website have the same Google Analytics tracking code installed twice.

Take a look at your source code, or use Google Tag Assistant (though it does have known bugs) to see if you’ve got GA tracking code installed twice.

While I tell clients having Google Analytics installed on the same page can lead to double the pageviews, I’ve not actually encountered that—I usually just say it to scare them into removing the duplicate implementation more quickly. Don’t tell on me.

3. Iframes anywhere

I’ve heard directly from Google engineers and Google Analytics evangelists that Google Analytics does not play well with iframes, and that it will never will play nice with this dinosaur technology.

If you track the iframe, you inflate your pageviews, plus you still aren’t tracking everything with 100% clarity.

If you don’t track across iframes, you lose the source/medium attribution and everything becomes a self-referral.

Damned if you do; damned if you don’t.

My advice: Stop using iframes. They’re Netscape-era technology anyway, with rainbow marquees and Comic Sans on top. Interestingly, and unfortunately, a number of booking engines (for hotels) and third-party carts (for ecommerce) still use iframes.

If you have any clients in those verticals, or if you’re in the vertical yourself, check with your provider to see if they use iframes. Or you can check for yourself, by right-clicking as close as you can to the actual booking element:

iframe-booking.png

There is no neat and tidy way to address iframes with Google Analytics, and usually iframes are not the only complicated element of setup you’ll encounter. I spent eight months dealing with a website on a subfolder, which used iframes and had a cross domain booking system, and the best visibility I was able to get was about 80% on a good day.

Typically, I’d approach diagnosing iframes (if, for some reason, I had absolutely no access to viewing a website or talking to the techs) similarly to diagnosing self-referrals, as self-referrals are one of the biggest symptoms of iframe use.

4. Massive traffic jumps

Massive jumps in traffic don’t typically just happen. (Unless, maybe, you’re Geraldine.) There’s always an explanation—a new campaign launched, you just turned on paid ads for the first time, you’re using content amplification platforms, you’re getting a ton of referrals from that recent press in The New York Times. And if you think it just happened, it’s probably a technical glitch.

I’ve seen everything from inflated pageviews result from including tracking on iframes and unnecessary implementation of virtual pageviews, to not realizing the tracking code was installed on other microsites for the same property. Oops.

Usually I’ve seen this happen when the tracking code was somewhere it shouldn’t be, so if you’re investigating a situation of this nature, first confirm the Google Analytics code is only in the places it needs to be.Tools like Google Tag Assistant and Screaming Frog can be your BFFs in helping you figure this out.

Also, I suggest bribing the IT department with sugar (or booze) to see if they’ve changed anything lately.

5. Cross-domain tracking

I wish cross-domain tracking with Google Analytics out of the box didn’t require any additional setup. But it does.

If you don’t have it set up properly, things break down quickly, and can be quite difficult to untangle.

The older the GA library you’re using, the harder it is. The easiest setup, by far, is Google Tag Manager with Universal Analytics. Hard-coded universal analytics is a bit more difficult because you have to implement autoLink manually and decorate forms, if you’re using them (and you probably are). Beyond that, rather than try and deal with it, I say update your Google Analytics code. Then we can talk.

Where I’ve seen the most murkiness with tracking is when parts of cross domain tracking are implemented, but not all. For some reason, if allowLinker isn’t included, or you forget to decorate all the forms, the cookies aren’t passed between domains.

The absolute first place I would start with this would be confirming the cookies are all passing properly at all the right points, forms, links, and smoke signals. I’ll usually use a combination of the Real Time report in Google Analytics, Google Tag Assistant, and GA debug to start testing this. Any debug tool you use will mean you’re playing in the console, so get friendly with it.

6. Internal use of UTM strings

I’ve saved the best for last. Internal use of campaign tagging. We may think, oh, I use Google to tag my campaigns externally, and we’ve got this new promotion on site which we’re using a banner ad for. That’s a campaign. Why don’t I tag it with a UTM string?

Step away from the keyboard now. Please.

When you tag internal links with UTM strings, you override the original source/medium. So that visitor who came in through your paid ad and then who clicks on the campaign banner has now been manually tagged. You lose the ability to track that they came through on the ad the moment they click on the tagged internal link. Their source and medium is now your internal campaign, not that paid ad you’re spending gobs of money on and have to justify to your manager. See the problem?

I’ve seen at least three pretty spectacular instances of this in the past year, and a number of smaller instances of it. Annie Cushing also talks about the evils of internal UTM tags and the odd prevalence of it. (Oh, and if you haven’t explored her blog, and the amazing spreadsheets she shares, please do.)

One clothing company I worked with tagged all of their homepage offers with UTM strings, which resulted in the loss of visibility for one-third of their audience: One million visits over the course of a year, and $2.1 million in lost revenue.

Let me say that again. One million visits, and $2.1 million. That couldn’t be attributed to an external source/campaign/spend.

Another client I audited included campaign tagging on nearly every navigational element on their website. It still gives me nightmares.

If you want to see if you have any internal UTM strings, head straight to the Campaigns report in Acquisition in Google Analytics, and look for anything like “home” or “navigation” or any language you may use internally to refer to your website structure.

And if you want to see how users are moving through your website, go to the Flow reports. Or if you really, really, really want to know how many people click on that sidebar link, use event tracking. But please, for the love of all things holy (and to keep us analytics lovers from throwing our computers across the room), stop using UTM tagging on your internal links.

Now breathe and smile

Odds are, your Google Analytics setup is fine. If you are seeing any of these issues, though, you have somewhere to start in diagnosing and addressing the data.

We’ve looked at six of the most common points of friction I’ve encountered with Google Analytics and how to start investigating them: self-referrals, bounce rate, iframes, traffic jumps, cross domain tracking and internal campaign tagging.

What common data integrity issues have you encountered with Google Analytics? What are your favorite tools to investigate?

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

5 Spreadsheet Tips for Manual Link Audits

Posted by MarieHaynes

Link auditing is the part of my job that I love the most. I have audited a LOT of links over the last few years. While there are some programs out there that can be quite helpful to the avid link auditor, I still prefer to create a spreadsheet of my links in Excel and then to audit those links one-by-one from within Google Spreadsheets. Over the years I have learned a few tricks and formulas that have helped me in this process. In this article, I will share several of these with you.

Please know that while I am quite comfortable being labelled a link auditing expert, I am not an Excel wizard. I am betting that some of the things that I am doing could be improved upon if you’re an advanced user. As such, if you have any suggestions or tips of your own I’d love to hear them in the comments section!

1. Extract the domain or subdomain from a URL

OK. You’ve downloaded links from as many sources as possible and now you want to manually visit and evaluate one link from every domain. But, holy moly, some of these domains can have THOUSANDS of links pointing to the site. So, let’s break these down so that you are just seeing one link from each domain. The first step is to extract the domain or subdomain from each url.

I am going to show you examples from a Google spreadsheet as I find that these display nicer for demonstration purposes. However, if you’ve got a fairly large site, you’ll find that the spreadsheets are easier to create in Excel. If you’re confused about any of these steps, check out the animated gif at the end of each step to see the process in action.

Here is how you extract a domain or subdomain from a url:

  • Create a new column to the left of your url column.
  • Use this formula:

    =LEFT(B1,FIND(“/”,B1,9)-1)

    What this will do is remove everything after the trailing slash following the domain name. http://www.example.com/article.html will now become http://www.example.com and http://www.subdomain.example.com/article.html will now become http://www.subdomain.example.com.

  • Copy our new column A and paste it right back where it was using the “paste as values” function. If you don’t do this, you won’t be able to use the Find and Replace feature.
  • Use Find and Replace to replace each of the following with a blank (i.e. nothing):
    http://
    https://
    www.

And BOOM! We are left with a column that contains just domain names and subdomain names. This animated gif shows each of the steps we just outlined:

2. Just show one link from each domain

The next step is to filter this list so that we are just seeing one link from each domain. If you are manually reviewing links, there’s usually no point in reviewing every single link from every domain. I will throw in a word of caution here though. Sometimes a domain can have both a good link and a bad link pointing to you. Or in some cases, you may find that links from one page are followed and from another page on the same site they are nofollowed. You can miss some of these by just looking at one link from each domain. Personally, I have some checks built in to my process where I use Scrapebox and some internal tools that I have created to make sure that I’m not missing the odd link by just looking at one link from each domain. For most link audits, however, you are not going to miss very much by assessing one link from each domain.

Here’s how we do it:

  • Highlight our domains column and sort the column in alphabetical order.
  • Create a column to the left of our domains, so that the domains are in column B.
  • Use this formula:

    =IF(B1=B2,”duplicate”,”unique”)

  • Copy that formula down the column.
  • Use the filter function so that you are just seeing the duplicates.
  • Delete those rows. Note: If you have tens of thousands of rows to delete, the spreadsheet may crash. A workaround here is to use “Clear Rows” instead of “Delete Rows” and then sort your domains column from A-Z once you are finished.

We’ve now got a list of one link from every domain linking to us.

Here’s the gif that shows each of these steps:

You may wonder why I didn’t use Excel’s dedupe function to simply deduplicate these entries. I have found that it doesn’t take much deduplication to crash Excel, which is why I do this step manually.

3. Finding patterns FTW!

Sometimes when you are auditing links, you’ll find that unnatural links have patterns. I LOVE when I see these, because sometimes I can quickly go through hundreds of links without having to check each one manually. Here is an example. Let’s say that your website has a bunch of spammy directory links. As you’re auditing you notice patterns such as one of these:

  • All of these directory links come from a url that contains …/computers/internet/item40682/
  • A whole bunch of spammy links that all come from a particular free subdomain like blogspot, wordpress, weebly, etc.
  • A lot of links that all contain a particular keyword for anchor text (this is assuming you’ve included anchor text in your spreadsheet when making it.)

You can quickly find all of these links and mark them as “disavow” or “keep” by doing the following:

  • Create a new column. In my example, I am going to create a new column in Column C and look for patterns in urls that are in Column B.
  • Use this formula:

    =FIND(“/item40682”,B1)
    (You would replace “item40682” with the phrase that you are looking for.)

  • Copy this formula down the column.
  • Filter your new column so that you are seeing any rows that have a number in this column. If the phrase doesn’t exist in that url, you’ll see “N/A”, and we can ignore those.
  • Now you can mark these all as disavow

4. Check your disavow file

This next tip is one that you can use to check your disavow file across your list of domains that you want to audit. The goal here is to see which links you have disavowed so that you don’t waste time reassessing them. This particular tip only works for checking links that you have disavowed on the domain level.

The first thing you’ll want to do is download your current disavow file from Google. For some strange reason, Google gives you the disavow file in CSV format. I have never understood this because they want you to upload the file in .txt. Still, I guess this is what works best for Google. All of your entries will be in column A of the CSV:

What we are going to do now is add these to a new sheet on our current spreadsheet and use a VLOOKUP function to mark which of our domains we have disavowed.

Here are the steps:

  • Create a new sheet on your current spreadsheet workbook.
  • Copy and paste column A from your disavow spreadsheet onto this new sheet. Or, alternatively, use the import function to import the entire CSV onto this sheet.
  • In B1, write “previously disavowed” and copy this down the entire column.
  • Remove the “domain:” from each of the entries by doing a Find and Replace to replace domain: with a blank.
  • Now go back to your link audit spreadsheet. If your domains are in column A and if you had, say, 1500 domains in your disavow file, your formula would look like this:

    =VLOOKUP(A1,Sheet2!$A$1:$B$1500,2,FALSE)

When you copy this formula down the spreadsheet, it will check each of your domains, and if it finds the domain in Sheet 2, it will write “previously disavowed” on our link audit spreadsheet.

Here is a gif that shows the process:

5. Make monthly or quarterly disavow work easier

That same formula described above is a great one to use if you are doing regular repeated link audits. In this case, your second sheet on your spreadsheet would contain domains that you have previously audited, and column B of this spreadsheet would say, “previously audited” rather than “previously disavowed“.

Your tips?

These are just a few of the formulas that you can use to help make link auditing work easier. But there are lots of other things you can do with Excel or Google Sheets to help speed up the process as well. If you have some tips to add, leave a comment below. Also, if you need clarification on any of these tips, I’m happy to answer questions in the comments section.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

Should I Rebrand and Redirect My Site? Should I Consolidate Multiple Sites/Brands? – Whiteboard Friday

Posted by randfish

Making changes to your brand is a huge step, and while it’s sometimes the best path forward, it isn’t one to be taken lightly. In today’s Whiteboard Friday, Rand offers some guidance to marketers who are wondering whether a rebrand/redirect is right for them, and also those who are considering consolidating multiple sites under a single brand.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

To rebrand, or not to rebrand, that is the question

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. Today we’re going to chat a little bit about whether you should rebrand and consider redirecting your existing website or websites and whether you should potentially consolidate multiple websites and brands that you may be running.

So we’ve talked before about redirection moves best practices. We’ve also talked about the splitting of link equity and domain authority and those kinds of things. But one of the questions that people have is, “Gosh, you know I have a website today and given the moves that Google has been making, that the social media world has been making, that content marketing has been making, I’m wondering whether I should potentially rebrand my site.” Lots of people bought domains back in the day that were exact match domains or partial match domains or that they thought reflected a move of the web toward or away from less brand-centric stuff and toward more keyword matching, topic matching, intent matching kinds of things.

Maybe you’re reconsidering those moves and you want to know, “Hey, should I be thinking about making a change now?” That’s what I’m here to answer. So this question to rebrand or not to re, it is tough because you know that when you do that rebrand, you will almost certainly take a traffic hit, and SEO is one of the biggest places where people typically take that traffic hit.

Moz previously was at SEOmoz.org and moved to moz.com. We saw a dip in our traffic over about 3 to 4 months before it fully recovered, and I would say that dip was between 15% and 25% of our search traffic, depending on week to week. I’ll link to a list of metrics that I put on my personal blog, Moz.com/rand, so that you can check those out if you’d like to see them. But it was a short recovery time for us.

One of the questions that people always have is, “Well wait, did you lose rankings for SEO since SEO used to be in your domain name?” The answer is no. In fact, six months after the move, we were ranking higher for SEO related terms and phrases.

Scenario A: Rebranding or redirecting scifitoysandgames.com

So let’s imagine that today you are running SciFiToysAndGames.com, which is right on the borderline. In my opinion, that’s right on the borderline of barely tolerable. Like it could be brandable, but it’s not great. I don’t love the “sci-fi” in here, partially because of how the Syfy channel, the entity that broadcasts stuff on television has chosen to delineate their spelling, sci-fi can be misinterpreted as to how it’s spelled. I don’t love having to have “and” in a domain name. This is long. All sorts of stuff.

Let’s say you also own StarToys.com, but you haven’t used it. Previously StarToys.com has been redirecting to SciFiToysAndGames.com, and you’re thinking, “Well, man, is it the right time to make this move? Should I make this change now? Should I wait for the future?”

How memorable or amplifiable is your current brand?

Well, these are the questions that I would urge you to consider. How memorable and amplifiable is your current brand? That’s something that if you are recognizing like, “Hey I think our brand name, in fact, is holding us back in search results and social media amplification, press, in blog mentions, in journalist links and these kinds of things,” well, that’s something serious to think about. Word of mouth too.

Will you maintain your current brand name long term?

So if you know that sometime in the next two, three, four, or five years you do want to move to StarToys, I would actually strongly urge you to do that right now, because the longer you wait, the longer it will take to build up the signals around the new domain and the more pain you’ll potentially incur by having to keep branding this and working on this old brand name. So I would strongly urge you, if you know you’re going to make the move eventually, make it today. Take the pain now, rather than more pain later.

Can or have you tested brand preference with your target audience?

I would urge you to find two different groups, one who are loyal customers today, people who know SciFiToysAndGames.com and have used it, and two, people who are potential customers, but aren’t yet familiar with it.

You don’t need to do big sample-sizes. If you can get 5, 10, or 15 people either in a room or talk to them in person, you can try some web surveys, you can try using some social media ads like things on Facebook. I’ve seen some companies do some testing around this. Even buying potential PPC ads and seeing how click-through rates perform and sentiment and those kinds of things, that is a great way to help validate your ideas, especially if you’re forced to bring data to a table by executives or other stakeholders.

How much traffic would you need in one year to justify a URL move?

The last thing I think about is imagine, and I want you to either imagine or even model this out, mathematically model it out. If your traffic growth rate — so let’s say you’re growing at 10% year-over-year right now — if that improved 1%, 5%, or 10% annually with a new brand name, would you make the move? So knowing that you might take a short-term hit, but then that your growth rate would be incrementally higher in years to come, how big would that growth rate need to be?

I would say that, in general, if I were thinking about these two domains, granted this is a hard case because you don’t know exactly how much more brandable or word-of-mouth-able or amplifiable your new one might be compared to your existing one. Well, gosh, my general thing here is if you think that’s going to be a substantive percentage, say 5% plus, almost always it’s worth it, because compound growth rate over a number of years will mean that you’re winning big time. Remember that that growth rate is different that raw growth. If you can incrementally increase your growth rate, you get tremendously more traffic when you look back two, three, four, or five years later.

Where does your current and future URL live on the domain/brand name spectrum?

I also made this domain name, brand name spectrum, because I wanted to try and visualize crappiness of domain name, brand name to really good domain name, brand name. I wanted to give some examples and then extract out some elements so that maybe you can start to build on these things thematically as you’re considering your own domains.

So from awful, we go to tolerable, good, and great. So Science-Fi-Toys.net is obviously terrible. I’ve taken a contraction of the name and the actual one. It’s got a .net. It’s using hyphens. It’s infinitely unmemorable up to what I think is tolerable — SciFiToysAndGames.com. It’s long. There are some questions about how type-in-able it is, how easy it is to type in. SciFiToys.com, which that’s pretty good. SciFiToys, relatively short, concise. It still has the “sci-fi” in there, but it’s a .com. We’re getting better. All the way up to, I really love the name, StarToys. I think it’s very brandable, very memorable. It’s concise. It’s easy to remember and type in. It has positive associations probably with most science fiction toy buyers who are familiar with at least “Star Wars” or “Star Trek.” It’s cool. It has some astronomy connotations too. Just a lot of good stuff going on with that domain name.

Then, another one, Region-Data-API.com. That sucks. NeighborhoodInfo.com. Okay, at least I know what it is. Neighborhood is a really hard name to type because it is very hard for many people to spell and remember. It’s long. I don’t totally love it. I don’t love the “info” connotation, which is generic-y.

DistrictData.com has a nice, alliterative ring to it. But maybe we could do even better and actually there is a company, WalkScore.com, which I think is wonderfully brandable and memorable and really describes what it is without being too in your face about the generic brand of we have regional data about places.

What if you’re doing mobile apps? BestAndroidApps.com. You might say, “Why is that in awful?” The answer is two things. One, it’s the length of the domain name and then the fact that you’re actually using someone else’s trademark in your name, which can be really risky. Especially if you start blowing up, getting big, Google might go and say, “Oh, do you have Android in your domain name? We’ll take that please. Thank you very much.”

BestApps.io, in the tech world, it’s very popular to use domains like .io or .ly. Unfortunately, I think once you venture outside of the high tech world, it’s really tough to get people to remember that that is a domain name. If you put up a billboard that says “BestApps.com,” a majority of people will go, “Oh, that’s a website.” But if you use .io, .ly, or one of the new domain names, .ninja, a lot of people won’t even know to connect that up with, “Oh, they mean an Internet website that I can type into my browser or look for.”

So we have to remember that we sometimes live in a bubble. Outside of that bubble are a lot of people who, if it’s not .com, questionable as to whether they’re even going to know what it is. Remember outside of the U.S., country code domain names work equally well — .co.uk, .ca, .co.za, wherever you are.

InstallThis.com. Now we’re getting better. Memorable, clear. Then all the way up to, I really like the name AppCritic.com. I have positive associations with like, “Oh year, restaurant critics, food critics, and movie critics, and this is an app critic. Great, that’s very cool.”

What are the things that are in here? Well, stuff at this end of the spectrum tends to be generic, forgettable, hard to type in. It’s long, brand-infringing, danger, danger, and sketchy sounding. It’s hard to quantify what sketchy sounding is, but you know it when you see it. When you’re reviewing domain names, you’re looking for links, you’re looking at things in the SERPs, you’re like, “Hmm, I don’t know about this one.” Having that sixth sense is something that we all develop over time, so sketchy sounding not quite as scientific as I might want for a description, but powerful.

On this end of the spectrum though, domain names and brand names tend to be unique, memorable, short. They use .com. Unfortunately, still the gold standard. Easy to type in, pronounceable. That’s a powerful thing too, especially because of word of mouth. We suffered with that for a long time with SEOmoz because many people saw it and thought, “Oh, ShowMoz, COMoz, SeeMoz.” It sucked. Have positive associations, like StarToys or WalkScore or AppCritic. They have these positive, pre-built-in associations psychologically that suggest something brandable.

Scenario B: Consolidating two sites

Scenario B, and then we’ll get to the end, but scenario B is the question like, “Should I consolidate?” Let’s say I’m running both of these today. Or more realistic and many times I see people like this, you’re running AppCritic.com and StarToys.com, and you think, “Boy, these are pretty separate.” But then you keep finding overlap between them. Your content tends to overlap, the audience tends to overlap. I find this with many, many folks who run multiple domains.

How much audience and content overlap is there?

So we’ve got to consider a few things. First off, that audience and content overlap. If you’ve got StarToys and AppCritic and the overlap is very thin, just that little, tiny piece in the middle there. The content doesn’t overlap much, the audience doesn’t overlap much. It probably doesn’t make that much sense.

But what if you’re finding like, “Gosh, man, we’re writing more and more about apps and tech and mobile and web stuff on StarToys, and we’re writing more and more about other kinds of geeky, fun things on AppCritic. Slowly it feels like these audiences are merging.” Well, now you might want to consider that consolidation.

Is there potential for separate sales or exits?

Second point of consideration, the potential for separate exits or sales. So if you know that you’re going to sell AppCritic.com to someone in the future and you want to make sure that’s separate from StarToys, you should keep them separate. If you think to yourself, “Gosh, I’d never sell one without the other. They’re really part of the same company, brand, effort,” well, I’d really consider that consolidation.

Will you dilute marketing or branding efforts?

Last point of positive consideration is dilution of marketing and branding efforts. Remember that you’re going to be working on marketing. You’re going to be working on branding. You’re going to be working on growing traffic to these. When you split your efforts, unless you have two relatively large, separate teams, this is very, very hard to do at the same rate that it could be done if you combined those efforts. So another big point of consideration. That compound growth rate that we talked about, that’s another big consideration with this.

Is the topical focus out of context?

What I don’t recommend you consider and what has been unfortunately considered, by a lot of folks in the SEO-centric world in the past, is topical focus of the content. I actually am crossing this out. Not a big consideration. You might say to yourself, “But Rand, we talked about previously on Whiteboard Friday how I can have topical authority around toys and games that are related to science fiction stuff, and I can have topical authority related to mobile apps.”

My answer is if the content overlap is strong and the audience overlap is strong, you can do both on one domain. You can see many, many examples of this across the web, Moz being a great example where we talk about startups and technology and sometimes venture capital and team building and broad marketing and paid search marketing and organic search marketing and just a ton of topics, but all serving the same audience and content. Because that overlap is strong, we can be an authority in all of these realms. Same goes for any time you’re considering these things.

All right everyone, hope you’ve enjoyed this edition of Whiteboard Friday. I look forward to some great comments, and we’ll see you again next week. take care.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

Spam Score: Moz’s New Metric to Measure Penalization Risk

Posted by randfish

Today, I’m very excited to announce that Moz’s Spam Score, an R&D project we’ve worked on for nearly a year, is finally going live. In this post, you can learn more about how we’re calculating spam score, what it means, and how you can potentially use it in your SEO work.

How does Spam Score work?

Over the last year, our data science team, led by 
Dr. Matt Peters, examined a great number of potential factors that predicted that a site might be penalized or banned by Google. We found strong correlations with 17 unique factors we call “spam flags,” and turned them into a score.

Almost every subdomain in 
Mozscape (our web index) now has a Spam Score attached to it, and this score is viewable inside Open Site Explorer (and soon, the MozBar and other tools). The score is simple; it just records the quantity of spam flags the subdomain triggers. Our correlations showed that no particular flag was more likely than others to mean a domain was penalized/banned in Google, but firing many flags had a very strong correlation (you can see the math below).

Spam Score currently operates only on the subdomain level—we don’t have it for pages or root domains. It’s been my experience and the experience of many other SEOs in the field that a great deal of link spam is tied to the subdomain-level. There are plenty of exceptions—manipulative links can and do live on plenty of high-quality sites—but as we’ve tested, we found that subdomain-level Spam Score was the best solution we could create at web scale. It does a solid job with the most obvious, nastiest spam, and a decent job highlighting risk in other areas, too.

How to access Spam Score

Right now, you can find Spam Score inside 
Open Site Explorer, both in the top metrics (just below domain/page authority) and in its own tab labeled “Spam Analysis.” Spam Score is only available for Pro subscribers right now, though in the future, we may make the score in the metrics section available to everyone (if you’re not a subscriber, you can check it out with a free trial). 

The current Spam Analysis page includes a list of subdomains or pages linking to your site. You can toggle the target to look at all links to a given subdomain on your site, given pages, or the entire root domain. You can further toggle source tier to look at the Spam Score for incoming linking pages or subdomains (but in the case of pages, we’re still showing the Spam Score for the subdomain on which that page is hosted).

You can click on any Spam Score row and see the details about which flags were triggered. We’ll bring you to a page like this:

Back on the original Spam Analysis page, at the very bottom of the rows, you’ll find an option to export a disavow file, which is compatible with Google Webmaster Tools. You can choose to filter the file to contain only those sites with a given spam flag count or higher:

Disavow exports usually take less than 3 hours to finish. We can send you an email when it’s ready, too.

WARNING: Please do not export this file and simply upload it to Google! You can really, really hurt your site’s ranking and there may be no way to recover. Instead, carefully sort through the links therein and make sure you really do want to disavow what’s in there. You can easily remove/edit the file to take out links you feel are not spam. When Moz’s Cyrus Shepard disavowed every link to his own site, it took more than a year for his rankings to return!

We’ve actually made the file not-wholly-ready for upload to Google in order to be sure folks aren’t too cavalier with this particular step. You’ll need to open it up and make some edits (specifically to lines at the top of the file) in order to ready it for Webmaster Tools

In the near future, we hope to have Spam Score in the Mozbar as well, which might look like this: 

Sweet, right? 🙂

Potential use cases for Spam Analysis

This list probably isn’t exhaustive, but these are a few of the ways we’ve been playing around with the data:

  1. Checking for spammy links to your own site: Almost every site has at least a few bad links pointing to it, but it’s been hard to know how much or how many potentially harmful links you might have until now. Run a quick spam analysis and see if there’s enough there to cause concern.
  2. Evaluating potential links: This is a big one where we think Spam Score can be helpful. It’s not going to catch every potentially bad link, and you should certainly still use your brain for evaluation too, but as you’re scanning a list of link opportunities or surfing to various sites, having the ability to see if they fire a lot of flags is a great warning sign.
  3. Link cleanup: Link cleanup projects can be messy, involved, precarious, and massively tedious. Spam Score might not catch everything, but sorting links by it can be hugely helpful in identifying potentially nasty stuff, and filtering out the more probably clean links.
  4. Disavow Files: Again, because Spam Score won’t perfectly catch everything, you will likely need to do some additional work here (especially if the site you’re working on has done some link buying on more generally trustworthy domains), but it can save you a heap of time evaluating and listing the worst and most obvious junk.

Over time, we’re also excited about using Spam Score to help improve the PA and DA calculations (it’s not currently in there), as well as adding it to other tools and data sources. We’d love your feedback and insight about where you’d most want to see Spam Score get involved.

Details about Spam Score’s calculation

This section comes courtesy of Moz’s head of data science, Dr. Matt Peters, who created the metric and deserves (at least in my humble opinion) a big round of applause. – Rand

Definition of “spam”

Before diving into the details of the individual spam flags and their calculation, it’s important to first describe our data gathering process and “spam” definition.

For our purposes, we followed Google’s definition of spam and gathered labels for a large number of sites as follows.

  • First, we randomly selected a large number of subdomains from the Mozscape index stratified by mozRank.
  • Then we crawled the subdomains and threw out any that didn’t return a “200 OK” (redirects, errors, etc).
  • Finally, we collected the top 10 de-personalized, geo-agnostic Google-US search results using the full subdomain name as the keyword and checked whether any of those results matched the original keyword. If they did not, we called the subdomain “spam,” otherwise we called it “ham.”

We performed the most recent data collection in November 2014 (after the Penguin 3.0 update) for about 500,000 subdomains.

Relationship between number of flags and spam

The overall Spam Score is currently an aggregate of 17 different “flags.” You can think of each flag a potential “warning sign” that signals that a site may be spammy. The overall likelihood of spam increases as a site accumulates more and more flags, so that the total number of flags is a strong predictor of spam. Accordingly, the flags are designed to be used together—no single flag, or even a few flags, is cause for concern (and indeed most sites will trigger at least a few flags).

The following table shows the relationship between the number of flags and percent of sites with those flags that we found Google had penalized or banned:

ABOVE: The overall probability of spam vs. the number of spam flags. Data collected in Nov. 2014 for approximately 500K subdomains. The table also highlights the three overall danger levels: low/green (< 10%) moderate/yellow (10-50%) and high/red (>50%)

The overall spam percent averaged across a large number of sites increases in lock step with the number of flags; however there are outliers in every category. For example, there are a small number of sites with very few flags that are tagged as spam by Google and conversely a small number of sites with many flags that are not spam.

Spam flag details

The individual spam flags capture a wide range of spam signals link profiles, anchor text, on page signals and properties of the domain name. At a high level the process to determine the spam flags for each subdomain is:

  • Collect link metrics from Mozscape (mozRank, mozTrust, number of linking domains, etc).
  • Collect anchor text metrics from Mozscape (top anchor text phrases sorted by number of links)
  • Collect the top five pages by Page Authority on the subdomain from Mozscape
  • Crawl the top five pages plus the home page and process to extract on page signals
  • Provide the output for Mozscape to include in the next index release cycle

Since the spam flags are incorporated into in the Mozscape index, fresh data is released with each new index. Right now, we crawl and process the spam flags for each subdomains every two – three months although this may change in the future.

Link flags

The following table lists the link and anchor text related flags with the the odds ratio for each flag. For each flag, we can compute two percents: the percent of sites with that flag that are penalized by Google and the percent of sites with that flag that were not penalized. The odds ratio is the ratio of these percents and gives the increase in likelihood that a site is spam if it has the flag. For example, the first row says that a site with this flag is 12.4 times more likely to be spam than one without the flag.

ABOVE: Description and odds ratio of link and anchor text related spam flags. In addition to a description, it lists the odds ratio for each flag which gives the overall increase in spam likelihood if the flag is present).

Working down the table, the flags are:

  • Low mozTrust to mozRank ratio: Sites with low mozTrust compared to mozRank are likely to be spam.
  • Large site with few links: Large sites with many pages tend to also have many links and large sites without a corresponding large number of links are likely to be spam.
  • Site link diversity is low: If a large percentage of links to a site are from a few domains it is likely to be spam.
  • Ratio of followed to nofollowed subdomains/domains (two separate flags): Sites with a large number of followed links relative to nofollowed are likely to be spam.
  • Small proportion of branded links (anchor text): Organically occurring links tend to contain a disproportionate amount of banded keywords. If a site does not have a lot of branded anchor text, it’s a signal the links are not organic.

On-page flags

Similar to the link flags, the following table lists the on page and domain name related flags:

ABOVE: Description and odds ratio of on page and domain name related spam flags. In addition to a description, it lists the odds ratio for each flag which gives the overall increase in spam likelihood if the flag is present).

  • Thin content: If a site has a relatively small ratio of content to navigation chrome it’s likely to be spam.
  • Site mark-up is abnormally small: Non-spam sites tend to invest in rich user experiences with CSS, Javascript and extensive mark-up. Accordingly, a large ratio of text to mark-up is a spam signal.
  • Large number of external links: A site with a large number of external links may look spammy.
  • Low number of internal links: Real sites tend to link heavily to themselves via internal navigation and a relative lack of internal links is a spam signal.
  • Anchor text-heavy page: Sites with a lot of anchor text are more likely to be spam then those with more content and less links.
  • External links in navigation: Spam sites may hide external links in the sidebar or footer.
  • No contact info: Real sites prominently display their social and other contact information.
  • Low number of pages found: A site with only one or a few pages is more likely to be spam than one with many pages.
  • TLD correlated with spam domains: Certain TLDs are more spammy than others (e.g. pw).
  • Domain name length: A long subdomain name like “bycheapviagra.freeshipping.onlinepharmacy.com” may indicate keyword stuffing.
  • Domain name contains numerals: domain names with numerals may be automatically generated and therefore spam.

If you’d like some more details on the technical aspects of the spam score, check out the 
video of Matt’s 2012 MozCon talk about Algorithmic Spam Detection or the slides (many of the details have evolved, but the overall ideas are the same):

We’d love your feedback

As with all metrics, Spam Score won’t be perfect. We’d love to hear your feedback and ideas for improving the score as well as what you’d like to see from it’s in-product application in the future. Feel free to leave comments on this post, or to email Matt (matt at moz dot com) and me (rand at moz dot com) privately with any suggestions.

Good luck cleaning up and preventing link spam!



Not a Pro Subscriber? No problem!



Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

What Deep Learning and Machine Learning Mean For the Future of SEO – Whiteboard Friday

Posted by randfish

Imagine a world where even the high-up Google engineers don’t know what’s in the ranking algorithm. We may be moving in that direction. In today’s Whiteboard Friday, Rand explores and explains the concepts of deep learning and machine learning, drawing us a picture of how they could impact our work as SEOs.

For reference, here’s a still of this week’s whiteboard!

Video transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we are going to take a peek into Google’s future and look at what it could mean as Google advances their machine learning and deep learning capabilities. I know these sound like big, fancy, important words. They’re not actually that tough of topics to understand. In fact, they’re simplistic enough that even a lot of technology firms like Moz do some level of machine learning. We don’t do anything with deep learning and a lot of neural networks. We might be going that direction.

But I found an article that was published in January, absolutely fascinating and I think really worth reading, and I wanted to extract some of the contents here for Whiteboard Friday because I do think this is tactically and strategically important to understand for SEOs and really important for us to understand so that we can explain to our bosses, our teams, our clients how SEO works and will work in the future.

The article is called “Google Search Will Be Your Next Brain.” It’s by Steve Levy. It’s over on Medium. I do encourage you to read it. It’s a relatively lengthy read, but just a fascinating one if you’re interested in search. It starts with a profile of Geoff Hinton, who was a professor in Canada and worked on neural networks for a long time and then came over to Google and is now a distinguished engineer there. As the article says, a quote from the article: “He is versed in the black art of organizing several layers of artificial neurons so that the entire system, the system of neurons, could be trained or even train itself to divine coherence from random inputs.”

This sounds complex, but basically what we’re saying is we’re trying to get machines to come up with outcomes on their own rather than us having to tell them all the inputs to consider and how to process those incomes and the outcome to spit out. So this is essentially machine learning. Google has used this, for example, to figure out when you give it a bunch of photos and it can say, “Oh, this is a landscape photo. Oh, this is an outdoor photo. Oh, this is a photo of a person.” Have you ever had that creepy experience where you upload a photo to Facebook or to Google+ and they say, “Is this your friend so and so?” And you’re like, “God, that’s a terrible shot of my friend. You can barely see most of his face, and he’s wearing glasses which he usually never wears. How in the world could Google+ or Facebook figure out that this is this person?”

That’s what they use, these neural networks, these deep machine learning processes for. So I’ll give you a simple example. Here at MOZ, we do machine learning very simplistically for page authority and domain authority. We take all the inputs — numbers of links, number of linking root domains, every single metric that you could get from MOZ on the page level, on the sub-domain level, on the root-domain level, all these metrics — and then we combine them together and we say, “Hey machine, we want you to build us the algorithm that best correlates with how Google ranks pages, and here’s a bunch of pages that Google has ranked.” I think we use a base set of 10,000, and we do it about quarterly or every 6 months, feed that back into the system and the system pumps out the little algorithm that says, “Here you go. This will give you the best correlating metric with how Google ranks pages.” That’s how you get page authority domain authority.

Cool, really useful, helpful for us to say like, “Okay, this page is probably considered a little more important than this page by Google, and this one a lot more important.” Very cool. But it’s not a particularly advanced system. The more advanced system is to have these kinds of neural nets in layers. So you have a set of networks, and these neural networks, by the way, they’re designed to replicate nodes in the human brain, which is in my opinion a little creepy, but don’t worry. The article does talk about how there’s a board of scientists who make sure Terminator 2 doesn’t happen, or Terminator 1 for that matter. Apparently, no one’s stopping Terminator 4 from happening? That’s the new one that’s coming out.

So one layer of the neural net will identify features. Another layer of the neural net might classify the types of features that are coming in. Imagine this for search results. Search results are coming in, and Google’s looking at the features of all the websites and web pages, your websites and pages, to try and consider like, “What are the elements I could pull out from there?”

Well, there’s the link data about it, and there are things that happen on the page. There are user interactions and all sorts of stuff. Then we’re going to classify types of pages, types of searches, and then we’re going to extract the features or metrics that predict the desired result, that a user gets a search result they really like. We have an algorithm that can consistently produce those, and then neural networks are hopefully designed — that’s what Geoff Hinton has been working on — to train themselves to get better. So it’s not like with PA and DA, our data scientist Matt Peters and his team looking at it and going, “I bet we could make this better by doing this.”

This is standing back and the guys at Google just going, “All right machine, you learn.” They figure it out. It’s kind of creepy, right?

In the original system, you needed those people, these individuals here to feed the inputs, to say like, “This is what you can consider, system, and the features that we want you to extract from it.”

Then unsupervised learning, which is kind of this next step, the system figures it out. So this takes us to some interesting places. Imagine the Google algorithm, circa 2005. You had basically a bunch of things in here. Maybe you’d have anchor text, PageRank and you’d have some measure of authority on a domain level. Maybe there are people who are tossing new stuff in there like, “Hey algorithm, let’s consider the location of the searcher. Hey algorithm, let’s consider some user and usage data.” They’re tossing new things into the bucket that the algorithm might consider, and then they’re measuring it, seeing if it improves.

But you get to the algorithm today, and gosh there are going to be a lot of things in there that are driven by machine learning, if not deep learning yet. So there are derivatives of all of these metrics. There are conglomerations of them. There are extracted pieces like, “Hey, we only ant to look and measure anchor text on these types of results when we also see that the anchor text matches up to the search queries that have previously been performed by people who also search for this.” What does that even mean? But that’s what the algorithm is designed to do. The machine learning system figures out things that humans would never extract, metrics that we would never even create from the inputs that they can see.

Then, over time, the idea is that in the future even the inputs aren’t given by human beings. The machine is getting to figure this stuff out itself. That’s weird. That means that if you were to ask a Google engineer in a world where deep learning controls the ranking algorithm, if you were to ask the people who designed the ranking system, “Hey, does it matter if I get more links,” they might be like, “Well, maybe.” But they don’t know, because they don’t know what’s in this algorithm. Only the machine knows, and the machine can’t even really explain it. You could go take a snapshot and look at it, but (a) it’s constantly evolving, and (b) a lot of these metrics are going to be weird conglomerations and derivatives of a bunch of metrics mashed together and torn apart and considered only when certain criteria are fulfilled. Yikes.

So what does that mean for SEOs. Like what do we have to care about from all of these systems and this evolution and this move towards deep learning, which by the way that’s what Jeff Dean, who is, I think, a senior fellow over at Google, he’s the dude that everyone mocks for being the world’s smartest computer scientist over there, and Jeff Dean has basically said, “Hey, we want to put this into search. It’s not there yet, but we want to take these models, these things that Hinton has built, and we want to put them into search.” That for SEOs in the future is going to mean much less distinct universal ranking inputs, ranking factors. We won’t really have ranking factors in the way that we know them today. It won’t be like, “Well, they have more anchor text and so they rank higher.” That might be something we’d still look at and we’d say, “Hey, they have this anchor text. Maybe that’s correlated with what the machine is finding, the system is finding to be useful, and that’s still something I want to care about to a certain extent.”

But we’re going to have to consider those things a lot more seriously. We’re going to have to take another look at them and decide and determine whether the things that we thought were ranking factors still are when the neural network system takes over. It also is going to mean something that I think many, many SEOs have been predicting for a long time and have been working towards, which is more success for websites that satisfy searchers. If the output is successful searches, and that’ s what the system is looking for, and that’s what it’s trying to correlate all its metrics to, if you produce something that means more successful searches for Google searchers when they get to your site, and you ranking in the top means Google searchers are happier, well you know what? The algorithm will catch up to you. That’s kind of a nice thing. It does mean a lot less info from Google about how they rank results.

So today you might hear from someone at Google, “Well, page speed is a very small ranking factor.” In the future they might be, “Well, page speed is like all ranking factors, totally unknown to us.” Because the machine might say, “Well yeah, page speed as a distinct metric, one that a Google engineer could actually look at, looks very small.” But derivatives of things that are connected to page speed may be huge inputs. Maybe page speed is something, that across all of these, is very well connected with happier searchers and successful search results. Weird things that we never thought of before might be connected with them as the machine learning system tries to build all those correlations, and that means potentially many more inputs into the ranking algorithm, things that we would never consider today, things we might consider wholly illogical, like, “What servers do you run on?” Well, that seems ridiculous. Why would Google ever grade you on that?

If human beings are putting factors into the algorithm, they never would. But the neural network doesn’t care. It doesn’t care. It’s a honey badger. It doesn’t care what inputs it collects. It only cares about successful searches, and so if it turns out that Ubuntu is poorly correlated with successful search results, too bad.

This world is not here yet today, but certainly there are elements of it. Google has talked about how Panda and Penguin are based off of machine learning systems like this. I think, given what Geoff Hinton and Jeff Dean are working on at Google, it sounds like this will be making its way more seriously into search and therefore it’s something that we’re really going to have to consider as search marketers.

All right everyone, I hope you’ll join me again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

Long Tail CTR Study: The Forgotten Traffic Beyond Top 10 Rankings

Posted by GaryMoyle

Search behavior is fundamentally changing, as users become more savvy and increasingly familiar with search technology. Google’s results have also changed significantly over the last decade, going from a simple page of 10 blue links to a much richer layout, including videos, images, shopping ads and the innovative Knowledge Graph.

We also know there are an increasing amount of touchpoints in a customer journey involving different channels and devices. Google’s
Zero Moment of Truth theory (ZMOT), which describes a revolution in the way consumers search for information online, supports this idea and predicts that we can expect the number of times natural search is involved on the path to a conversion to get higher and higher.

Understanding how people interact with Google and other search engines will always be important. Organic click curves show how many clicks you might expect from search engine results and are one way of evaluating the impact of our campaigns, forecasting performance and exploring changing search behavior.

Using search query data from Google UK for a wide range of leading brands based on millions of impressions and clicks, we can gain insights into the how CTR in natural search has evolved beyond those shown in previous studies by
Catalyst, Slingshot and AOL.

Our methodology

The NetBooster study is based entirely on UK top search query data and has been refined by day in order to give us the most accurate sample size possible. This helped us reduce anomalies in the data in order to achieve the most reliable click curve possible, allowing us to extend it way beyond the traditional top 10 results.

We developed a method to extract data day by day to greatly increase the volume of keywords and to help improve the accuracy of the
average ranking position. It ensured that the average was taken across the shortest timescale possible, reducing rounding errors.

The NetBooster study included:

  • 65,446,308 (65 million) clicks
  • 311,278,379 (311 million) impressions
  • 1,253,130 (1.2 million) unique search queries
  • 54 unique brands
  • 11 household brands (sites with a total of 1M+ branded keyword impressions)
  • Data covers several verticals including retail, travel and financial

We also looked at organic CTR for mobile, video and image results to better understand how people are discovering content in natural search across multiple devices and channels. 

We’ll explore some of the most important elements in this article.

How does our study compare against others?

Let’s start by looking at the top 10 results. In the graph below we have normalized the results in order to compare our curve, like-for-like, with previous studies from Catalyst and Slingshot. Straight away we can see that there is higher participation beyond the top four positions when compared to other studies. We can also see much higher CTR for positions lower on the pages, which highlights how searchers are becoming more comfortable with mining search results.

A new click curve to rule them all

Our first click curve is the most useful, as it provides the click through rates for generic non-brand search queries across positions 1 to 30. Initially, we can see a significant amount of traffic going to the top three results with position No. 1 receiving 19% of total traffic, 15% at position No. 2 and 11.45% at position No. 3. The interesting thing to note, however, is our curve shows a relatively high CTR for positions typically below the fold. Positions 6-10 all received a higher CTR than shown in previous studies. It also demonstrates that searchers are frequently exploring pages two and three.

CTR-top-30-730px.jpg

When we look beyond the top 10, we can see that CTR is also higher than anticipated, with positions 11-20 accounting for 17% of total traffic. Positions 21-30 also show higher than anticipated results, with over 5% of total traffic coming from page three. This gives us a better understanding of the potential uplift in visits when improving rankings from positions 11-30.

This highlights that searchers are frequently going beyond the top 10 to find the exact result they want. The prominence of paid advertising, shopping ads, Knowledge Graph and the OneBox may also be pushing users below the fold more often as users attempt to find better qualified results. It may also indicate growing dissatisfaction with Google results, although this is a little harder to quantify.

Of course, it’s important we don’t just rely on one single click curve. Not all searches are equal. What about the influence of brand, mobile and long-tail searches?

Brand bias has a significant influence on CTR

One thing we particularly wanted to explore was how the size of your brand influences the curve. To explore this, we banded each of the domains in our study into small, medium and large categories based on the sum of brand query impressions across the entire duration of the study.

small-medium-large-brand-organic-ctr-730

When we look at how brand bias is influencing CTR for non-branded search queries, we can see that better known brands get a sizable increase in CTR. More importantly, small- to medium-size brands are actually losing out to results from these better-known brands and experience a much lower CTR in comparison.

What is clear is keyphrase strategy will be important for smaller brands in order to gain traction in natural search. Identifying and targeting valuable search queries that aren’t already dominated by major brands will minimize the cannibalization of CTR and ensure higher traffic levels as a result.

How does mobile CTR reflect changing search behavior?

Mobile search has become a huge part of our daily lives, and our clients are seeing a substantial shift in natural search traffic from desktop to mobile devices. According to Google, 30% of all searches made in 2013 were on a mobile device; they also predict mobile searches will constitute over 50% of all searches in 2014.

Understanding CTR from mobile devices will be vital as the mobile search revolution continues. It was interesting to see that the click curve remained very similar to our desktop curve. Despite the lack of screen real estate, searchers are clearly motivated to scroll below the fold and beyond the top 10.

netbooster-mobile-organic-ctr-730px.jpg

NetBooster CTR curves for top 30 organic positions


Position

Desktop CTR

Mobile CTR

Large Brand

Medium Brand

Small Brand
1 19.35% 20.28% 20.84% 13.32% 8.59%
2 15.09% 16.59% 16.25% 9.77% 8.92%
3 11.45% 13.36% 12.61% 7.64% 7.17%
4 8.68% 10.70% 9.91% 5.50% 6.19%
5 7.21% 7.97% 8.08% 4.69% 5.37%
6 5.85% 6.38% 6.55% 4.07% 4.17%
7 4.63% 4.85% 5.20% 3.33% 3.70%
8 3.93% 3.90% 4.40% 2.96% 3.22%
9 3.35% 3.15% 3.76% 2.62% 3.05%
10 2.82% 2.59% 3.13% 2.25% 2.82%
11 3.06% 3.18% 3.59% 2.72% 1.94%
12 2.36% 3.62% 2.93% 1.96% 1.31%
13 2.16% 4.13% 2.78% 1.96% 1.26%
14 1.87% 3.37% 2.52% 1.68% 0.92%
15 1.79% 3.26% 2.43% 1.51% 1.04%
16 1.52% 2.68% 2.02% 1.26% 0.89%
17 1.30% 2.79% 1.67% 1.20% 0.71%
18 1.26% 2.13% 1.59% 1.16% 0.86%
19 1.16% 1.80% 1.43% 1.12% 0.82%
20 1.05% 1.51% 1.36% 0.86% 0.73%
21 0.86% 2.04% 1.15% 0.74% 0.70%
22 0.75% 2.25% 1.02% 0.68% 0.46%
23 0.68% 2.13% 0.91% 0.62% 0.42%
24 0.63% 1.84% 0.81% 0.63% 0.45%
25 0.56% 2.05% 0.71% 0.61% 0.35%
26 0.51% 1.85% 0.59% 0.63% 0.34%
27 0.49% 1.08% 0.74% 0.42% 0.24%
28 0.45% 1.55% 0.58% 0.49% 0.24%
29 0.44% 1.07% 0.51% 0.53% 0.28%
30 0.36% 1.21% 0.47% 0.38% 0.26%

Creating your own click curve

This study will give you a set of benchmarks for both non-branded and branded click-through rates with which you can confidently compare to your own click curve data. Using this data as a comparison will let you understand whether the appearance of your content is working for or against you.

We have made things a little easier for you by creating an Excel spreadsheet: simply drop your own top search query data in and it’ll automatically create a click curve for your website.

Simply visit the NetBooster website and download our tool to start making your own click curve.

In conclusion

It’s been both a fascinating and rewarding study, and we can clearly see a change in search habits. Whatever the reasons for this evolving search behavior, we need to start thinking beyond the top 10, as pages two and three are likely to get more traffic in future. 

 We also need to maximize the traffic created from existing rankings and not just think about position.

Most importantly, we can see practical applications of this data for anyone looking to understand and maximize their content’s performance in natural search. Having the ability to quickly and easily create your own click curve and compare this against a set of benchmarks means you can now understand whether you have an optimal CTR.

What could be the next steps?

There is, however, plenty of scope for improvement. We are looking forward to continuing our investigation, tracking the evolution of search behavior. If you’d like to explore this subject further, here are a few ideas:

  • Segment search queries by intent (How does CTR vary depending on whether a search query is commercial or informational?)
  • Understand CTR by industry or niche
  • Monitor the effect of new Knowledge Graph formats on CTR across both desktop and mobile search
  • Conduct an annual analysis of search behavior (Are people’s search habits changing? Are they clicking on more results? Are they mining further into Google’s results?)

Ultimately, click curves like this will change as the underlying search behavior continues to evolve. We are now seeing a massive shift in the underlying search technology, with Google in particular heavily investing in entity- based search (i.e., the Knowledge Graph). We can expect other search engines, such as Bing, Yandex and Baidu to follow suit and use a similar approach.

The rise of smartphone adoption and constant connectivity also means natural search is becoming more focused on mobile devices. Voice-activated search is also a game-changer, as people start to converse with search engines in a more natural way. This has huge implications for how we monitor search activity.

What is clear is no other industry is changing as rapidly as search. Understanding how we all interact with new forms of search results will be a crucial part of measuring and creating success.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]