Why Effective, Modern SEO Requires Technical, Creative, and Strategic Thinking – Whiteboard Friday

Posted by randfish

There’s no doubt that quite a bit has changed about SEO, and that the field is far more integrated with other aspects of online marketing than it once was. In today’s Whiteboard Friday, Rand pushes back against the idea that effective modern SEO doesn’t require any technical expertise, outlining a fantastic list of technical elements that today’s SEOs need to know about in order to be truly effective.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

Video transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week I’m going to do something unusual. I don’t usually point out these inconsistencies or sort of take issue with other folks’ content on the web, because I generally find that that’s not all that valuable and useful. But I’m going to make an exception here.

There is an article by Jayson DeMers, who I think might actually be here in Seattle — maybe he and I can hang out at some point — called “Why Modern SEO Requires Almost No Technical Expertise.” It was an article that got a shocking amount of traction and attention. On Facebook, it has thousands of shares. On LinkedIn, it did really well. On Twitter, it got a bunch of attention.

Some folks in the SEO world have already pointed out some issues around this. But because of the increasing popularity of this article, and because I think there’s, like, this hopefulness from worlds outside of kind of the hardcore SEO world that are looking to this piece and going, “Look, this is great. We don’t have to be technical. We don’t have to worry about technical things in order to do SEO.”

Look, I completely get the appeal of that. I did want to point out some of the reasons why this is not so accurate. At the same time, I don’t want to rain on Jayson, because I think that it’s very possible he’s writing an article for Entrepreneur, maybe he has sort of a commitment to them. Maybe he had no idea that this article was going to spark so much attention and investment. He does make some good points. I think it’s just really the title and then some of the messages inside there that I take strong issue with, and so I wanted to bring those up.

First off, some of the good points he did bring up.

One, he wisely says, “You don’t need to know how to code or to write and read algorithms in order to do SEO.” I totally agree with that. If today you’re looking at SEO and you’re thinking, “Well, am I going to get more into this subject? Am I going to try investing in SEO? But I don’t even know HTML and CSS yet.”

Those are good skills to have, and they will help you in SEO, but you don’t need them. Jayson’s totally right. You don’t have to have them, and you can learn and pick up some of these things, and do searches, watch some Whiteboard Fridays, check out some guides, and pick up a lot of that stuff later on as you need it in your career. SEO doesn’t have that hard requirement.

And secondly, he makes an intelligent point that we’ve made many times here at Moz, which is that, broadly speaking, a better user experience is well correlated with better rankings.

You make a great website that delivers great user experience, that provides the answers to searchers’ questions and gives them extraordinarily good content, way better than what’s out there already in the search results, generally speaking you’re going to see happy searchers, and that’s going to lead to higher rankings.

But not entirely. There are a lot of other elements that go in here. So I’ll bring up some frustrating points around the piece as well.

First off, there’s no acknowledgment — and I find this a little disturbing — that the ability to read and write code, or even HTML and CSS, which I think are the basic place to start, is helpful or can take your SEO efforts to the next level. I think both of those things are true.

So being able to look at a web page, view source on it, or pull up Firebug in Firefox or something and diagnose what’s going on and then go, “Oh, that’s why Google is not able to see this content. That’s why we’re not ranking for this keyword or term, or why even when I enter this exact sentence in quotes into Google, which is on our page, this is why it’s not bringing it up. It’s because it’s loading it after the page from a remote file that Google can’t access.” These are technical things, and being able to see how that code is built, how it’s structured, and what’s going on there, very, very helpful.

Some coding knowledge also can take your SEO efforts even further. I mean, so many times, SEOs are stymied by the conversations that we have with our programmers and our developers and the technical staff on our teams. When we can have those conversations intelligently, because at least we understand the principles of how an if-then statement works, or what software engineering best practices are being used, or they can upload something into a GitHub repository, and we can take a look at it there, that kind of stuff is really helpful.

Secondly, I don’t like that the article overly reduces all of this information that we have about what we’ve learned about Google. So he mentions two sources. One is things that Google tells us, and others are SEO experiments. I think both of those are true. Although I’d add that there’s sort of a sixth sense of knowledge that we gain over time from looking at many, many search results and kind of having this feel for why things rank, and what might be wrong with a site, and getting really good at that using tools and data as well. There are people who can look at Open Site Explorer and then go, “Aha, I bet this is going to happen.” They can look, and 90% of the time they’re right.

So he boils this down to, one, write quality content, and two, reduce your bounce rate. Neither of those things are wrong. You should write quality content, although I’d argue there are lots of other forms of quality content that aren’t necessarily written — video, images and graphics, podcasts, lots of other stuff.

And secondly, that just doing those two things is not always enough. So you can see, like many, many folks look and go, “I have quality content. It has a low bounce rate. How come I don’t rank better?” Well, your competitors, they’re also going to have quality content with a low bounce rate. That’s not a very high bar.

Also, frustratingly, this really gets in my craw. I don’t think “write quality content” means anything. You tell me. When you hear that, to me that is a totally non-actionable, non-useful phrase that’s a piece of advice that is so generic as to be discardable. So I really wish that there was more substance behind that.

The article also makes, in my opinion, the totally inaccurate claim that modern SEO really is reduced to “the happier your users are when they visit your site, the higher you’re going to rank.”

Wow. Okay. Again, I think broadly these things are correlated. User happiness and rank is broadly correlated, but it’s not a one to one. This is not like a, “Oh, well, that’s a 1.0 correlation.”

I would guess that the correlation is probably closer to like the page authority range. I bet it’s like 0.35 or something correlation. If you were to actually measure this broadly across the web and say like, “Hey, were you happier with result one, two, three, four, or five,” the ordering would not be perfect at all. It probably wouldn’t even be close.

There’s a ton of reasons why sometimes someone who ranks on Page 2 or Page 3 or doesn’t rank at all for a query is doing a better piece of content than the person who does rank well or ranks on Page 1, Position 1.

Then the article suggests five and sort of a half steps to successful modern SEO, which I think is a really incomplete list. So Jayson gives us;

  • Good on-site experience
  • Writing good content
  • Getting others to acknowledge you as an authority
  • Rising in social popularity
  • Earning local relevance
  • Dealing with modern CMS systems (which he notes most modern CMS systems are SEO-friendly)

The thing is there’s nothing actually wrong with any of these. They’re all, generally speaking, correct, either directly or indirectly related to SEO. The one about local relevance, I have some issue with, because he doesn’t note that there’s a separate algorithm for sort of how local SEO is done and how Google ranks local sites in maps and in their local search results. Also not noted is that rising in social popularity won’t necessarily directly help your SEO, although it can have indirect and positive benefits.

I feel like this list is super incomplete. Okay, I brainstormed just off the top of my head in the 10 minutes before we filmed this video a list. The list was so long that, as you can see, I filled up the whole whiteboard and then didn’t have any more room. I’m not going to bother to erase and go try and be absolutely complete.

But there’s a huge, huge number of things that are important, critically important for technical SEO. If you don’t know how to do these things, you are sunk in many cases. You can’t be an effective SEO analyst, or consultant, or in-house team member, because you simply can’t diagnose the potential problems, rectify those potential problems, identify strategies that your competitors are using, be able to diagnose a traffic gain or loss. You have to have these skills in order to do that.

I’ll run through these quickly, but really the idea is just that this list is so huge and so long that I think it’s very, very, very wrong to say technical SEO is behind us. I almost feel like the opposite is true.

We have to be able to understand things like;

  • Content rendering and indexability
  • Crawl structure, internal links, JavaScript, Ajax. If something’s post-loading after the page and Google’s not able to index it, or there are links that are accessible via JavaScript or Ajax, maybe Google can’t necessarily see those or isn’t crawling them as effectively, or is crawling them, but isn’t assigning them as much link weight as they might be assigning other stuff, and you’ve made it tough to link to them externally, and so they can’t crawl it.
  • Disabling crawling and/or indexing of thin or incomplete or non-search-targeted content. We have a bunch of search results pages. Should we use rel=prev/next? Should we robots.txt those out? Should we disallow from crawling with meta robots? Should we rel=canonical them to other pages? Should we exclude them via the protocols inside Google Webmaster Tools, which is now Google Search Console?
  • Managing redirects, domain migrations, content updates. A new piece of content comes out, replacing an old piece of content, what do we do with that old piece of content? What’s the best practice? It varies by different things. We have a whole Whiteboard Friday about the different things that you could do with that. What about a big redirect or a domain migration? You buy another company and you’re redirecting their site to your site. You have to understand things about subdomain structures versus subfolders, which, again, we’ve done another Whiteboard Friday about that.
  • Proper error codes, downtime procedures, and not found pages. If your 404 pages turn out to all be 200 pages, well, now you’ve made a big error there, and Google could be crawling tons of 404 pages that they think are real pages, because you’ve made it a status code 200, or you’ve used a 404 code when you should have used a 410, which is a permanently removed, to be able to get it completely out of the indexes, as opposed to having Google revisit it and keep it in the index.

Downtime procedures. So there’s specifically a… I can’t even remember. It’s a 5xx code that you can use. Maybe it was a 503 or something that you can use that’s like, “Revisit later. We’re having some downtime right now.” Google urges you to use that specific code rather than using a 404, which tells them, “This page is now an error.”

Disney had that problem a while ago, if you guys remember, where they 404ed all their pages during an hour of downtime, and then their homepage, when you searched for Disney World, was, like, “Not found.” Oh, jeez, Disney World, not so good.

  • International and multi-language targeting issues. I won’t go into that. But you have to know the protocols there. Duplicate content, syndication, scrapers. How do we handle all that? Somebody else wants to take our content, put it on their site, what should we do? Someone’s scraping our content. What can we do? We have duplicate content on our own site. What should we do?
  • Diagnosing traffic drops via analytics and metrics. Being able to look at a rankings report, being able to look at analytics connecting those up and trying to see: Why did we go up or down? Did we have less pages being indexed, more pages being indexed, more pages getting traffic less, more keywords less?
  • Understanding advanced search parameters. Today, just today, I was checking out the related parameter in Google, which is fascinating for most sites. Well, for Moz, weirdly, related:oursite.com shows nothing. But for virtually every other sit, well, most other sites on the web, it does show some really interesting data, and you can see how Google is connecting up, essentially, intentions and topics from different sites and pages, which can be fascinating, could expose opportunities for links, could expose understanding of how they view your site versus your competition or who they think your competition is.

Then there are tons of parameters, like in URL and in anchor, and da, da, da, da. In anchor doesn’t work anymore, never mind about that one.

I have to go faster, because we’re just going to run out of these. Like, come on. Interpreting and leveraging data in Google Search Console. If you don’t know how to use that, Google could be telling you, you have all sorts of errors, and you don’t know what they are.

  • Leveraging topic modeling and extraction. Using all these cool tools that are coming out for better keyword research and better on-page targeting. I talked about a couple of those at MozCon, like MonkeyLearn. There’s the new Moz Context API, which will be coming out soon, around that. There’s the Alchemy API, which a lot of folks really like and use.
  • Identifying and extracting opportunities based on site crawls. You run a Screaming Frog crawl on your site and you’re going, “Oh, here’s all these problems and issues.” If you don’t have these technical skills, you can’t diagnose that. You can’t figure out what’s wrong. You can’t figure out what needs fixing, what needs addressing.
  • Using rich snippet format to stand out in the SERPs. This is just getting a better click-through rate, which can seriously help your site and obviously your traffic.
  • Applying Google-supported protocols like rel=canonical, meta description, rel=prev/next, hreflang, robots.txt, meta robots, x robots, NOODP, XML sitemaps, rel=nofollow. The list goes on and on and on. If you’re not technical, you don’t know what those are, you think you just need to write good content and lower your bounce rate, it’s not going to work.
  • Using APIs from services like AdWords or MozScape, or hrefs from Majestic, or SEM refs from SearchScape or Alchemy API. Those APIs can have powerful things that they can do for your site. There are some powerful problems they could help you solve if you know how to use them. It’s actually not that hard to write something, even inside a Google Doc or Excel, to pull from an API and get some data in there. There’s a bunch of good tutorials out there. Richard Baxter has one, Annie Cushing has one, I think Distilled has some. So really cool stuff there.
  • Diagnosing page load speed issues, which goes right to what Jayson was talking about. You need that fast-loading page. Well, if you don’t have any technical skills, you can’t figure out why your page might not be loading quickly.
  • Diagnosing mobile friendliness issues
  • Advising app developers on the new protocols around App deep linking, so that you can get the content from your mobile apps into the web search results on mobile devices. Awesome. Super powerful. Potentially crazy powerful, as mobile search is becoming bigger than desktop.

Okay, I’m going to take a deep breath and relax. I don’t know Jayson’s intention, and in fact, if he were in this room, he’d be like, “No, I totally agree with all those things. I wrote the article in a rush. I had no idea it was going to be big. I was just trying to make the broader points around you don’t have to be a coder in order to do SEO.” That’s completely fine.

So I’m not going to try and rain criticism down on him. But I think if you’re reading that article, or you’re seeing it in your feed, or your clients are, or your boss is, or other folks are in your world, maybe you can point them to this Whiteboard Friday and let them know, no, that’s not quite right. There’s a ton of technical SEO that is required in 2015 and will be for years to come, I think, that SEOs have to have in order to be effective at their jobs.

All right, everyone. Look forward to some great comments, and we’ll see you again next time for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

The Linkbait Bump: How Viral Content Creates Long-Term Lift in Organic Traffic – Whiteboard Friday

Posted by randfish

A single fantastic (or “10x”) piece of content can lift a site’s traffic curves long beyond the popularity of that one piece. In today’s Whiteboard Friday, Rand talks about why those curves settle into a “new normal,” and how you can go about creating the content that drives that change.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re chatting about the linkbait bump, classic phrase in the SEO world and almost a little dated. I think today we’re talking a little bit more about viral content and how high-quality content, content that really is the cornerstone of a brand or a website’s content can be an incredible and powerful driver of traffic, not just when it initially launches but over time.

So let’s take a look.

This is a classic linkbait bump, viral content bump analytics chart. I’m seeing over here my traffic and over here the different months of the year. You know, January, February, March, like I’m under a thousand. Maybe I’m at 500 visits or something, and then I have this big piece of viral content. It performs outstandingly well from a relative standpoint for my site. It gets 10,000 or more visits, drives a ton more people to my site, and then what happens is that that traffic falls back down. But the new normal down here, new normal is higher than the old normal was. So the new normal might be at 1,000, 1,500 or 2,000 visits whereas before I was at 500.

Why does this happen?

A lot of folks see an analytics chart like this, see examples of content that’s done this for websites, and they want to know: Why does this happen and how can I replicate that effect? The reasons why are it sort of feeds back into that viral loop or the flywheel, which we’ve talked about in previous Whiteboard Fridays, where essentially you start with a piece of content. That content does well, and then you have things like more social followers on your brand’s accounts. So now next time you go to amplify content or share content socially, you’re reaching more potential people. You have a bigger audience. You have more people who share your content because they’ve seen that that content performs well for them in social. So they want to find other content from you that might help their social accounts perform well.

You see more RSS and email subscribers because people see your interesting content and go, “Hey, I want to see when these guys produce something else.” You see more branded search traffic because people are looking specifically for content from you, not necessarily just around this viral piece, although that’s often a big part of it, but around other pieces as well, especially if you do a good job of exposing them to that additional content. You get more bookmark and type in traffic, more searchers biased by personalization because they’ve already visited your site. So now when they search and they’re logged into their accounts, they’re going to see your site ranking higher than they normally would otherwise, and you get an organic SEO lift from all the links and shares and engagement.

So there’s a ton of different factors that feed into this, and you kind of want to hit all of these things. If you have a piece of content that gets a lot of shares, a lot of links, but then doesn’t promote engagement, doesn’t get more people signing up, doesn’t get more people searching for your brand or searching for that content specifically, then it’s not going to have the same impact. Your traffic might fall further and more quickly.

How do you achieve this?

How do we get content that’s going to do this? Well, we’re going to talk through a number of things that we’ve talked about previously on Whiteboard Friday. But there are some additional ones as well. This isn’t just creating good content or creating high quality content, it’s creating a particular kind of content. So for this what you want is a deep understanding, not necessarily of what your standard users or standard customers are interested in, but a deep understanding of what influencers in your niche will share and promote and why they do that.

This often means that you follow a lot of sharers and influencers in your field, and you understand, hey, they’re all sharing X piece of content. Why? Oh, because it does this, because it makes them look good, because it helps their authority in the field, because it provides a lot of value to their followers, because they know it’s going to get a lot of retweets and shares and traffic. Whatever that because is, you have to have a deep understanding of it in order to have success with viral kinds of content.

Next, you want to have empathy for users and what will give them the best possible experience. So if you know, for example, that a lot of people are coming on mobile and are going to be sharing on mobile, which is true of almost all viral content today, FYI, you need to be providing a great mobile and desktop experience. Oftentimes that mobile experience has to be different, not just responsive design, but actually a different format, a different way of being able to scroll through or watch or see or experience that content.

There are some good examples out there of content that does that. It makes a very different user experience based on the browser or the device you’re using.

You also need to be aware of what will turn them off. So promotional messages, pop-ups, trying to sell to them, oftentimes that diminishes user experience. It means that content that could have been more viral, that could have gotten more shares won’t.

Unique value and attributes that separate your content from everything else in the field. So if there’s like ABCD and whoa, what’s that? That’s very unique. That stands out from the crowd. That provides a different form of value in a different way than what everyone else is doing. That uniqueness is often a big reason why content spreads virally, why it gets more shared than just the normal stuff.

I’ve talk about this a number of times, but content that’s 10X better than what the competition provides. So unique value from the competition, but also quality that is not just a step up, but 10X better, massively, massively better than what else you can get out there. That makes it unique enough. That makes it stand out from the crowd, and that’s a very hard thing to do, but that’s why this is so rare and so valuable.

This is a critical one, and I think one that, I’ll just say, many organizations fail at. That is the freedom and support to fail many times, to try to create these types of effects, to have this impact many times before you hit on a success. A lot of managers and clients and teams and execs just don’t give marketing teams and content teams the freedom to say, “Yeah, you know what? You spent a month and developer resources and designer resources and spent some money to go do some research and contracted with this third party, and it wasn’t a hit. It didn’t work. We didn’t get the viral content bump. It just kind of did okay. You know what? We believe in you. You’ve got a lot of chances. You should try this another 9 or 10 times before we throw it out. We really want to have a success here.”

That is something that very few teams invest in. The powerful thing is because so few people are willing to invest that way, the ones that do, the ones that believe in this, the ones that invest long term, the ones that are willing to take those failures are going to have a much better shot at success, and they can stand out from the crowd. They can get these bumps. It’s powerful.

Not a requirement, but it really, really helps to have a strong engaged community, either on your site and around your brand, or at least in your niche and your topic area that will help, that wants to see you, your brand, your content succeed. If you’re in a space that has no community, I would work on building one, even if it’s very small. We’re not talking about building a community of thousands or tens of thousands. A community of 100 people, a community of 50 people even can be powerful enough to help content get that catalyst, that first bump that’ll boost it into viral potential.

Then finally, for this type of content, you need to have a logical and not overly promotional match between your brand and the content itself. You can see many sites in what I call sketchy niches. So like a criminal law site or a casino site or a pharmaceutical site that’s offering like an interactive musical experience widget, and you’re like, “Why in the world is this brand promoting this content? Why did they even make it? How does that match up with what they do? Oh, it’s clearly just intentionally promotional.”

Look, many of these brands go out there and they say, “Hey, the average web user doesn’t know and doesn’t care.” I agree. But the average web user is not an influencer. Influencers know. Well, they’re very, very suspicious of why content is being produced and promoted, and they’re very skeptical of promoting content that they don’t think is altruistic. So this kills a lot of content for brands that try and invest in it when there’s no match. So I think you really need that.

Now, when you do these linkbait bump kinds of things, I would strongly recommend that you follow up, that you consider the quality of the content that you’re producing. Thereafter, that you invest in reproducing these resources, keeping those resources updated, and that you don’t simply give up on content production after this. However, if you’re a small business site, a small or medium business, you might think about only doing one or two of these a year. If you are a heavy content player, you’re doing a lot of content marketing, content marketing is how you’re investing in web traffic, I’d probably be considering these weekly or monthly at the least.

All right, everyone. Look forward to your experiences with the linkbait bump, and I will see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

Controlling Search Engine Crawlers for Better Indexation and Rankings – Whiteboard Friday

Posted by randfish

When should you disallow search engines in your robots.txt file, and when should you use meta robots tags in a page header? What about nofollowing links? In today’s Whiteboard Friday, Rand covers these tools and their appropriate use in four situations that SEOs commonly find themselves facing.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

Video transcription

Howdy Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re going to talk about controlling search engine crawlers, blocking bots, sending bots where we want, restricting them from where we don’t want them to go. We’re going to talk a little bit about crawl budget and what you should and shouldn’t have indexed.

As a start, what I want to do is discuss the ways in which we can control robots. Those include the three primary ones: robots.txt, meta robots, and—well, the nofollow tag is a little bit less about controlling bots.

There are a few others that we’re going to discuss as well, including Webmaster Tools (Search Console) and URL status codes. But let’s dive into those first few first.

Robots.txt lives at yoursite.com/robots.txt, it tells crawlers what they should and shouldn’t access, it doesn’t always get respected by Google and Bing. So a lot of folks when you say, “hey, disallow this,” and then you suddenly see those URLs popping up and you’re wondering what’s going on, look—Google and Bing oftentimes think that they just know better. They think that maybe you’ve made a mistake, they think “hey, there’s a lot of links pointing to this content, there’s a lot of people who are visiting and caring about this content, maybe you didn’t intend for us to block it.” The more specific you get about an individual URL, the better they usually are about respecting it. The less specific, meaning the more you use wildcards or say “everything behind this entire big directory,” the worse they are about necessarily believing you.

Meta robots—a little different—that lives in the headers of individual pages, so you can only control a single page with a meta robots tag. That tells the engines whether or not they should keep a page in the index, and whether they should follow the links on that page, and it’s usually a lot more respected, because it’s at an individual-page level; Google and Bing tend to believe you about the meta robots tag.

And then the nofollow tag, that lives on an individual link on a page. It doesn’t tell engines where to crawl or not to crawl. All it’s saying is whether you editorially vouch for a page that is being linked to, and whether you want to pass the PageRank and link equity metrics to that page.

Interesting point about meta robots and robots.txt working together (or not working together so well)—many, many folks in the SEO world do this and then get frustrated.

What if, for example, we take a page like “blogtest.html” on our domain and we say “all user agents, you are not allowed to crawl blogtest.html. Okay—that’s a good way to keep that page away from being crawled, but just because something is not crawled doesn’t necessarily mean it won’t be in the search results.

So then we have our SEO folks go, “you know what, let’s make doubly sure that doesn’t show up in search results; we’ll put in the meta robots tag:”

<meta name="robots" content="noindex, follow">

So, “noindex, follow” tells the search engine crawler they can follow the links on the page, but they shouldn’t index this particular one.

Then, you go and run a search for “blog test” in this case, and everybody on the team’s like “What the heck!? WTF? Why am I seeing this page show up in search results?”

The answer is, you told the engines that they couldn’t crawl the page, so they didn’t. But they are still putting it in the results. They’re actually probably not going to include a meta description; they might have something like “we can’t include a meta description because of this site’s robots.txt file.” The reason it’s showing up is because they can’t see the noindex; all they see is the disallow.

So, if you want something truly removed, unable to be seen in search results, you can’t just disallow a crawler. You have to say meta “noindex” and you have to let them crawl it.

So this creates some complications. Robots.txt can be great if we’re trying to save crawl bandwidth, but it isn’t necessarily ideal for preventing a page from being shown in the search results. I would not recommend, by the way, that you do what we think Twitter recently tried to do, where they tried to canonicalize www and non-www by saying “Google, don’t crawl the www version of twitter.com.” What you should be doing is rel canonical-ing or using a 301.

Meta robots—that can allow crawling and link-following while disallowing indexation, which is great, but it requires crawl budget and you can still conserve indexing.

The nofollow tag, generally speaking, is not particularly useful for controlling bots or conserving indexation.

Webmaster Tools (now Google Search Console) has some special things that allow you to restrict access or remove a result from the search results. For example, if you have 404’d something or if you’ve told them not to crawl something but it’s still showing up in there, you can manually say “don’t do that.” There are a few other crawl protocol things that you can do.

And then URL status codes—these are a valid way to do things, but they’re going to obviously change what’s going on on your pages, too.

If you’re not having a lot of luck using a 404 to remove something, you can use a 410 to permanently remove something from the index. Just be aware that once you use a 410, it can take a long time if you want to get that page re-crawled or re-indexed, and you want to tell the search engines “it’s back!” 410 is permanent removal.

301—permanent redirect, we’ve talked about those here—and 302, temporary redirect.

Now let’s jump into a few specific use cases of “what kinds of content should and shouldn’t I allow engines to crawl and index” in this next version…

[Rand moves at superhuman speed to erase the board and draw part two of this Whiteboard Friday. Seriously, we showed Roger how fast it was, and even he was impressed.]

Four crawling/indexing problems to solve

So we’ve got these four big problems that I want to talk about as they relate to crawling and indexing.

1. Content that isn’t ready yet

The first one here is around, “If I have content of quality I’m still trying to improve—it’s not yet ready for primetime, it’s not ready for Google, maybe I have a bunch of products and I only have the descriptions from the manufacturer and I need people to be able to access them, so I’m rewriting the content and creating unique value on those pages… they’re just not ready yet—what should I do with those?”

My options around crawling and indexing? If I have a large quantity of those—maybe thousands, tens of thousands, hundreds of thousands—I would probably go the robots.txt route. I’d disallow those pages from being crawled, and then eventually as I get (folder by folder) those sets of URLs ready, I can then allow crawling and maybe even submit them to Google via an XML sitemap.

If I’m talking about a small quantity—a few dozen, a few hundred pages—well, I’d probably just use the meta robots noindex, and then I’d pull that noindex off of those pages as they are made ready for Google’s consumption. And then again, I would probably use the XML sitemap and start submitting those once they’re ready.

2. Dealing with duplicate or thin content

What about, “Should I noindex, nofollow, or potentially disallow crawling on largely duplicate URLs or thin content?” I’ve got an example. Let’s say I’m an ecommerce shop, I’m selling this nice Star Wars t-shirt which I think is kind of hilarious, so I’ve got starwarsshirt.html, and it links out to a larger version of an image, and that’s an individual HTML page. It links out to different colors, which change the URL of the page, so I have a gray, blue, and black version. Well, these four pages are really all part of this same one, so I wouldn’t recommend disallowing crawling on these, and I wouldn’t recommend noindexing them. What I would do there is a rel canonical.

Remember, rel canonical is one of those things that can be precluded by disallowing. So, if I were to disallow these from being crawled, Google couldn’t see the rel canonical back, so if someone linked to the blue version instead of the default version, now I potentially don’t get link credit for that. So what I really want to do is use the rel canonical, allow the indexing, and allow it to be crawled. If you really feel like it, you could also put a meta “noindex, follow” on these pages, but I don’t really think that’s necessary, and again that might interfere with the rel canonical.

3. Passing link equity without appearing in search results

Number three: “If I want to pass link equity (or at least crawling) through a set of pages without those pages actually appearing in search results—so maybe I have navigational stuff, ways that humans are going to navigate through my pages, but I don’t need those appearing in search results—what should I use then?”

What I would say here is, you can use the meta robots to say “don’t index the page, but do follow the links that are on that page.” That’s a pretty nice, handy use case for that.

Do NOT, however, disallow those in robots.txt—many, many folks make this mistake. What happens if you disallow crawling on those, Google can’t see the noindex. They don’t know that they can follow it. Granted, as we talked about before, sometimes Google doesn’t obey the robots.txt, but you can’t rely on that behavior. Trust that the disallow in robots.txt will prevent them from crawling. So I would say, the meta robots “noindex, follow” is the way to do this.

4. Search results-type pages

Finally, fourth, “What should I do with search results-type pages?” Google has said many times that they don’t like your search results from your own internal engine appearing in their search results, and so this can be a tricky use case.

Sometimes a search result page—a page that lists many types of results that might come from a database of types of content that you’ve got on your site—could actually be a very good result for a searcher who is looking for a wide variety of content, or who wants to see what you have on offer. Yelp does this: When you say, “I’m looking for restaurants in Seattle, WA,” they’ll give you what is essentially a list of search results, and Google does want those to appear because that page provides a great result. But you should be doing what Yelp does there, and make the most common or popular individual sets of those search results into category-style pages. A page that provides real, unique value, that’s not just a list of search results, that is more of a landing page than a search results page.

However, that being said, if you’ve got a long tail of these, or if you’d say “hey, our internal search engine, that’s really for internal visitors only—it’s not useful to have those pages show up in search results, and we don’t think we need to make the effort to make those into category landing pages.” Then you can use the disallow in robots.txt to prevent those.

Just be cautious here, because I have sometimes seen an over-swinging of the pendulum toward blocking all types of search results, and sometimes that can actually hurt your SEO and your traffic. Sometimes those pages can be really useful to people. So check your analytics, and make sure those aren’t valuable pages that should be served up and turned into landing pages. If you’re sure, then go ahead and disallow all your search results-style pages. You’ll see a lot of sites doing this in their robots.txt file.

That being said, I hope you have some great questions about crawling and indexing, controlling robots, blocking robots, allowing robots, and I’ll try and tackle those in the comments below.

We’ll look forward to seeing you again next week for another edition of Whiteboard Friday. Take care!

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

UX, Content Quality, and SEO – Whiteboard Friday

Posted by EricEnge

Editor’s note: Today we’re featuring back-to-back episodes of Whiteboard Friday from our friends at Stone Temple Consulting. Make sure to also check out the first episode, “Becoming Better SEO Scientists” from Mark Traphagen.

User experience and the quality of your content have an incredibly broad impact on your SEO efforts. In this episode of Whiteboard Friday, Stone Temple’s Eric Enge shows you how paying attention to your users can benefit your position in the SERPs.

For reference, here’s a still of this week’s whiteboard.
Click on it to open a high resolution image in a new tab!

Video transcription

Hi, Mozzers. I’m Eric Enge, CEO of Stone Temple Consulting. Today I want to talk to you about one of the most underappreciated aspects of SEO, and that is the interaction between user experience, content quality, and your SEO rankings and traffic.

I’m going to take you through a little history first. You know, we all know about the Panda algorithm update that came out in February 23, 2011, and of course more recently we have the search quality update that came out in May 19, 2015. Our Panda friend had 27 different updates that we know of along the way. So a lot of stuff has gone on, but we need to realize that that is not where it all started.

The link algorithm from the very beginning was about search quality. Links allowed Google to have an algorithm that gave better results than the other search engines of their day, which were dependent on keywords. These things however, that I’ve just talked about, are still just the tip of the iceberg. Google goes a lot deeper than that, and I want to walk you through the different things that it does.

So consider for a moment, you have someone search on the phrase “men’s shoes” and they come to your website.

What is that they want when they come to your website? Do they want sneakers, sandals, dress shoes? Well, those are sort of the obvious things that they might want. But you need to think a little bit more about what the user really wants to be able to know before they buy from you.

First of all, there has to be a way to buy. By the way, affiliate sites don’t have ways to buy. So the line of thinking I’m talking about might not work out so well for affiliate sites and works better for people who can actually sell the product directly. But in addition to a way to buy, they might want a privacy policy. They might want to see an About Us page. They might want to be able to see your phone number. These are all different kinds of things that users look for when they arrive on the pages of your site.

So as we think about this, what is it that we can do to do a better job with our websites? Well, first of all, lose the focus on keywords. Don’t get me wrong, keywords haven’t gone entirely away. But the pages where we overemphasize one particular keyword over another or related phrases are long gone, and you need to have a broader focus on how you approach things.

User experience is now a big deal. You really need to think about how users are interacting with your page and how that shows your overall page quality. Think about the percent satisfaction. If I send a hundred users to your page from my search engine, how many of those users are going to be happy with the content or the products or everything that they see with your page? You need to think through the big picture. So at the end of the day, this impacts the content on your page to be sure, but a lot more than that it impacts the design, related items that you have on the page.

So let me just give you an example of that. I looked at one page recently that was for a flower site. It was a page about annuals on that site, and that page had no link to their perennials page. Well, okay, a fairly good percentage of people who arrive on a page about annuals are also going to want to have perennials as something they might consider buying. So that page was probably coming across as a poor user experience. So these related items concepts are incredibly important.

Then the links to your page is actually a way to get to some of those related items, and so those are really important as well. What are the related products that you link to?

Finally, really it impacts everything you do with your page design. You need to move past the old-fashioned way of thinking about SEO and into the era of: How am I doing with satisfying all the people who come to the pages of your site?

Thank you, Mozzers. Have a great day.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

Why We Can’t Do Keyword Research Like It’s 2010 – Whiteboard Friday

Posted by randfish

Keyword Research is a very different field than it was just five years ago, and if we don’t keep up with the times we might end up doing more harm than good. From the research itself to the selection and targeting process, in today’s Whiteboard Friday Rand explains what has changed and what we all need to do to conduct effective keyword research today.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

What do we need to change to keep up with the changing world of keyword research?

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re going to chat a little bit about keyword research, why it’s changed from the last five, six years and what we need to do differently now that things have changed. So I want to talk about changing up not just the research but also the selection and targeting process.

There are three big areas that I’ll cover here. There’s lots more in-depth stuff, but I think we should start with these three.

1) The Adwords keyword tool hides data!

This is where almost all of us in the SEO world start and oftentimes end with our keyword research. We go to AdWords Keyword Tool, what used to be the external keyword tool and now is inside AdWords Ad Planner. We go inside that tool, and we look at the volume that’s reported and we sort of record that as, well, it’s not good, but it’s the best we’re going to do.

However, I think there are a few things to consider here. First off, that tool is hiding data. What I mean by that is not that they’re not telling the truth, but they’re not telling the whole truth. They’re not telling nothing but the truth, because those rounded off numbers that you always see, you know that those are inaccurate. Anytime you’ve bought keywords, you’ve seen that the impression count never matches the count that you see in the AdWords tool. It’s not usually massively off, but it’s often off by a good degree, and the only thing it’s great for is telling relative volume from one from another.

But because AdWords hides data essentially by saying like, “Hey, you’re going to type in . . .” Let’s say I’m going to type in “college tuition,” and Google knows that a lot of people search for how to reduce college tuition, but that doesn’t come up in the suggestions because it’s not a commercial term, or they don’t think that an advertiser who bids on that is going to do particularly well and so they don’t show it in there. I’m giving an example. They might indeed show that one.

But because that data is hidden, we need to go deeper. We need to go beyond and look at things like Google Suggest and related searches, which are down at the bottom. We need to start conducting customer interviews and staff interviews, which hopefully has always been part of your brainstorming process but really needs to be now. Then you can apply that to AdWords. You can apply that to suggest and related.

The beautiful thing is once you get these tools from places like visiting forums or communities, discussion boards and seeing what terms and phrases people are using, you can collect all this stuff up, plug it back into AdWords, and now they will tell you how much volume they’ve got. So you take that how to lower college tuition term, you plug it into AdWords, they will show you a number, a non-zero number. They were just hiding it in the suggestions because they thought, “Hey, you probably don’t want to bid on that. That won’t bring you a good ROI.” So you’ve got to be careful with that, especially when it comes to SEO kinds of keyword research.

2) Building separate pages for each term or phrase doesn’t make sense

It used to be the case that we built separate pages for every single term and phrase that was in there, because we wanted to have the maximum keyword targeting that we could. So it didn’t matter to us that college scholarship and university scholarships were essentially people looking for exactly the same thing, just using different terminology. We would make one page for one and one page for the other. That’s not the case anymore.

Today, we need to group by the same searcher intent. If two searchers are searching for two different terms or phrases but both of them have exactly the same intent, they want the same information, they’re looking for the same answers, their query is going to be resolved by the same content, we want one page to serve those, and that’s changed up a little bit of how we’ve done keyword research and how we do selection and targeting as well.

3) Build your keyword consideration and prioritization spreadsheet with the right metrics

Everybody’s got an Excel version of this, because I think there’s just no awesome tool out there that everyone loves yet that kind of solves this problem for us, and Excel is very, very flexible. So we go into Excel, we put in our keyword, the volume, and then a lot of times we almost stop there. We did keyword volume and then like value to the business and then we prioritize.

What are all these new columns you’re showing me, Rand? Well, here I think is how sophisticated, modern SEOs that I’m seeing in the more advanced agencies, the more advanced in-house practitioners, this is what I’m seeing them add to the keyword process.

Difficulty

A lot of folks have done this, but difficulty helps us say, “Hey, this has a lot of volume, but it’s going to be tremendously hard to rank.”

The difficulty score that Moz uses and attempts to calculate is a weighted average of the top 10 domain authorities. It also uses page authority, so it’s kind of a weighted stack out of the two. If you’re seeing very, very challenging pages, very challenging domains to get in there, it’s going to be super hard to rank against them. The difficulty is high. For all of these ones it’s going to be high because college and university terms are just incredibly lucrative.

That difficulty can help bias you against chasing after terms and phrases for which you are very unlikely to rank for at least early on. If you feel like, “Hey, I already have a powerful domain. I can rank for everything I want. I am the thousand pound gorilla in my space,” great. Go after the difficulty of your choice, but this helps prioritize.

Opportunity

This is actually very rarely used, but I think sophisticated marketers are using it extremely intelligently. Essentially what they’re saying is, “Hey, if you look at a set of search results, sometimes there are two or three ads at the top instead of just the ones on the sidebar, and that’s biasing some of the click-through rate curve.” Sometimes there’s an instant answer or a Knowledge Graph or a news box or images or video, or all these kinds of things that search results can be marked up with, that are not just the classic 10 web results. Unfortunately, if you’re building a spreadsheet like this and treating every single search result like it’s just 10 blue links, well you’re going to lose out. You’re missing the potential opportunity and the opportunity cost that comes with ads at the top or all of these kinds of features that will bias the click-through rate curve.

So what I’ve seen some really smart marketers do is essentially build some kind of a framework to say, “Hey, you know what? When we see that there’s a top ad and an instant answer, we’re saying the opportunity if I was ranking number 1 is not 10 out of 10. I don’t expect to get whatever the average traffic for the number 1 position is. I expect to get something considerably less than that. Maybe something around 60% of that, because of this instant answer and these top ads.” So I’m going to mark this opportunity as a 6 out of 10.

There are 2 top ads here, so I’m giving this a 7 out of 10. This has two top ads and then it has a news block below the first position. So again, I’m going to reduce that click-through rate. I think that’s going down to a 6 out of 10.

You can get more and less scientific and specific with this. Click-through rate curves are imperfect by nature because we truly can’t measure exactly how those things change. However, I think smart marketers can make some good assumptions from general click-through rate data, which there are several resources out there on that to build a model like this and then include it in their keyword research.

This does mean that you have to run a query for every keyword you’re thinking about, but you should be doing that anyway. You want to get a good look at who’s ranking in those search results and what kind of content they’re building . If you’re running a keyword difficulty tool, you are already getting something like that.

Business value

This is a classic one. Business value is essentially saying, “What’s it worth to us if visitors come through with this search term?” You can get that from bidding through AdWords. That’s the most sort of scientific, mathematically sound way to get it. Then, of course, you can also get it through your own intuition. It’s better to start with your intuition than nothing if you don’t already have AdWords data or you haven’t started bidding, and then you can refine your sort of estimate over time as you see search visitors visit the pages that are ranking, as you potentially buy those ads, and those kinds of things.

You can get more sophisticated around this. I think a 10 point scale is just fine. You could also use a one, two, or three there, that’s also fine.

Requirements or Options

Then I don’t exactly know what to call this column. I can’t remember the person who’ve showed me theirs that had it in there. I think they called it Optional Data or Additional SERPs Data, but I’m going to call it Requirements or Options. Requirements because this is essentially saying, “Hey, if I want to rank in these search results, am I seeing that the top two or three are all video? Oh, they’re all video. They’re all coming from YouTube. If I want to be in there, I’ve got to be video.”

Or something like, “Hey, I’m seeing that most of the top results have been produced or updated in the last six months. Google appears to be biasing to very fresh information here.” So, for example, if I were searching for “university scholarships Cambridge 2015,” well, guess what? Google probably wants to bias to show results that have been either from the official page on Cambridge’s website or articles from this year about getting into that university and the scholarships that are available or offered. I saw those in two of these search results, both the college and university scholarships had a significant number of the SERPs where a fresh bump appeared to be required. You can see that a lot because the date will be shown ahead of the description, and the date will be very fresh, sometime in the last six months or a year.

Prioritization

Then finally I can build my prioritization. So based on all the data I had here, I essentially said, “Hey, you know what? These are not 1 and 2. This is actually 1A and 1B, because these are the same concepts. I’m going to build a single page to target both of those keyword phrases.” I think that makes good sense. Someone who is looking for college scholarships, university scholarships, same intent.

I am giving it a slight prioritization, 1A versus 1B, and the reason I do this is because I always have one keyword phrase that I’m leaning on a little more heavily. Because Google isn’t perfect around this, the search results will be a little different. I want to bias to one versus the other. In this case, my title tag, since I more targeting university over college, I might say something like college and university scholarships so that university and scholarships are nicely together, near the front of the title, that kind of thing. Then 1B, 2, 3.

This is kind of the way that modern SEOs are building a more sophisticated process with better data, more inclusive data that helps them select the right kinds of keywords and prioritize to the right ones. I’m sure you guys have built some awesome stuff. The Moz community is filled with very advanced marketers, probably plenty of you who’ve done even more than this.

I look forward to hearing from you in the comments. I would love to chat more about this topic, and we’ll see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

How to Use Server Log Analysis for Technical SEO

Posted by SamuelScott

It’s ten o’clock. Do you know where your logs are?

I’m introducing this guide with a pun on a common public-service announcement that has run on late-night TV news broadcasts in the United States because log analysis is something that is extremely newsworthy and important.

If your technical and on-page SEO is poor, then nothing else that you do will matter. Technical SEO is the key to helping search engines to crawl, parse, and index websites, and thereby rank them appropriately long before any marketing work begins.

The important thing to remember: Your log files contain the only data that is 100% accurate in terms of how search engines are crawling your website. By helping Google to do its job, you will set the stage for your future SEO work and make your job easier. Log analysis is one facet of technical SEO, and correcting the problems found in your logs will help to lead to higher rankings, more traffic, and more conversions and sales.

Here are just a few reasons why:

  • Too many response code errors may cause Google to reduce its crawling of your website and perhaps even your rankings.
  • You want to make sure that search engines are crawling everything, new and old, that you want to appear and rank in the SERPs (and nothing else).
  • It’s crucial to ensure that all URL redirections will pass along any incoming “link juice.”

However, log analysis is something that is unfortunately discussed all too rarely in SEO circles. So, here, I wanted to give the Moz community an introductory guide to log analytics that I hope will help. If you have any questions, feel free to ask in the comments!

What is a log file?

Computer servers, operating systems, network devices, and computer applications automatically generate something called a log entry whenever they perform an action. In a SEO and digital marketing context, one type of action is whenever a page is requested by a visiting bot or human.

Server log entries are specifically programmed to be output in the Common Log Format of the W3C consortium. Here is one example from Wikipedia with my accompanying explanations:

127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
  • 127.0.0.1 — The remote hostname. An IP address is shown, like in this example, whenever the DNS hostname is not available or DNSLookup is turned off.
  • user-identifier — The remote logname / RFC 1413 identity of the user. (It’s not that important.)
  • frank — The user ID of the person requesting the page. Based on what I see in my Moz profile, Moz’s log entries would probably show either “SamuelScott” or “392388” whenever I visit a page after having logged in.
  • [10/Oct/2000:13:55:36 -0700] — The date, time, and timezone of the action in question in strftime format.
  • GET /apache_pb.gif HTTP/1.0 — “GET” is one of the two commands (the other is “POST”) that can be performed. “GET” fetches a URL while “POST” is submitting something (such as a forum comment). The second part is the URL that is being accessed, and the last part is the version of HTTP that is being accessed.
  • 200 — The status code of the document that was returned.
  • 2326 — The size, in bytes, of the document that was returned.

Note: A hyphen is shown in a field when that information is unavailable.

Every single time that you — or the Googlebot — visit a page on a website, a line with this information is output, recorded, and stored by the server.

Log entries are generated continuously and anywhere from several to thousands can be created every second — depending on the level of a given server, network, or application’s activity. A collection of log entries is called a log file (or often in slang, “the log” or “the logs”), and it is displayed with the most-recent log entry at the bottom. Individual log files often contain a calendar day’s worth of log entries.

Accessing your log files

Different types of servers store and manage their log files differently. Here are the general guides to finding and managing log data on three of the most-popular types of servers:

What is log analysis?

Log analysis (or log analytics) is the process of going through log files to learn something from the data. Some common reasons include:

  • Development and quality assurance (QA) — Creating a program or application and checking for problematic bugs to make sure that it functions properly
  • Network troubleshooting — Responding to and fixing system errors in a network
  • Customer service — Determining what happened when a customer had a problem with a technical product
  • Security issues — Investigating incidents of hacking and other intrusions
  • Compliance matters — Gathering information in response to corporate or government policies
  • Technical SEO — This is my favorite! More on that in a bit.

Log analysis is rarely performed regularly. Usually, people go into log files only in response to something — a bug, a hack, a subpoena, an error, or a malfunction. It’s not something that anyone wants to do on an ongoing basis.

Why? This is a screenshot of ours of just a very small part of an original (unstructured) log file:

Ouch. If a website gets 10,000 visitors who each go to ten pages per day, then the server will create a log file every day that will consist of 100,000 log entries. No one has the time to go through all of that manually.

How to do log analysis

There are three general ways to make log analysis easier in SEO or any other context:

  • Do-it-yourself in Excel
  • Proprietary software such as Splunk or Sumo-logic
  • The ELK Stack open-source software

Tim Resnik’s Moz essay from a few years ago walks you through the process of exporting a batch of log files into Excel. This is a (relatively) quick and easy way to do simple log analysis, but the downside is that one will see only a snapshot in time and not any overall trends. To obtain the best data, it’s crucial to use either proprietary tools or the ELK Stack.

Splunk and Sumo-Logic are proprietary log analysis tools that are primarily used by enterprise companies. The ELK Stack is a free and open-source batch of three platforms (Elasticsearch, Logstash, and Kibana) that is owned by Elastic and used more often by smaller businesses. (Disclosure: We at Logz.io use the ELK Stack to monitor our own internal systems as well as for the basis of our own log management software.)

For those who are interested in using this process to do technical SEO analysis, monitor system or application performance, or for any other reason, our CEO, Tomer Levy, has written a guide to deploying the ELK Stack.

Technical SEO insights in log data

However you choose to access and understand your log data, there are many important technical SEO issues to address as needed. I’ve included screenshots of our technical SEO dashboard with our own website’s data to demonstrate what to examine in your logs.

Bot crawl volume

It’s important to know the number of requests made by Baidu, BingBot, GoogleBot, Yahoo, Yandex, and others over a given period time. If, for example, you want to get found in search in Russia but Yandex is not crawling your website, that is a problem. (You’d want to consult Yandex Webmaster and see this article on Search Engine Land.)

Response code errors

Moz has a great primer on the meanings of the different status codes. I have an alert system setup that tells me about 4XX and 5XX errors immediately because those are very significant.

Temporary redirects

Temporary 302 redirects do not pass along the “link juice” of external links from the old URL to the new one. Almost all of the time, they should be changed to permanent 301 redirects.

Crawl budget waste

Google assigns a crawl budget to each website based on numerous factors. If your crawl budget is, say, 100 pages per day (or the equivalent amount of data), then you want to be sure that all 100 are things that you want to appear in the SERPs. No matter what you write in your robots.txt file and meta-robots tags, you might still be wasting your crawl budget on advertising landing pages, internal scripts, and more. The logs will tell you — I’ve outlined two script-based examples in red above.

If you hit your crawl limit but still have new content that should be indexed to appear in search results, Google may abandon your site before finding it.

Duplicate URL crawling

The addition of URL parameters — typically used in tracking for marketing purposes — often results in search engines wasting crawl budgets by crawling different URLs with the same content. To learn how to address this issue, I recommend reading the resources on Google and Search Engine Land here, here, here, and here.

Crawl priority

Google might be ignoring (and not crawling or indexing) a crucial page or section of your website. The logs will reveal what URLs and/or directories are getting the most and least attention. If, for example, you have published an e-book that attempts to rank for targeted search queries but it sits in a directory that Google only visits once every six months, then you won’t get any organic search traffic from the e-book for up to six months.

If a part of your website is not being crawled very often — and it is updated often enough that it should be — then you might need to check your internal-linking structure and the crawl-priority settings in your XML sitemap.

Last crawl date

Have you uploaded something that you hope will be indexed quickly? The log files will tell you when Google has crawled it.

Crawl budget

One thing I personally like to check and see is Googlebot’s real-time activity on our site because the crawl budget that the search engine assigns to a website is a rough indicator — a very rough one — of how much it “likes” your site. Google ideally does not want to waste valuable crawling time on a bad website. Here, I had seen that Googlebot had made 154 requests of our new startup’s website over the prior twenty-four hours. Hopefully, that number will go up!

As I hope you can see, log analysis is critically important in technical SEO. It’s eleven o’clock — do you know where your logs are now?

Additional resources

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]

Simple Steps for Conducting Creative Content Research

Posted by Hannah_Smith

Most frequently, the content we create at Distilled is designed to attract press coverage, social shares, and exposure (and links) on sites our clients’ target audience reads. That’s a tall order.

Over the years we’ve had our hits and misses, and through this we’ve recognised the value of learning about what makes a piece of content successful. Coming up with a great idea is difficult, and it can be tough to figure out where to begin. Today, rather than leaping headlong into brainstorming sessions, we start with creative content research.

What is creative content research?

Creative content research enables you to answer the questions:

“What are websites publishing, and what are people sharing?”

From this, you’ll then have a clearer view on what might be successful for your client.

A few years ago this required quite an amount of work to figure out. Today, happily, it’s much quicker and easier. In this post I’ll share the process and tools we use.

Whoa there… Why do I need to do this?

I think that the value in this sort of activity lies in a couple of directions:

a) You can learn a lot by deconstructing the success of others…

I’ve been taking stuff apart to try to figure out how it works for about as long as I can remember, so applying this process to content research felt pretty natural to me. Perhaps more importantly though, I think that deconstructing content is actually easier when it isn’t your own. You’re not involved, invested, or in love with the piece so viewing it objectively and learning from it is much easier.

b) Your research will give you a clear overview of the competitive landscape…

As soon as a company elects to start creating content, they gain a whole raft of new competitors. In addition to their commercial competitors (i.e. those who offer similar products or services), the company also gains content competitors. For example, if you’re a sports betting company and plan to create content related to the sports events that you’re offering betting markets on; then you’re competing not just with other betting companies, but every other publisher who creates content about these events. That means major news outlets, sports news site, fan sites, etc. To make matters even more complicated, it’s likely that you’ll actually be seeking coverage from those same content competitors. As such, you need to understand what’s already being created in the space before creating content of your own.

c) You’re giving yourself the data to create a more compelling pitch…

At some point you’re going to need to pitch your ideas to your client (or your boss if you’re working in-house). At Distilled, we’ve found that getting ideas signed off can be really tough. Ultimately, a great idea is worthless if we can’t persuade our client to give us the green light. This research can be used to make a more compelling case to your client and get those ideas signed off. (Incidentally, if getting ideas signed off is proving to be an issue you might find this framework for pitching creative ideas useful).

Where to start

Good ideas start with a good brief, however it can be tough to pin clients down to get answers to a long list of questions.

As a minimum you’ll need to know the following:

  • Who are they looking to target?
    • Age, sex, demographic
    • What’s their core focus? What do they care about? What problems are they looking to solve?
    • Who influences them?
    • What else are they interested in?
    • Where do they shop and which brands do they buy?
    • What do they read?
    • What do they watch on TV?
    • Where do they spend their time online?
  • Where do they want to get coverage?
    • We typically ask our clients to give us a wishlist of 10 or so sites they’d love to get coverage on
  • Which topics are they comfortable covering?
    • This question is often the toughest, particularly if a client hasn’t created content specifically for links and shares before. Often clients are uncomfortable about drifting too far away from their core business—for example, if they sell insurance, they’ll typically say that they really want to create a piece of content about insurance. Whilst this is understandable from the clients’ perspective it can severely limit their chances of success. It’s definitely worth offering up a gentle challenge at this stage—I’ll often cite Red Bull, who are a great example of a company who create content based on what their consumers love, not what they sell (i.e. Red Bull sell soft drinks, but create content about extreme sports because that’s the sort of content their audience love to consume). It’s worth planting this idea early, but don’t get dragged into a fierce debate at this stage—you’ll be able to make a far more compelling argument once you’ve done your research and are pitching concrete ideas.

Processes, useful tools and sites

Now you have your brief, it’s time to begin your research.

Given that we’re looking to uncover “what websites are publishing and what’s being shared,” It won’t surprise you to learn that I pay particular attention to pieces of content and the coverage they receive. For each piece that I think is interesting I’ll note down the following:

  • The title/headline
  • A link to the coverage (and to the original piece if applicable)
  • How many social shares the coverage earned (and the original piece earned)
  • The number of linking root domains the original piece earned
  • Some notes about the piece itself: why it’s interesting, why I think it got shares/coverage
  • Any gaps in the content, whether or not it’s been executed well
  • How we might do something similar (if applicable)

Whilst I’m doing this I’ll also make a note of specific sites I see being frequently shared (I tend to check these out separately later on), any interesting bits of research (particularly if I think there might be an opportunity to do something different with the data), interesting threads on forums etc.

When it comes to kicking off your research, you can start wherever you like, but I’d recommend that you cover off each of the areas below:

What does your target audience share?

Whilst this activity might not uncover specific pieces of successful content, it’s a great way of getting a clearer understanding of your target audience, and getting a handle on the sites they read and the topics which interest them.

  • Review social profiles / feeds
    • If the company you’re working for has a Facebook page, it shouldn’t be too difficult to find some people who’ve liked the company page and have a public profile. It’s even easier on Twitter where most profiles are public. Whilst this won’t give you quantitative data, it does put a human face to your audience data and gives you a feel for what these people care about and share. In addition to uncovering specific pieces of content, this can also provide inspiration in terms of other sites you might want to investigate further and ideas for topics you might want to explore.
  • Demographics Pro
    • This service infers demographic data from your clients’ Twitter followers. I find it particularly useful if the client doesn’t know too much about their audience. In addition to demographic data, you get a breakdown of professions, interests, brand affiliations, and the other Twitter accounts they follow and who they’re most influenced by. This is a paid-for service, but there are pay-as-you-go options in addition to pay monthly plans.

Finding successful pieces of content on specific sites

If you’ve a list of sites you know your target audience read, and/or you know your client wants to get coverage on, there are a bunch of ways you can uncover interesting content:

  • Using your link research tool of choice (e.g. Open Site Explorer, Majestic, ahrefs) you can run a domain level report to see which pages have attracted the most links. This can also be useful if you want to check out commercial competitors to see which pieces of content they’ve created have attracted the most links.
  • There are also tools which enable you to uncover the most shared content on individual sites. You can use Buzzsumo to run content analysis reports on individual domains which provide data on average social shares per post, social shares by network, and social shares by content type.
  • If you just want to see the most shared content for a given domain you can run a simple search on Buzzsumo using the domain; and there’s also the option to refine by topic. For example a search like [guardian.com big data] will return the most shared content on the Guardian related to big data. You can also run similar reports using ahrefs’ Content Explorer tool.

Both Buzzsumo and ahrefs are paid tools, but both offer free trials. If you need to explore the most shared content without using a paid tool, there are other alternatives. Check out Social Crawlytics which will crawl domains and return social share data, or alternatively, you can crawl a site (or section of a site) and then run the URLs through SharedCount‘s bulk upload feature.

Finding successful pieces of content by topic

When searching by topic, I find it best to begin with a broad search and then drill down into more specific areas. For example, if I had a client in the financial services space, I’d start out looking at a broad topic like “money” rather than shooting straight to topics like loans or credit cards.

As mentioned above, both Buzzsumo and ahrefs allow you to search for the most shared content by topic and both offer advanced search options.

Further inspiration

There are also several sites I like to look at for inspiration. Whilst these sites don’t give you a great steer on whether or not a particular piece of content was actually successful, with a little digging you can quickly find the original source and pull link and social share data:

  • Visually has a community area where users can upload creative content. You can search by topic to uncover examples.
  • TrendHunter have a searchable archive of creative ideas, they feature products, creative campaigns, marketing campaigns, advertising and more. It’s best to keep your searches broad if you’re looking at this site.
  • Check out Niice (a moodboard app) which also has a searchable archive of handpicked design inspiration.
  • Searching Pinterest can allow you to unearth some interesting bits and pieces as can Google image searches and regular Google searches around particular topics.
  • Reviewing relevant sections of discussion sites like Quora can provide insight into what people are asking about particular topics which may spark a creative idea.

Moving from data to insight

By this point you’ve (hopefully) got a long list of content examples. Whilst this is a great start, effectively what you’ve got here is just data, now you need to convert this to insight.

Remember, we’re trying to answer the questions: “What are websites publishing, and what are people sharing?”

Ordinarily as I go through the creative content research process, I start to see patterns or themes emerge. For example, across a variety of topics areas you’ll see that the most shared content tends to be news. Whilst this is good to know, it’s not necessarily something that’s going to be particularly actionable. You’ll need to dig a little deeper—what else (aside from news) is given coverage? Can you split those things into categories or themes?

This is tough to explain in the abstract, so let me give you an example. We’d identified a set of music sites (e.g. Rolling Stone, NME, CoS, Stereogum, Pitchfork) as target publishers for a client.

Here’s a summary of what I concluded following my research:

The most-shared content on these music publications is news: album launches, new singles, videos of performances etc. As such, if we can work a news hook into whatever we create, this could positively influence our chances of gaining coverage.

Aside from news, the content which gains traction tends to fall into one of the following categories:

Earlier in this post I mentioned that it can be particularly tough to create content which attracts coverage and shares if clients feel strongly that they want to do something directly related to their product or service. The example I gave at the outset was a client who sold insurance and was really keen to create something about insurance. You’re now in a great position to win an argument with data, as thanks to your research you’ll be able to cite several pieces of insurance-related content which have struggled to gain traction. But it’s not all bad news as you’ll also be able to cite other topics which are relevant to the client’s target audience and stand a better chance of gaining coverage and shares.

Avoiding the pitfalls

There are potential pitfalls when it comes to creative content research in that it’s easy to leap to erroneous conclusions. Here’s some things to watch out for:

Make sure you’re identifying outliers…

When seeking out successful pieces of content you need to be certain that what you’re looking at is actually an outlier. For example, the average post on BuzzFeed gets over 30k social shares. As such, that post you found with just 10k shares is not an outlier. It’s done significantly worse than average. It’s therefore not the best post to be holding up as a fabulous example of what to create to get shares.

Don’t get distracted by formats…

Pay more attention to the idea than the format. For example, the folks at Mashable, kindly covered an infographic about Instagram which we created for a client. However, the takeaway here is not that Instagram infographics get coverage on Mashable. Mashable didn’t cover this because we created an infographic. They covered the piece because it told a story in a compelling and unusual way.

You probably shouldn’t create a listicle…

This point is related to the point above. In my experience, unless you’re a publisher with a huge, engaged social following, that listicle of yours is unlikely to gain traction. Listicles on huge publisher sites get shares, listicles on client sites typically don’t. This is doubly important if you’re also seeking coverage, as listicles on clients sites don’t typically get links or coverage on other sites.

How we use the research to inform our ideation process

At Distilled, we typically take a creative brief and complete creative content research and then move into the ideation process. A summary of the research is included within the creative brief, and this, along with a copy of the full creative content research is shared with the team.

The research acts as inspiration and direction and is particularly useful in terms of identifying potential topics to explore but doesn’t mean team members don’t still do further research of their own.

This process by no means acts as a silver bullet, but it definitely helps us come up with ideas.


Thanks for sticking with me to the end!

I’d love to hear more about your creative content research processes and any tips you have for finding inspirational content. Do let me know via the comments.

Image credits: Research, typing, audience, inspiration, kitteh.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

[ccw-atrib-link]