How to Combat 5 of the SEO World’s Most Infuriating Problems – Whiteboard Friday

Posted by randfish

These days, most of us have learned that spammy techniques aren’t the way to go, and we have a solid sense for the things we should be doing to rank higher, and ahead of our often spammier competitors. Sometimes, maddeningly, it just doesn’t work. In today’s Whiteboard Friday, Rand talks about five things that can infuriate SEOs with the best of intentions, why those problems exist, and what we can do about them.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

What SEO problems make you angry?

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re chatting about some of the most infuriating things in the SEO world, specifically five problems that I think plague a lot of folks and some of the ways that we can combat and address those.

I’m going to start with one of the things that really infuriates a lot of folks who are new to the field, especially folks who are building new and emerging sites and doing SEO on them. You have all of these best practices lists. You might look at a web developer’s cheat sheet or sort of a guide to on-page and on-site SEO. You go, “Hey, I’m doing it. I’ve got my clean URLs, my good, unique content, my solid keyword targeting, schema markup, useful internal links, my XML sitemap, and my fast load speed. I’m mobile friendly, and I don’t have manipulative links.”

Great. “Where are my results? What benefit am I getting from doing all these things, because I don’t see one?” I took a site that was not particularly SEO friendly, maybe it’s a new site, one I just launched or an emerging site, one that’s sort of slowly growing but not yet a power player. I do all this right stuff, and I don’t get SEO results.

This makes a lot of people stop investing in SEO, stop believing in SEO, and stop wanting to do it. I can understand where you’re coming from. The challenge is not that you’ve done something wrong. It’s that this stuff, all of these things that you do right, especially things that you do right on your own site or from a best practices perspective, they don’t increase rankings. They don’t. That’s not what they’re designed to do.

1) Following best practices often does nothing for new and emerging sites

This stuff, all of these best practices are designed to protect you from potential problems. They’re designed to make sure that your site is properly optimized so that you can perform to the highest degree that you are able. But this is not actually rank boosting stuff unfortunately. That is very frustrating for many folks. So following a best practices list, the idea is not, “Hey, I’m going to grow my rankings by doing this.”

On the flip side, many folks do these things on larger, more well-established sites, sites that have a lot of ranking signals already in place. They’re bigger brands, they have lots of links to them, and they have lots of users and usage engagement signals. You fix this stuff. You fix stuff that’s already broken, and boom, rankings pop up. Things are going well, and more of your pages are indexed. You’re getting more search traffic, and it feels great. This is a challenge, on our part, of understanding what this stuff does, not a challenge on the search engine’s part of not ranking us properly for having done all of these right things.

2) My competition seems to be ranking on the back of spammy or manipulative links

What’s going on? I thought Google had introduced all these algorithms to kind of shut this stuff down. This seems very frustrating. How are they pulling this off? I look at their link profile, and I see a bunch of directories, Web 2.0 sites — I love that the spam world decided that that’s Web 2.0 sites — article sites, private blog networks, and dofollow blogs.

You look at this stuff and you go, “What is this junk? It’s terrible. Why isn’t Google penalizing them for this?” The answer, the right way to think about this and to come at this is: Are these really the reason that they rank? I think we need to ask ourselves that question.

One thing that we don’t know, that we can never know, is: Have these links been disavowed by our competitor here?

I’ve got my HulksIncredibleStore.com and their evil competitor Hulk-tastrophe.com. Hulk-tastrophe has got all of these terrible links, but maybe they disavowed those links and you would have no idea. Maybe they didn’t build those links. Perhaps those links came in from some other place. They are not responsible. Google is not treating them as responsible for it. They’re not actually what’s helping them.

If they are helping, and it’s possible they are, there are still instances where we’ve seen spam propping up sites. No doubt about it.

I think the next logical question is: Are you willing to lose your site or brand? What we almost never see anymore is sites like this, ones ranking on the back of these things with generally less legitimate links, sticking around for two or three or four years. You can see it for a few months, maybe even a year, but this stuff is getting hit hard and getting hit frequently. So unless you’re willing to lose your site, pursuing their links is probably not a strategy.

Then ask what other signals could be helping them rank: potentially links that you might not be considering, but also non-link signals. I think a lot of us get blinded in the SEO world by link signals, and we forget to look at things like: Do they have a phenomenal user experience? Are they growing their brand? Are they doing offline kinds of things that are influencing online? Are they gaining engagement from other channels that’s then influencing their SEO? Do they have things coming in that I can’t see? If you don’t ask those questions, you can’t really learn from your competitors, and you just feel the frustration.

3) I have no visibility or understanding of why my rankings go up vs down

On my HulksIncredibleStore.com, I’ve got my infinite stretch shorts, which I don’t know why he never wears — he should really buy those — my soothing herbal tea, and my anger management books. I look at my rankings and they jump all over the place all the time. Actually, this is pretty normal. I think we’ve done some analyses here, and the average page-one search result shifts by 1.5 or 2 positions daily. That’s from the MozCast dataset, if I’m recalling correctly. That means that, over the course of a week, it’s not uncommon or unnatural for you to be bouncing around four, five, or six positions up or down, and those kinds of things.

I think we should understand what can be behind these things. That’s a very simple list. You made changes, Google made changes, your competitors made changes, or searcher behavior has changed in terms of volume, in terms of what they were engaging with, what they’re clicking on, and what the intent behind their searches is. Maybe there was just a new movie that came out, and in one of the scenes Hulk talks about soothing herbal tea. So now people are searching for very different things than they were before. They want to see the scene. They’re looking for the YouTube video clip and those kinds of things. Suddenly Hulk’s soothing herbal tea is no longer sending searchers to your site the way it did.

So changes like these can happen. We can’t understand all of them. I think it’s up to us to determine the degree of analysis and action that’s actually going to provide a return on investment. Looking at these day over day or week over week and throwing up our hands and getting frustrated probably provides very little return on investment. Looking over the long term and saying, “Hey, over the last 6 months, we can observe 26 weeks of ranking change data, and we can see that in aggregate we are now ranking higher and for more keywords than we were previously, and so we’re going to continue pursuing this strategy. This is the set of keywords that we’ve fallen most on, and here are the factors that we’ve identified that are consistent across that group.” I think looking at rankings in aggregate can give us some real positive ROI. Looking at one or two keywords, one week versus the next, provides very little ROI.
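If you want to act on that advice, here’s a minimal sketch of what an aggregate review could look like in Python. It assumes you export weekly rank-tracking data as a simple keyword-to-position mapping for each week; the function names and the position-101 convention for “not ranking” are illustrative choices, not from any particular tool.

```python
# Judge ranking change in aggregate rather than reacting to week-to-week noise.
# `weekly_ranks` is assumed to be a list of {keyword: position} dicts, one per week.

def aggregate_summary(weekly_ranks, top_n=10):
    """Average position and count of keywords in the top N for each week."""
    summary = []
    for week, ranks in enumerate(weekly_ranks, start=1):
        positions = list(ranks.values())
        summary.append({
            "week": week,
            "keywords_tracked": len(positions),
            "avg_position": sum(positions) / len(positions),
            "in_top_n": sum(1 for p in positions if p <= top_n),
        })
    return summary

def biggest_losers(first_week, last_week, limit=5):
    """Keywords that fell the most between the first and last week of the window."""
    deltas = {
        kw: last_week.get(kw, 101) - first_week[kw]  # treat "not ranking" as position 101
        for kw in first_week
    }
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:limit]
```

Run over 26 weeks of exports, the first function shows the in-aggregate trend, and the second surfaces the keyword group worth digging into for consistent factors.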

4) I cannot influence or affect change in my organization because I cannot accurately quantify, predict, or control SEO

That’s true, especially with things like keyword not provided and certainly with the inaccuracy of data that’s provided to us through Google’s Keyword Planner inside of AdWords, for example, and the fact that no one can really control SEO, not fully anyway.

You get up in front of your team, your board, your manager, your client and you say, “Hey, if we don’t do these things, traffic will suffer,” and they go, “Well, you can’t be sure about that, and you can’t perfectly predict it. Last time you told us something, something else happened. So because the data is imperfect, we’d rather spend money on channels that we can perfectly predict, that we can very effectively quantify, and that we can very effectively control.” That is understandable. I think that businesses have a lot of risk aversion naturally, and so wanting to spend time and energy and effort in areas that you can control feels a lot safer.

Some ways to get around this are, first off, know your audience. If you know who you’re talking to in the room, you can often determine the things that will move the needle for them. For example, I find that many managers, many boards, many executives are much more influenced by competitive pressures than they are by, “We won’t do as well as we did before, or we’re losing out on this potential opportunity.” Saying that is less powerful than saying, “This competitor, who I know we care about and we track ourselves against, is capturing this traffic and here’s how they’re doing it.”

Show multiple scenarios. Many of the SEO presentations that I see and have seen and still see from consultants and from in-house folks come with kind of a single, “Hey, here’s what we predict will happen if we do this or what we predict will happen if we don’t do this.” You’ve got to show multiple scenarios, especially when you know you have error bars because you can’t accurately quantify and predict. You need to show ranges.

So instead of this, I want to see: What happens if we do it a little bit? What happens if we really overinvest? What happens if Google makes a much bigger change on this particular factor than we expect or our competitors do a much bigger investment than we expect? How might those change the numbers?
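To make the multiple-scenarios point concrete, here’s an illustrative sketch. Every number in it is a made-up assumption; in practice the ranges would come from your own historical data and the error bars you’re comfortable defending.

```python
# Present SEO forecasts as ranges, not single numbers. All figures are invented.

current_monthly_visits = 50_000

scenarios = {
    "do nothing":        (-0.10, -0.02),  # assumed range of change over 12 months
    "modest investment": (0.02, 0.10),
    "heavy investment":  (0.08, 0.25),
}

for name, (low, high) in scenarios.items():
    lo = current_monthly_visits * (1 + low)
    hi = current_monthly_visits * (1 + high)
    print(f"{name:>20}: {lo:,.0f} to {hi:,.0f} monthly visits in 12 months")
```

Showing a stakeholder three ranges instead of one number acknowledges the uncertainty up front rather than letting it be used against you later.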

Then I really do like bringing case studies, especially if you’re a consultant, but even in-house there are so many case studies in SEO on the web today that you can almost always find someone who’s analogous or nearly analogous and show some of their data, some of the results that they’ve seen. Places like SEMrush, a tool that offers competitive intelligence around rankings, can be great for that. You can show, hey, this media site in our sector made these changes. Look at the delta of keywords they were ranking for versus ours over the next six months. Correlation is not causation, but showing those kinds of things can be a powerful influencer.

Then last, but not least, any time you’re going to get up like this and present to a group around these topics, if you possibly can, try to talk one-on-one with the participants before the meeting actually happens. I have found it almost universally the case that if you haven’t had the discussions beforehand, the ones about “What are your concerns? What do you think is not valid about this data? Hey, I want to run this by you and get your thoughts before we go to the meeting,” then when you get into the group setting, people can gang up and pile on. One person says, “Hey, I don’t think this is right,” and everybody in the room kind of looks around and goes, “Yeah, I also don’t think that’s right.” Then it just turns into warfare and conflict that you don’t want or need. If you address those things beforehand, then you can include the data, the presentations, and the “I don’t know the answer to this and I know this is important to so and so” in that presentation or in that discussion. It can be hugely helpful. It’s the big difference between winning and losing with that.

5) Google is biased toward big brands. It feels hopeless to compete against them

A lot of people are feeling this hopelessness in SEO about competing against big brands. I get that pain. In fact, I’ve felt it very strongly for a long time in the SEO world, and I think the trend has only increased. This comes from all sorts of stuff. Brands now have the little dropdown next to their search result listing. There are these brand and entity connections. As Google is using answers and the knowledge graph more and more, it feels like those entities are having a bigger influence on where things rank, where they’re visible, and where they’re pulling from.

User and usage behavior signals on the rise means that big brands, who have more of those signals, tend to perform better. Brands in the knowledge graph, brands growing links without any effort, they’re just growing links because they’re brands and people point to them naturally. Well, that is all really tough and can be very frustrating.

I think you have a few choices on the table. First off, you can choose to compete with brands where they can’t or won’t. So this is areas like we’re going after these keywords that we know these big brands are not chasing. We’re going after social channels or people on social media that we know big brands aren’t. We’re going after user generated content because they have all these corporate requirements and they won’t invest in that stuff. We’re going after content that they refuse to pursue for one reason or another. That can be very effective.

Second, you’d better be building, growing, and leveraging your competitive advantage. Whenever you build an organization, you’ve got to say, “Hey, here’s who is out there. This is why we are uniquely better or a uniquely better choice for this set of customers than these other ones.” If you can leverage that, you can generally find opportunities to compete and even to win against big brands. But those things have to become obvious, they have to become well-known, and you need to essentially build some of your brand around those advantages, or they’re not going to give you help in search. That includes media, that includes content, that includes any sort of press and PR you’re doing. That includes how you do your own messaging, all of these things.

Third, you can choose to serve a market or a customer that they don’t or won’t. That can be a powerful way to go about search, because usually search is bifurcated by the customer type. There will be slightly different forms of search queries that are entered by different kinds of customers, and you can pursue one of those that isn’t pursued by the competition.

Last, but not least, I think for everyone in SEO we all realize we’re going to have to become brands ourselves. That means building the signals that are typically associated with brands — authority, recognition from an industry, recognition from a customer set, awareness of our brand even before a search has happened. I talked about this in a previous Whiteboard Friday, but I think because of these things, SEO is becoming a channel that you benefit from as you grow your brand rather than the channel you use to initially build your brand.

All right, everyone. Hope these have been helpful in combating some of these infuriating, frustrating problems and that we’ll see some great comments from you guys. I hope to participate in those as well, and we’ll catch you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com



Should I Rebrand and Redirect My Site? Should I Consolidate Multiple Sites/Brands? – Whiteboard Friday

Posted by randfish

Making changes to your brand is a huge step, and while it’s sometimes the best path forward, it isn’t one to be taken lightly. In today’s Whiteboard Friday, Rand offers some guidance to marketers who are wondering whether a rebrand/redirect is right for them, and also those who are considering consolidating multiple sites under a single brand.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

To rebrand, or not to rebrand, that is the question

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. Today we’re going to chat a little bit about whether you should rebrand and consider redirecting your existing website or websites and whether you should potentially consolidate multiple websites and brands that you may be running.

So we’ve talked before about best practices for redirection and site moves. We’ve also talked about the splitting of link equity and domain authority and those kinds of things. But one of the questions that people have is, “Gosh, you know I have a website today and, given the moves that Google has been making, that the social media world has been making, that content marketing has been making, I’m wondering whether I should potentially rebrand my site.” Lots of people bought domains back in the day that were exact match domains or partial match domains, or that they thought reflected a move of the web away from brand-centric stuff and toward more keyword matching, topic matching, intent matching kinds of things.

Maybe you’re reconsidering those moves and you want to know, “Hey, should I be thinking about making a change now?” That’s what I’m here to answer. So this question, to rebrand or not to rebrand, is a tough one, because you know that when you do that rebrand, you will almost certainly take a traffic hit, and SEO is one of the biggest places where people typically take that traffic hit.

Moz previously was at SEOmoz.org and moved to moz.com. We saw a dip in our traffic over about 3 to 4 months before it fully recovered, and I would say that dip was between 15% and 25% of our search traffic, depending on the week. I’ll link to a list of metrics that I put on my personal blog, Moz.com/rand, so that you can check those out if you’d like to see them. But it was a short recovery time for us.

One of the questions that people always have is, “Well wait, did you lose rankings for SEO since SEO used to be in your domain name?” The answer is no. In fact, six months after the move, we were ranking higher for SEO related terms and phrases.

Scenario A: Rebranding or redirecting scifitoysandgames.com

So let’s imagine that today you are running SciFiToysAndGames.com, which is right on the borderline. In my opinion, that’s right on the borderline of barely tolerable. It could be brandable, but it’s not great. I don’t love the “sci-fi” in here, partially because of how the Syfy channel, the entity that broadcasts this stuff on television, has chosen to spell its name; “sci-fi” can be misinterpreted as to how it’s spelled. I don’t love having to have “and” in a domain name. It’s long. All sorts of stuff.

Let’s say you also own StarToys.com, but you haven’t used it. Previously StarToys.com has been redirecting to SciFiToysAndGames.com, and you’re thinking, “Well, man, is it the right time to make this move? Should I make this change now? Should I wait for the future?”

How memorable or amplifiable is your current brand?

Well, these are the questions that I would urge you to consider. How memorable and amplifiable is your current brand? If you are recognizing, “Hey, I think our brand name, in fact, is holding us back in search results, in social media amplification, in press and blog mentions, in journalist links, and these kinds of things,” well, that’s something serious to think about. Word of mouth too.

Will you maintain your current brand name long term?

So if you know that sometime in the next two, three, four, or five years you do want to move to StarToys, I would actually strongly urge you to do that right now, because the longer you wait, the longer it will take to build up the signals around the new domain and the more pain you’ll potentially incur by having to keep branding this and working on this old brand name. So I would strongly urge you, if you know you’re going to make the move eventually, make it today. Take the pain now, rather than more pain later.

Can or have you tested brand preference with your target audience?

I would urge you to find two different groups: one, loyal customers today, people who know SciFiToysAndGames.com and have used it, and two, potential customers who aren’t yet familiar with it.

You don’t need big sample sizes. If you can get 5, 10, or 15 people either in a room or talk to them in person, you can try some web surveys, you can try using some social media ads like things on Facebook. I’ve seen some companies do some testing around this. Even buying potential PPC ads and seeing how click-through rates and sentiment perform, those kinds of things, is a great way to help validate your ideas, especially if you’re forced to bring data to the table by executives or other stakeholders.

How much traffic would you need in one year to justify a URL move?

The last thing I think about is this, and I want you to either imagine it or even model it out mathematically. If your traffic growth rate — so let’s say you’re growing at 10% year-over-year right now — if that improved 1%, 5%, or 10% annually with a new brand name, would you make the move? So knowing that you might take a short-term hit, but then that your growth rate would be incrementally higher in years to come, how big would that growth rate need to be?

I would say that, in general, if I were thinking about these two domains (granted, this is a hard case because you don’t know exactly how much more brandable or word-of-mouth-able or amplifiable your new one might be compared to your existing one), my general rule is this: if you think that improvement is going to be a substantive percentage, say 5% plus, it’s almost always worth it, because the compound growth rate over a number of years will mean that you’re winning big time. Remember that the growth rate is different than raw growth. If you can incrementally increase your growth rate, you get tremendously more traffic when you look back two, three, four, or five years later.
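Here’s a rough way to model that comparison. The starting traffic, the size of the short-term dip, and the improved growth rate are all hypothetical; swap in your own estimates.

```python
# Hypothetical assumptions: the redirect dip costs about 5% of year-one traffic net,
# and the stronger brand lifts annual growth from 10% to 15% thereafter.

def project(start_traffic, yearly_growth_rates):
    """Projected annual traffic for each year, given a list of growth rates."""
    traffic, history = start_traffic, []
    for rate in yearly_growth_rates:
        traffic *= 1 + rate
        history.append(round(traffic))
    return history

start = 100_000  # annual organic visits today (made up)

keep_brand = project(start, [0.10] * 5)            # stay on the old domain
rebrand    = project(start, [-0.05] + [0.15] * 4)  # take the hit, then grow faster

for year, (keep, moved) in enumerate(zip(keep_brand, rebrand), start=1):
    print(f"Year {year}: keep brand {keep:,} vs. rebrand {moved:,}")
```

With these invented numbers the rebranded site pulls ahead by year five, which is the compounding effect being described: a modestly higher growth rate eventually swamps the short-term hit.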

Where does your current and future URL live on the domain/brand name spectrum?

I also made this domain name, brand name spectrum, because I wanted to try and visualize crappiness of domain name, brand name to really good domain name, brand name. I wanted to give some examples and then extract out some elements so that maybe you can start to build on these things thematically as you’re considering your own domains.

So from awful, we go to tolerable, good, and great. So Science-Fi-Toys.net is obviously terrible. I’ve taken a contraction of the name and the actual one. It’s got a .net. It’s using hyphens. It’s infinitely unmemorable, up to what I think is tolerable — SciFiToysAndGames.com. It’s long. There are some questions about how type-in-able it is, how easy it is to type in. SciFiToys.com is pretty good. SciFiToys, relatively short, concise. It still has the “sci-fi” in there, but it’s a .com. We’re getting better. All the way up to, I really love the name, StarToys. I think it’s very brandable, very memorable. It’s concise. It’s easy to remember and type in. It has positive associations probably with most science fiction toy buyers who are familiar with at least “Star Wars” or “Star Trek.” It’s cool. It has some astronomy connotations too. Just a lot of good stuff going on with that domain name.

Then, another one, Region-Data-API.com. That sucks. NeighborhoodInfo.com. Okay, at least I know what it is. Neighborhood is a really hard name to type because it is very hard for many people to spell and remember. It’s long. I don’t totally love it. I don’t love the “info” connotation, which is generic-y.

DistrictData.com has a nice, alliterative ring to it. But maybe we could do even better and actually there is a company, WalkScore.com, which I think is wonderfully brandable and memorable and really describes what it is without being too in your face about the generic brand of we have regional data about places.

What if you’re doing mobile apps? BestAndroidApps.com. You might say, “Why is that in awful?” The answer is two things. One, it’s the length of the domain name and then the fact that you’re actually using someone else’s trademark in your name, which can be really risky. Especially if you start blowing up, getting big, Google might go and say, “Oh, do you have Android in your domain name? We’ll take that please. Thank you very much.”

BestApps.io, in the tech world, it’s very popular to use domains like .io or .ly. Unfortunately, I think once you venture outside of the high tech world, it’s really tough to get people to remember that that is a domain name. If you put up a billboard that says “BestApps.com,” a majority of people will go, “Oh, that’s a website.” But if you use .io, .ly, or one of the new domain names, .ninja, a lot of people won’t even know to connect that up with, “Oh, they mean an Internet website that I can type into my browser or look for.”

So we have to remember that we sometimes live in a bubble. Outside of that bubble are a lot of people who, if it’s not .com, questionable as to whether they’re even going to know what it is. Remember outside of the U.S., country code domain names work equally well — .co.uk, .ca, .co.za, wherever you are.

InstallThis.com. Now we’re getting better. Memorable, clear. Then all the way up to, I really like the name AppCritic.com. I have positive associations with it, like, “Oh yeah, restaurant critics, food critics, and movie critics, and this is an app critic. Great, that’s very cool.”

What are the things that are in here? Well, stuff at this end of the spectrum tends to be generic, forgettable, hard to type in. It’s long, brand-infringing, danger, danger, and sketchy sounding. It’s hard to quantify what sketchy sounding is, but you know it when you see it. When you’re reviewing domain names, you’re looking for links, you’re looking at things in the SERPs, you’re like, “Hmm, I don’t know about this one.” Having that sixth sense is something that we all develop over time, so “sketchy sounding” is not quite as scientific a description as I might want, but it’s powerful.

On this end of the spectrum though, domain names and brand names tend to be unique, memorable, short. They use .com. Unfortunately, still the gold standard. Easy to type in, pronounceable. That’s a powerful thing too, especially because of word of mouth. We suffered with that for a long time with SEOmoz because many people saw it and thought, “Oh, ShowMoz, COMoz, SeeMoz.” It sucked. Have positive associations, like StarToys or WalkScore or AppCritic. They have these positive, pre-built-in associations psychologically that suggest something brandable.

Scenario B: Consolidating two sites

Scenario B, and then we’ll get to the end, but scenario B is the question, “Should I consolidate?” Let’s say I’m running both of these today. Or, more realistically, and many times I see people in this situation, you’re running AppCritic.com and StarToys.com, and you think, “Boy, these are pretty separate.” But then you keep finding overlap between them. Your content tends to overlap, the audience tends to overlap. I find this with many, many folks who run multiple domains.

How much audience and content overlap is there?

So we’ve got to consider a few things. First off, that audience and content overlap. If you’ve got StarToys and AppCritic and the overlap is very thin, just that little, tiny piece in the middle there, where the content doesn’t overlap much and the audience doesn’t overlap much, it probably doesn’t make that much sense to consolidate.

But what if you’re finding like, “Gosh, man, we’re writing more and more about apps and tech and mobile and web stuff on StarToys, and we’re writing more and more about other kinds of geeky, fun things on AppCritic. Slowly it feels like these audiences are merging.” Well, now you might want to consider that consolidation.

Is there potential for separate sales or exits?

Second point of consideration, the potential for separate exits or sales. So if you know that you’re going to sell AppCritic.com to someone in the future and you want to make sure that’s separate from StarToys, you should keep them separate. If you think to yourself, “Gosh, I’d never sell one without the other. They’re really part of the same company, brand, effort,” well, I’d really consider that consolidation.

Will you dilute marketing or branding efforts?

Last point of positive consideration is dilution of marketing and branding efforts. Remember that you’re going to be working on marketing. You’re going to be working on branding. You’re going to be working on growing traffic to these. When you split your efforts, unless you have two relatively large, separate teams, this is very, very hard to do at the same rate that it could be done if you combined those efforts. So another big point of consideration. That compound growth rate that we talked about, that’s another big consideration with this.

Is the topical focus out of context?

What I don’t recommend you consider, and what has unfortunately been considered by a lot of folks in the SEO-centric world in the past, is the topical focus of the content. I actually am crossing this out. Not a big consideration. You might say to yourself, “But Rand, we talked about previously on Whiteboard Friday how I can have topical authority around toys and games that are related to science fiction stuff, and I can have topical authority related to mobile apps.”

My answer is if the content overlap is strong and the audience overlap is strong, you can do both on one domain. You can see many, many examples of this across the web, Moz being a great example where we talk about startups and technology and sometimes venture capital and team building and broad marketing and paid search marketing and organic search marketing and just a ton of topics, but all serving the same audience and content. Because that overlap is strong, we can be an authority in all of these realms. Same goes for any time you’re considering these things.

All right everyone, hope you’ve enjoyed this edition of Whiteboard Friday. I look forward to some great comments, and we’ll see you again next week. Take care.

Video transcription by Speechpad.com



Using Term Frequency Analysis to Measure Your Content Quality

Posted by EricEnge

It’s time to look at your content differently—time to start understanding just how good it really is. I am not simply talking about titles, keyword usage, and meta descriptions. I am talking about the entire page experience. In today’s post, I am going to introduce the general concept of content quality analysis, why it should matter to you, and how to use term frequency (TF) analysis to gather ideas on how to improve your content.

TF analysis is usually combined with inverse document frequency analysis (collectively TF-IDF analysis). TF-IDF analysis has been a staple concept for information retrieval science for a long time. You can read more about TF-IDF and other search science concepts in Cyrus Shepard’s
excellent article here.

For purposes of today’s post, I am going to show you how you can use TF analysis to get clues as to what Google is valuing in the content of sites that currently outrank you. But first, let’s get oriented.

Conceptualizing page quality

Start by asking yourself if your page provides a quality experience to people who visit it. For example, if a search engine sends 100 people to your page, how many of them will be happy? Seventy percent? Thirty percent? Less? What if your competitor’s page gets a higher percentage of happy users than yours does? Does that feel like an “uh-oh”?

Let’s think about this with a specific example in mind. What if you ran a golf club site, and 100 people came to your page after searching on a phrase like “golf clubs”? What are the kinds of things they may be looking for?

Here are some things they might want:

  1. A way to buy golf clubs on your site (you would need to see a shopping cart of some sort).
  2. The ability to select specific brands, perhaps by links to other pages about those brands of golf clubs.
  3. Information on how to pick the club that is best for them.
  4. The ability to select specific types of clubs (drivers, putters, irons, etc.). Again, this may be via links to other pages.
  5. A site search box.
  6. Pricing info.
  7. Info on shipping costs.
  8. Expert analysis comparing different golf club brands.
  9. End user reviews of your company so they can determine if they want to do business with you.
  10. How your return policy works.
  11. How they can file a complaint.
  12. Information about your company. Perhaps an “about us” page.
  13. A link to a privacy policy page.
  14. Whether or not you have been “in the news” recently.
  15. Trust symbols that show that you are a reputable organization.
  16. A way to access pages to buy different products, such as golf balls or tees.
  17. Information about specific golf courses.
  18. Tips on how to improve their golf game.

This is really only a partial list, and the specifics of your site can certainly vary for any number of reasons from what I laid out above. So how do you figure out what it is that people really want? You could pull in data from a number of sources. For example, using data from your site search box can be invaluable. You can do user testing on your site. You can conduct surveys. These are all good sources of data.

You can also look at your analytics data to see what pages get visited the most. Just be careful how you use that data. For example, if most of your traffic is from search, this data will be biased by incoming search traffic, and hence what Google chooses to rank. In addition, you may only have a small percentage of the visitors to your site going to your privacy policy, but chances are good that there are significantly more users than that who notice whether or not you have a privacy policy. Many of these will be satisfied just to see that you have one and won’t actually go check it out.

Whatever you do, it’s worth using many of these methods to determine what users want from the pages of your site and then using the resulting information to improve your overall site experience.

Is Google using this type of info as a ranking factor?

At some level, they clearly are. Google and Bing have evolved far beyond the initial TF-IDF concepts, but we can still use those concepts to better understand our own content.

The first major indication we had that Google was performing content quality analysis was with the release of the
Panda algorithm in February of 2011. More recently, we know that on April 21 Google will release an algorithm that makes the mobile friendliness of a web site a ranking factor. Pure and simple, this algo is about the user experience with a page.

Exactly how Google is performing these measurements is not known, but
what we do know is their intent. They want to make their search engine look good, largely because it helps them make more money. Sending users to pages that make them happy will do that. Google has every incentive to improve the quality of their search results in as many ways as they can.

Ultimately, we don’t actually know what Google is measuring and using. It may be that the only SEO impact of providing pages that satisfy a very high percentage of users is an indirect one. I.e., so many people like your site that it gets written about more, linked to more, has tons of social shares, gets great engagement, that Google sees other signals that it uses as ranking factors, and this is why your rankings improve.

But, do I care if the impact is a direct one or an indirect one? Well, NO.

Using TF analysis to evaluate your page

TF-IDF analysis is more about relevance than content quality, but we can still use various precepts from it to help us understand our own content quality. One way to do this is to compare the results of a TF analysis of all the keywords on your page with those pages that currently outrank you in the search results. In this section, I am going to outline the basic concepts for how you can do this. In the next section I will show you a process that you can use with publicly available tools and a spreadsheet.

The simplest form of TF analysis is to count the number of uses of each keyword on a page. However, the problem with that is that a page using a keyword 10 times will be seen as 10 times more valuable than a page that uses a keyword only once. For that reason, we dampen the calculations. I have seen two methods for doing this, as follows:

term frequency calculation

The first method relies on dividing the number of repetitions of a keyword by the count for the most popular word on the entire page. Basically, what this does is eliminate the inherent advantage that longer documents might otherwise have over shorter ones. The second method dampens the total impact in a different way, by taking the log base 10 of the actual keyword count. Both of these achieve the effect of still valuing incremental uses of a keyword, but dampening it substantially. I prefer to use method 1, but you can use either method for our purposes here.
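As a sketch, here is what those two dampening approaches might look like in code, assuming you already have raw keyword counts for a page (the counts below are invented).

```python
import math

def tf_method_1(counts):
    """Divide each keyword's count by the count of the most frequent word on the page."""
    max_count = max(counts.values())
    return {word: count / max_count for word, count in counts.items()}

def tf_method_2(counts):
    """Dampen by taking log base 10 of the raw keyword count. Note that a keyword
    used exactly once scores 0 here; some people add 1 to the count before the log."""
    return {word: math.log10(count) for word, count in counts.items()}

# Invented counts from a hypothetical golf page
counts = {"golf": 24, "clubs": 18, "drivers": 6, "putters": 3, "shipping": 1}
print(tf_method_1(counts))  # "golf" -> 1.0, "shipping" -> ~0.04
print(tf_method_2(counts))  # "golf" -> ~1.38, "shipping" -> 0.0
```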

Once you have the TF calculated for every different keyword found on your page, you can then start to do the same analysis for pages that outrank you for a given search term. If you were to do this for five competing pages, the result might look something like this:

term frequency spreadsheet

I will show you how to set up the spreadsheet later, but for now, let’s do the fun part, which is to figure out how to analyze the results. Here are some of the things to look for:

  1. Are there any highly related words that all or most of your competitors are using that you don’t use at all?
  2. Are there any such words that you use significantly less, on average, than your competitors?
  3. Also look for words that you use significantly more than competitors.

You can then tag these words for further analysis. Once you are done, your spreadsheet may now look like this:

second stage term frequency analysis spreadsheet

In order to make this fit into the screenshot above and keep it legible, I eliminated some columns you saw in my first spreadsheet. However, I did a sample analysis for the movie “Woman in Gold”. You can see the
full spreadsheet of calculations here. Note that we used an automated approach to marking some items as “Low Ratio,” “High Ratio,” or “All Competitors Have, Client Does Not.”
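The exact rules behind those flags aren’t published, but a simple version of that kind of automated marking might look like this; the 0.5x and 2x thresholds are placeholder assumptions, not the ones used in the linked spreadsheet.

```python
def flag_keyword(client_tf, competitor_tfs, low=0.5, high=2.0):
    """Compare the client's TF score for one keyword against the competitors' scores."""
    competitors_using = [tf for tf in competitor_tfs if tf > 0]
    avg = sum(competitor_tfs) / len(competitor_tfs)

    if client_tf == 0 and len(competitors_using) == len(competitor_tfs):
        return "All Competitors Have, Client Does Not"
    if avg > 0 and client_tf < low * avg:
        return "Low Ratio"
    if avg > 0 and client_tf > high * avg:
        return "High Ratio"
    return ""

# e.g. flag_keyword(0.0, [0.12, 0.30, 0.25, 0.18, 0.22])
# -> "All Competitors Have, Client Does Not"
```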

None of these flags by themselves have meaning, so you now need to put all of this into context. In our example, the following words probably have no significance at all: “get”, “you”, “top”, “see”, “we”, “all”, “but”, and other words of this type. These are just very basic English language words.

But, we can see other things of note relating to the target page (a.k.a. the client page):

  1. It’s missing any mention of actor Ryan Reynolds
  2. It’s missing any mention of actor Helen Mirren
  3. The page has no reviews
  4. Words like “family” and “story” are not mentioned
  5. “Austrian” and “maria altmann” are not used at all
  6. The phrase “woman in gold” and the words “billing” and “info” are used proportionally more than they are on the other pages

Note that the last item is only visible if you open
the spreadsheet. The issues above could well be significant, as the lead actors, reviews, and story details are the kinds of things that signal a page has in-depth content. We see that the competing pages that rank do have details of the story, so that’s an indication that this is what Google (and users) are looking for. The fact that the main key phrase and the word “billing” are used to a proportionally high degree also makes the page seem a bit spammy.

In fact, if you look at the information closely, you can see that the target page is quite thin in overall content. So much so, that it almost looks like a doorway page. In fact, it looks like it was put together by the movie studio itself, just not very well, as it presents little in the way of a home page experience that would cause it to rank for the name of the movie!

In the many different times I have done an analysis using these methods, I’ve been able to make many different types of observations about pages. A few of the more interesting ones include:

  1. A page that had no privacy policy, yet was taking personally identifiable info from users.
  2. A major lack of important synonyms that would indicate a real depth of available content.
  3. Comparatively low Domain Authority competitors ranking with in-depth content.

These types of observations are interesting and valuable, but it’s important to stress that you shouldn’t be overly mechanical about this. The value in this type of analysis is that it gives you a technical way to compare the content on your page with that of your competitors. This type of analysis should be used in combination with other methods that you use for evaluating that same page. I’ll address this some more in the summary section below.

How do you execute this for yourself?

The
full spreadsheet contains all the formulas so all you need to do is link in the keyword count data. I have tried this with two different keyword density tools, the one from Searchmetrics, and this one from motoricerca.info.

I am not endorsing these tools, and I have no financial interest in either one—they just seemed to work fairly well for the process I outlined above. To provide the data in the right format, please do the following:

  1. Run all the URLs you are testing through the keyword density tool.
  2. Copy and paste all the one word, two word, and three word results into a tab on the spreadsheet.
  3. Sort them all so you get total word counts aligned by position as I have shown in the linked spreadsheet.
  4. Set up the formulas as I did in the demo spreadsheet (you can just use the demo spreadsheet).
  5. Then do your analysis!

This may sound a bit tedious (and it is), but it has worked very well for us at STC.
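If you’d rather script steps 2 through 4 than maintain the spreadsheet formulas by hand, a sketch like the following could handle the alignment. It assumes you’ve already parsed each tool export down to a keyword-to-count dictionary per URL, which is the part that varies from tool to tool.

```python
import csv

def build_comparison(counts_by_url, client_url):
    """Align keyword counts across URLs and compute method-1 TF for each page.
    `counts_by_url` maps URL -> {keyword: count}."""
    all_keywords = sorted({kw for counts in counts_by_url.values() for kw in counts})
    rows = []
    for kw in all_keywords:
        row = {"keyword": kw}
        for url, counts in counts_by_url.items():
            max_count = max(counts.values())
            label = "client" if url == client_url else url
            row[label] = round(counts.get(kw, 0) / max_count, 3)
        rows.append(row)
    return rows

def write_csv(rows, path="tf_comparison.csv"):
    """Write the aligned TF table so it can be reviewed or flagged elsewhere."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```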

Summary

You can also use usability groups and a number of other methods to figure out what users are really looking for on your site. However, what this does is give us a look at what Google has chosen to rank the highest in its search results. Don’t treat this as some sort of magic formula where you mechanically tweak the content to get better metrics in this analysis.

Instead, use this as a method for slicing into your content to better see it the way a machine might see it. It can yield some surprising (and wonderful) insights!



Illustrated Guide to Advanced On-Page Topic Targeting for SEO

Posted by Cyrus-Shepard

Topic n. A subject or theme of a webpage, section, or site.

Several SEOs have recently written about topic modeling and advanced on-page optimization. A few of note:

The concepts themselves are dizzying: LDA, co-occurrence, and entity salience, to name only a few. The question is
“How can I easily incorporate these techniques into my content for higher rankings?”

In fact, you can create optimized pages without understanding complex algorithms. Sites like Wikipedia, IMDB, and Amazon create highly optimized, topic-focused pages almost by default. Utilizing these best practices works exactly the same when you’re creating your own content.

The purpose of this post is to provide a simple
framework for on-page topic targeting in a way that makes optimizing easy and scalable while producing richer content for your audience.

1. Keywords and relationships

No matter what topic modeling technique you choose, all rely on discovering
relationships between words and phrases. As content creators, how we organize words on a page greatly influences how search engines determine the on-page topics.

When we use keyword phrases, search engines hunt for other phrases and concepts that
relate to one another. So our first job is to expand our keyword research to incorporate these related phrases and concepts. Contextually rich content includes:

  • Close variants and synonyms: Includes abbreviations, plurals, and phrases that mean the same thing.
  • Primary related keywords: Words and phrases that relate to the main keyword phrase.
  • Secondary related keywords: Words and phrases that relate to the primary related keywords.
  • Entity relationships: Concepts that describe the properties and relationships between people, places, and things.

A good keyword phrase or entity is one that
predicts the presence of other phrases and entities on the page. For example, a page about “The White House” predicts other phrases like “president,” “Washington,” and “Secret Service.” Incorporating these related phrases may help strengthen the topicality of “White House.”
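One rough way to see this “prediction” idea in practice is to count which terms co-occur with a focus phrase across a set of documents you’ve gathered. The sketch below assumes you already have page text as plain strings and uses only naive whitespace tokenization.

```python
from collections import Counter

def co_occurring_terms(documents, focus_phrase, top_n=10):
    """Count terms that appear in documents containing the focus phrase."""
    focus = focus_phrase.lower()
    focus_words = set(focus.split())
    counts = Counter()
    for doc in documents:
        text = doc.lower()
        if focus in text:
            counts.update(word for word in text.split() if word not in focus_words)
    return counts.most_common(top_n)

# e.g. co_occurring_terms(crawled_pages, "white house") might surface
# terms like "president", "washington", and "secret service"
```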

2. Position, frequency, and distance

How a page is organized can greatly influence how concepts relate to each other.

Once search engines find your keywords on a page, they need to determine which ones are most
important, and which ones actually have the strongest relationships to one another.

Three primary techniques for communicating this include:

  • Position: Keywords placed in important areas like titles, headlines, and higher up in the main body text may carry the most weight.
  • Frequency: Using techniques like TF-IDF, search engines determine important phrases by calculating how often they appear in a document relative to how often they appear across a larger set of documents (a bare-bones sketch of this calculation follows this list).
  • Distance: Words and phrases that relate to each other are often found close together, or grouped by HTML elements. This means leveraging semantic distance to place related concepts close to one another using paragraphs, lists, and content sectioning.
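For readers who want to see the frequency idea in code, here is the bare-bones TF-IDF sketch referenced above. It is a simplification of what search engines actually do, and the corpus is simply whatever set of comparison documents you supply (tokenized into word lists beforehand).

```python
import math
from collections import Counter

def tf_idf(page_words, corpus):
    """Score each word on the page by its term frequency times its inverse
    document frequency across `corpus` (a list of word lists)."""
    tf = Counter(page_words)
    total_words = len(page_words)
    n_docs = len(corpus)
    scores = {}
    for word, count in tf.items():
        doc_freq = sum(1 for doc in corpus if word in doc)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1  # smoothed IDF, as in scikit-learn
        scores[word] = (count / total_words) * idf
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Words that are common on the page but rare across the rest of the corpus float to the top, which is the intuition behind the frequency signal described above.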

A great way to organize your on-page content is to employ your primary and secondary related keywords in support of your focus keyword. Each primary related phrase becomes its own subsection, with the secondary related phrases supporting the primary, as illustrated here.

Keyword Position, Frequency and Distance

As an example, the primary keyword phrase of this page is ‘On-page Topic Targeting’. Supporting topics include: keywords and relationships, on-page optimization, links, entities, and keyword tools. Each related phrase supports the primary topic, and each becomes its own subsection.

3. Links and supplemental content

Many webmasters overlook the importance of linking as a topic signal.

Several well-known Google
search patents and early research papers describe analyzing a page’s links as a way to determine topic relevancy. These include both internal links to your own pages and external links to other sites, often with relevant anchor text.

Google’s own
Quality Rater Guidelines cites the value of external references to other sites. It also describes a page’s supplemental content, which can include internal links to other sections of your site, as a valuable resource.

Links and Supplemental Content

If you need an example of how relevant linking can help your SEO,
The New York Times
famously saw success, and an increase in traffic, when it started linking out to other sites from its topic pages.

Although this guide discusses
on-page topic optimization, topical external links with relevant anchor text can greatly influence how search engines determine what a page is about. These external signals often carry more weight than on-page cues, but it almost always works best when on-page and off-page signals are in alignment.

4. Entities and semantic markup

Google extracts entities from your webpage automatically,
without any effort on your part. These are people, places and things that have distinct properties and relationships with each other.

• Christopher Nolan (entity, person) stands 5’4″ (property, height) and directed Interstellar (entity, movie)

Even though entity extraction happens automatically, it’s often essential to mark up your content with
Schema for specific supported entities such as business information, reviews, and products. While the ranking benefit of adding Schema isn’t 100% clear, structured data has the advantage of enhanced search results.

Entities and Schema

For a solid guide in implementing schema.org markup, see Builtvisible’s excellent
guide to rich snippets.
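As an illustration of what supported-entity markup can look like, here is a small sketch that emits schema.org JSON-LD from Python. The movie, rating value, and review count are placeholders; in practice you would populate this from your own data, or simply hand-write the JSON-LD in your template.

```python
import json

# Placeholder values throughout; swap in your own entity data.
markup = {
    "@context": "https://schema.org",
    "@type": "Movie",
    "name": "Interstellar",
    "director": {"@type": "Person", "name": "Christopher Nolan"},
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "1200",
    },
}

# Emit a JSON-LD block ready to drop into the page's <head>
print('<script type="application/ld+json">')
print(json.dumps(markup, indent=2))
print("</script>")
```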

5. Crafting the on-page framework

You don’t need to be a search genius or spend hours on complex research to produce high quality, topic optimized content. The beauty of this framework is that it can be used by anyone, from librarians to hobby bloggers to small business owners; even when they aren’t search engine experts.

A good webpage has much in common with a high quality university paper. This includes:

  1. A strong title that communicates the topic
  2. Introductory opening that lays out what the page is about
  3. Content organized into thematic subsections
  4. Exploration of multiple aspects of the topic and answers to related questions
  5. Provision of additional resources and external citations

Your webpage doesn’t need to be academic, stuffy, or boring. Some of the most interesting pages on the Internet employ these same techniques while remaining dynamic and entertaining.

Keep in mind that ‘best practices’ don’t apply to every situation, and as
Rand Fishkin says “There’s no such thing as ‘perfectly optimized’ or ‘perfect on-page SEO.'” Pulling everything together looks something like this:

On-page Topic Targeting for SEO

This graphic is highly inspired by Rand Fishkin’s great
Visual Guide to Keyword Targeting and On-Page SEO. This guide doesn’t replace that canonical resource. Instead, it should be considered a supplement to it.

5 alternative tools for related keyword and entity research

For the search professional, there are dozens of tools available for thematic keyword and entity research. This list is not exhaustive by any means, but contains many useful favorites.

1.
Alchemy API

One of the few tools on the market that delivers entity extraction, concept targeting and linked data analysis. This is a great platform for understanding how a modern search engine views your webpage.

2.
SEO Review Tools

The SEO Keyword Suggestion Tool was actually designed to return both primary and secondary related keywords, as well as options for synonyms and country targeting.

3.
LSIKeywords.com

The LSIKeyword tool performs Latent Semantic Indexing (LSI) on the top pages returned by Google for any given keyword phrase. The tool can go down from time to time, but it’s a great one to bookmark.

4.
Social Mention

Quick and easy: enter any keyword phrase and then check “Top Keywords” to see what words appear most with your primary phrase across the platforms that Social Mention monitors.

5.
Google Trends

Google Trends is a powerful related-keyword research tool, if you know how to use it. The secret is downloading your results to a CSV (under settings) to get a list of up to 50 related keywords per search term.



The Danger of Crossing Algorithms: Uncovering The Cloaked Panda Update During Penguin 3.0

Posted by GlennGabe

Penguin 3.0 was one of the most anticipated algorithm updates in recent years when it rolled out on October 17, 2014. Penguin hadn’t run for over a year at that point,
and there were many webmasters sitting in Penguin limbo waiting for recovery. They had cleaned up their link profiles, disavowed what they could, and were
simply waiting for the next update or refresh. Unfortunately, Google was wrestling with the algo internally and over twelve months passed without an
update.

So when Pierre Far finally
announced Penguin 3.0 a few days later on October 21, a few things
stood out. First, this was
not a new algorithm like Gary Illyes had explained it would be at SMX East. It was a refresh and underscored
the potential problems Google was battling with Penguin (cough, negative SEO).

Second, we were not seeing the impact that we expected. The rollout seemed to begin with a heavier international focus and the overall U.S. impact has been
underwhelming to say the least. There were definitely many fresh hits globally, but there were a number of websites that should have recovered but didn’t
for some reason. And many are still waiting for recovery today.

Third, the rollout would be slow and steady and could take weeks to fully complete. That’s unusual, but makes sense given the microscope Penguin 3.0 was
under. And this third point (the extended rollout) is even more important than most people think. Many webmasters are already confused when they get hit
during an acute algorithm update (for example, when an algo update rolls out on one day). But the confusion gets exponentially worse when there is an
extended rollout.

The more time that goes by between the initial launch and the impact a website experiences, the more questions pop up. Was it Penguin 3.0 or was it
something else? Since I work heavily with algorithm updates, I’ve heard similar questions many times over the past several years. And the extended Penguin
3.0 rollout is a great example of why confusion can set in. That’s my focus today.


Penguin, Pirate, and the anomaly on October 24

With the Penguin 3.0 rollout, we also had
Pirate 2 rolling out. And yes, there are
some websites that could be impacted by both. That added a layer of complexity to the situation, but nothing like what was about to hit. You see, I picked
up a very strange anomaly on October 24. And I clearly saw serious movement on that day (starting late in the day ET).

So, if there was a third algorithm update, then that’s
three potential algo updates rolling out at the same time. More about this soon,
but it underscores the confusion that can set in when we see extended rollouts, with a mix of confirmed and unconfirmed updates.


Penguin 3.0 tremors and analysis

Since I do a lot of Penguin work, and have researched many domains impacted by Penguin in the past, I heavily studied the Penguin 3.0 rollout. I 
published a blog post based on the first ten days of the update, which included some interesting findings for sure.

And based on the extended rollout, I definitely saw Penguin tremors beyond the initial October 17 launch. For example, check out the screenshot below of a
website seeing Penguin impact on October 17, 22, and 25.

But as mentioned earlier, something else happened on October 24 that set off sirens in my office. I started to see serious movement on sites impacted by
Panda, and not Penguin. And when I say serious movement, I’m referring to major traffic gains or losses all starting on October 24. Again, these were sites heavily dealing with Panda and had
clean link profiles. Check out the trending below from October 24 for several
sites that saw impact.


A good day for a Panda victim:



A bad day for a Panda victim:



And an incredibly frustrating day for a 9/5 recovery that went south on 10/24:

I saw this enough that I tweeted heavily about it and
included a section about Panda in my Penguin 3.0 blog post. And
that’s when something wonderful happened, and it highlights the true beauty and power of the internet.

As more people saw my tweets and read my post, I started receiving messages from other webmasters explaining that
they saw the exact same thing on their websites dealing with Panda, not Penguin. And not only
did they tell me about it, they
showed me the impact.

I received emails containing screenshots and tweets with photos from Google Analytics and Google Webmaster Tools. It was amazing to see, and it confirmed
that we had just experienced a Panda update in the middle of a multi-week Penguin rollout. Yes, read that line again. Panda during Penguin, right when the
internet world was clearly focused on Penguin 3.0.

That was a sneaky move, Google… very sneaky. 🙂

So, based on what I explained earlier about webmaster confusion and algorithms, can you tell what happened next? Yes, massive confusion ensued. We had the
trifecta of algorithm updates with Penguin, Pirate, and now Panda.


Webmaster confusion and a reminder of the algo sandwich from 2012

So, we had a major algorithm update during two other major algorithm updates (Penguin and Pirate) and webmaster confusion was hitting extremely high
levels. And I don’t blame anyone for being confused. I’m neck deep in this stuff and it confused me at first.

Was the October 24 update a Penguin tremor or was this something else? Could it be Pirate? And if it was indeed Panda, it would have been great if Google told
us it was Panda! Or did they want to throw off SEOs analyzing Penguin and Pirate? Does anyone have a padded room I can crawl into?

Once I realized this was Panda, and started to communicate the update via Twitter and my blog, I had a number of people ask me a very important question:


“Glenn, would Google really roll out two or three algorithm updates so close together, or at the same time?”

Why yes, they would. Anyone remember the algorithm sandwich from April of 2012? That’s when Google rolled out Panda on April 19, then Penguin 1.0 on April 24,
followed by Panda on April 27. Yes, we had three algorithm updates all within ten days. And let’s not forget that the Penguin update on April 24, 2012 was the
first of its kind! So yes, Google can, and will, roll out multiple major algos around the same time.

Where are we headed? It’s fascinating, but not pretty


Panda is near real-time now

When Panda 4.1 rolled out on September 23, 2014, I immediately disliked the title and version number of the update. Danny Sullivan named it 4.1, so it stuck. But for
me, that was not 4.1… not even close. It was more like 4.75. You see, there have been a number of Panda tremors and updates since P4.0 on May 20,
2014.

I saw what I was calling “tremors”
nearly weekly based on having access to a large amount of Panda data (across sites, categories, and countries).
And based on what I was seeing, I reached out to John Mueller at Google to clarify the tremors. John’s response was great and confirmed what I was seeing.
He explained that there
was not a set frequency for algorithms like Panda. Google can roll out an algorithm, analyze the
SERPs, refine the algo to get the desired results, and keep pushing it out. And that’s exactly what I was seeing (again, almost weekly since Panda 4.0).


When Panda and Penguin meet in real time…

…they will have a cup of coffee and laugh at us. 🙂 So, since Panda is near-real time, the crossing of major algorithm updates is going to happen.
And we just experienced an important one on October 24 with Penguin, Pirate, and Panda. But it could (and probably will) get more chaotic than what we have now.
We are quickly approaching a time where major algorithm updates crafted in a lab will be unleashed on the web in near-real time or in actual real time.

And if organic search traffic from Google is important to you, then pay attention. We’re about to take a quick trip into the future of Google and SEO. And
after hearing what I have to say, you might just want the past back…


Google’s brilliant object-oriented approach to fighting webspam

I have presented at the past two SES conferences about Panda, Penguin, and other miscellaneous disturbances in the force. More about those “other
disturbances” soon. In my presentation, one of my slides looks like this:

Over the past several years, Google has been using a brilliant, object-oriented approach to fighting webspam and low quality content. Webspam engineers can
craft external algorithms in a lab and then inject them into the real-time algorithm whenever they want. It’s brilliant because it isolates specific
problems, while also being extremely scalable. And by the way, it should scare the heck out of anyone breaking the rules.

For example, we have Panda, Penguin, Pirate, and Above the Fold. Each was crafted to target a specific problem and can be unleashed on the web whenever
Google wants. Sure, there are undoubtedly connections between them (either directly or indirectly), but each specific algo is its own black box. Again,
it’s object-oriented.
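
To make the analogy concrete, here is a minimal, purely illustrative Python sketch of what an "object-oriented" spam-fighting setup might look like: each algorithm is its own isolated black box behind a common interface, and new ones can be injected into the scoring pipeline whenever you want. To be clear, this is not Google's code or architecture; every class, method, and threshold here is a hypothetical stand-in, used only to show why the modular approach scales so well.

```python
from abc import ABC, abstractmethod

class QualityAlgorithm(ABC):
    """Hypothetical interface: each spam-fighting algo is its own black box."""

    @abstractmethod
    def demotion(self, page: dict) -> float:
        """Return a demotion factor between 0 (no penalty) and 1 (fully demoted)."""

class Panda(QualityAlgorithm):
    def demotion(self, page):
        # Toy proxy for "low-quality / thin content"
        return 0.5 if page.get("word_count", 0) < 300 else 0.0

class Penguin(QualityAlgorithm):
    def demotion(self, page):
        # Toy proxy for "manipulative link profile"
        return 0.7 if page.get("spammy_link_ratio", 0.0) > 0.3 else 0.0

def score(page: dict, base_score: float, algos: list) -> float:
    """Apply each independently crafted algorithm to the base ranking score."""
    for algo in algos:
        base_score *= (1.0 - algo.demotion(page))
    return base_score

# New algorithms can be "injected" into the pipeline without touching the others.
pipeline = [Panda(), Penguin()]
page = {"word_count": 250, "spammy_link_ratio": 0.4}
print(score(page, base_score=10.0, algos=pipeline))  # 10 * 0.5 * 0.3 = 1.5
```

The design point the sketch tries to capture: because each "algo" only knows about its own problem, one can be refined and re-released without retesting the others, which is exactly why updates can keep arriving at any time.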

Now, Panda is a great example of an algorithm that has matured to the point where Google highly trusts it. That's why Google announced in June of 2013 that Panda
would roll out monthly, over ten days. And that's also why it matured even more with Panda 4.0 (and why I've seen tremors almost weekly).

And then we had Gary Illyes explain that Penguin was moving along the same path. At SMX East,
Gary explained that the new Penguin algorithm (which clearly didn't roll out on October 17) would be structured in a way that lets subsequent updates roll out more easily.
You know, like Panda.

And by the way, what if this happens to Pirate, Above the Fold, and other algorithms that Google is crafting in its Frankenstein lab? Well my friends, then
we’ll have absolute chaos and society as we know it will crumble. OK, that’s a bit dramatic, but you get my point.

We already have massive confusion now… and a glimpse into the future reveals a continual flow of major algorithms running in real time, each of which
could pummel a site into the ground. And of course, there will be little or no sign of which algo actually caused the destruction. I don't know about you, but I just
broke out in hives. 🙂


Actual example of what (near) real-time updates can do

After Panda 4.0, I saw some very strange Panda movement for sites impacted by recent updates, and it underscores the power of near real-time algo updates.
As a quick example,
temporary Panda recoveries can happen if you
don't get far enough out of the gray area. And now that we are seeing Panda tremors almost weekly, you can experience potential turbulence several times per
month.

Here is a screenshot from a site that recovered from Panda, didn't get far enough out of the gray area, and reentered the strike zone just five days later.

Holy cow, that was fast. I hope they didn’t plan any expensive trips in the near future. This is exactly what can happen when major algorithms roam the web
in real time. One week you’re looking good and the next week you’re in the dumps. Now, at least I knew this was Panda. The webmaster could tackle more
content problems and get out of the gray area… But the ups and downs of a Panda roller coaster ride can drive a webmaster insane. It’s one of the
reasons I recommend making
significant changes when
you’ve been hit by Panda. Get as far out of the gray area as possible.


An “automatic action viewer” in Google Webmaster Tools could help (and it’s actually being discussed internally by Google)

Based on webmaster confusion, many have asked Google to create an “automatic action viewer” in Google Webmaster Tools. It would be similar to the “manual
actions viewer,” but focused on algorithms that are demoting websites in the search results (versus penalties). Yes, there is a difference by the way.

The new viewer would help webmasters better understand the types of problems that algorithms like Panda, Penguin, Pirate, Above the
Fold, and others are flagging on their sites. Needless to say, this would be incredibly helpful to webmasters, business owners, and SEOs.

So, will we see that viewer any time soon? Google’s John Mueller
addressed this question during the November 3 webmaster hangout (at 38:30).

John explained they are trying to figure something out, but it’s not easy. There are so many algorithms running that they don’t want to provide feedback
that is vague or misleading. But, John did say they are discussing the automatic action viewer internally. So you never know…


A quick note about Matt Cutts

As many of you know, Matt Cutts took an extended leave this past summer (through the end of October). Well, he announced on Halloween that he is
extending his leave into 2015. I won’t go crazy here talking about his decision overall, but I will
focus on how this impacts webmasters as it relates to algorithm updates and webspam.

Matt does a lot more than just announce major algo updates… He actually gets involved when collateral damage rears its ugly head. And there’s not a
faster way to rectify a flawed algo update than to have Mr. Cutts involved. So before you dismiss Matt’s extended leave as uneventful, take a look at the
trending below:

Notice the temporary drop off a cliff, then 14 days of hell, only to see that traffic return? That’s because Matt got involved. That’s the
movie blog fiasco from early 2014 that I heavily analyzed. If
Matt hadn't been notified of the drop via Twitter, and hadn't taken action, I'm not sure the movie blogs that got hit would be around today. I told Peter from
SlashFilm that his fellow movie blog owners should all pay him a bonus this year. He's the one who pinged Matt via Twitter and got the ball rolling.

It’s just one example of how having someone with power out front can nip potential problems in the bud. Sure, the sites experienced two weeks of utter
horror, but traffic returned once Google rectified the problem. Now that Matt isn’t actively helping or engaged, who will step up and be that guy? Will it
be John Mueller, Pierre Far, or someone else? John and Pierre are greatly helpful, but will they go to bat for a niche that just got destroyed? Will they
push changes through so sites can turn around? And even at its most basic level, will they even be aware the problem exists?

These are all great questions, and I don't want to bog down this post (it's already incredibly long). But don't laugh off Matt Cutts taking an extended
leave. If he's gone for good, you might only realize how important he was to the SEO community
after he's gone. And hopefully that realization won't come because
your site just tanked as collateral damage during an algorithm update while Matt was off running a marathon or trying on new Halloween costumes. Then where will you be?


Recommendations moving forward:

So where does this leave us? How can you prepare for the approaching storm of crossing algorithms? Below, I have provided several key bullets that I think
every webmaster should consider. I recommend taking a hard look at your site
now, before major algos are running in near-real time.

  • Truly understand the weaknesses of your website. Google will continue crafting external algos that can be injected into the real-time algorithm.
    And they will go real-time at some point. Be ready by cleaning up your site now.
  • Document all changes and fluctuations the best you can. Use annotations in Google Analytics and keep a spreadsheet updated with detailed
    information.
  • Along the same lines, download your Google Webmaster Tools data monthly (at least). After helping many companies with algorithm hits, I can tell you
    that information is incredibly valuable, and it can help lead you down the right recovery path.
  • Use a mix of audits and focus groups to truly understand the quality of your site. I mentioned in my post about aggressive advertising and Panda that
    human focus groups are worth their weight in gold (for surfacing Panda-related problems). Most business owners are too close to their own content and
    websites to accurately measure quality. Bias can be a nasty problem and can quickly lead to bamboo-overflow on a website.
  • Beyond on-site analysis, make sure you tackle your link profile as well. I recommend heavily analyzing your inbound links and weeding out unnatural
    links. And use the disavow tool for links you can't remove (there's a small disavow-file sketch after this list). The combination of enhancing the quality
    of your content, boosting engagement, knocking down usability obstacles, and cleaning up your link profile can help you achieve long-term SEO success.
    Don't tackle one quarter of your SEO problems. Address all of them.
  • Remove barriers that inhibit change and action. You need to move fast. You need to be decisive. And you need to remove red tape that can bog down
    the cycle of getting changes implemented. Don’t water down your efforts because there are too many chefs in the kitchen. Understand the changes that need
    to be implemented, and take action. That’s how you win SEO-wise.
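
As promised in the link-cleanup bullet above, here is a hedged Python sketch that turns a reviewed list of unnatural domains into a disavow file in the format Google's disavow tool accepts (comment lines starting with #, and domain: entries for whole domains). The input file name is a hypothetical placeholder; the hard part is the manual audit that produces the list, not the script.

```python
# Minimal sketch: build a disavow.txt from a reviewed list of unnatural domains.
# "spammy_domains.txt" is a hypothetical file you created during your link audit
# (e.g., from exported Webmaster Tools or third-party link data).
from datetime import date

def build_disavow(input_path: str = "spammy_domains.txt",
                  output_path: str = "disavow.txt") -> None:
    with open(input_path) as f:
        domains = sorted({line.strip().lower() for line in f if line.strip()})

    with open(output_path, "w") as out:
        out.write(f"# Disavow file generated {date.today()} after manual link audit\n")
        out.write("# domain: entries cover every link from that domain\n")
        for domain in domains:
            out.write(f"domain:{domain}\n")

if __name__ == "__main__":
    build_disavow()
```

Remember that disavowing is the fallback for links you can't get removed; the file only tells Google to ignore those links when assessing your profile.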


Summary: Are you ready for the approaching storm?

SEO is continually moving and evolving, and it's important that webmasters adapt quickly. Over the past few years, Google's brilliant object-oriented
approach to fighting webspam and low-quality content has yielded algorithms like Panda, Penguin, Pirate, and Above the Fold. And more are on their way. My
advice is to get your situation in order now, before crossing algorithms blend a recipe of confusion that makes it exponentially harder to identify, and
then fix, the problems riddling your website.

Now excuse me while I try to build a flux capacitor. 🙂


What SEOs Need to Know About Topic Modeling & Semantic Connectivity – Whiteboard Friday

Posted by randfish

Search engines, especially Google, have gotten remarkably good at understanding searchers’ intent—what we
mean to search for, even if that’s not exactly what we search for. How in the world do they do this? It’s incredibly complex, but in today’s Whiteboard Friday, Rand covers the basics—what we all need to know about how entities are connected in search.

For reference, here’s a still of this week’s whiteboard!

Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re talking topic modeling and semantic connectivity. Those words might sound big and confusing, but, in fact, they are important to understanding the operations of search engines, and they have some direct influence on things that we might do as SEOs, hence our need to understand them.

Now, I’m going to make a caveat here. I am not an expert in this topic. I have not taken the required math classes, stats classes, programming classes to truly understand this topic in a way that I would feel extremely comfortable explaining. However, even at the surface level of understanding, I feel like I can give some compelling information that hopefully you all and myself included can go research some more about. We’re certainly investigating a lot of topic modeling opportunities and possibilities here at Moz. We’ve done so in the past, and we’re revisiting that again for some future tools, so the topic is fresh on my mind.

So here’s the basic concept. The idea is that search engines are smarter than just knowing that a word, a phrase that someone searches for, like “Super Mario Brothers,” is only supposed to bring back results that have exactly the words “Super Mario Brothers,” that perfect phrase in the title and in the headline and in the document itself. That’s still an SEO best practice because you’re trying to serve visitors who have that search query. But search engines are actually a lot smarter than this.

One of my favorite examples is how intelligent Google has gotten around movie topics. So try, for example, searching for "that movie where the guy is called The Dude," and you will see that Google properly returns "The Big Lebowski" in the first ranking position. How do they know that? Well, they've essentially connected up "movie" and "The Dude" and said, "Aha, those things are most closely related to 'The Big Lebowski.' That's what the intent of the searcher is. That's the document that we're going to return, not a document that happens to have exactly the words 'that movie about the guy named The Dude' in the title."

Here’s another example. So this is Super Mario Brothers, and Super Mario Brothers might be connected to a lot of other terms and phrases. So a search engine might understand that Super Mario Brothers is a little bit more semantically connected to Mario than it is to Luigi, then to Nintendo and then Bowser, the jumping dragon guy, turtle with spikes on his back — I’m not sure exactly what he is — and Princess Peach.

As you go down here, the search engine might actually have a topic modeling algorithm, something like latent semantic indexing (an early model), latent Dirichlet allocation (a somewhat later one), or even predictive latent Dirichlet allocation (later still). The specific model isn't particularly important, especially for our purposes.

What is important is to know that there’s probably some scoring going on. A search engine — Google, Bing — can understand that some of these words are more connected to Super Mario Brothers than others, and it can do the reverse. They can say Super Mario Brothers is somewhat connected to video games and very not connected to cat food. So if we find a page that happens to have the title element of Super Mario Brothers, but most of the on-page content seems to be about cat food, well, maybe we shouldn’t rank that even if it has lots of incoming links with anchor text saying “Super Mario Brothers” or a very high page rank or domain authority or those kinds of things.
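
For a hands-on feel of what that kind of scoring looks like, here is a minimal sketch using the gensim library's LDA implementation on a toy corpus. The tiny document set, the two-topic assumption, and the example page are all illustrative; we don't know which model or corpus Google actually uses, so treat this as a demonstration of the concept rather than a replica of any engine.

```python
# Minimal LDA sketch with gensim (pip install gensim). Toy corpus only.
from gensim import corpora, models

docs = [
    "super mario brothers mario luigi nintendo bowser princess peach".split(),
    "mario luigi jump bowser castle nintendo console game".split(),
    "cat food kitten nutrition pet feeding bowl".split(),
    "dog food pet nutrition feeding kibble".split(),
]

dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# Two topics is an assumption that fits this toy data (games vs. pet food).
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=2,
                      passes=20, random_state=42)

# How strongly does a new page lean toward each topic?
page = "mario kart nintendo luigi race".split()
print(lda.get_document_topics(dictionary.doc2bow(page)))
# e.g. [(0, 0.9...), (1, 0.0...)] -- the page scores heavily toward the games topic
```

The output is exactly the sort of "this page is much more about video games than cat food" judgment described above, just computed on a laughably small corpus.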

So search engines, and Google in particular, have gotten very, very smart about this connectivity stuff and this topic modeling post-Hummingbird. Hummingbird, of course, being the algorithm update from last fall that changed a lot of how they can interpret words and phrases.

So knowing that Google and Bing can calculate this relative connectivity, the connectivity between words, phrases, and topics, we want to know how they're doing this. That answer is actually extremely broad. So that could come from co-occurrence in web documents. Sorry for turning my back on the camera. I know I'm supposed to move like this, but I just had to do a little twirl for you.

Distance between the keywords. I mean distance on the actual page itself. Does Google find "Super Mario Brothers" near the word "Mario" on a lot of the documents where the two occur, or are they relatively far away? Maybe Super Mario Brothers does appear with cat food a lot, but they're quite far away. They might look at citations and links between documents in terms of, boy, there are a lot of pages on the web that, when they talk about Super Mario Brothers, also link to pages about Mario, Luigi, Nintendo, etc.

They can look at the anchor text connections of those links. They could look at co-occurrence of those words biased by a given corpus, a set of corpora, or by certain domains. So they might say, "Hey, we only want to pay attention to what's on the fresh web right now or in the blogosphere or on news sites or on trusted domains," these kinds of things, as opposed to looking at all of the documents on the web. They might choose to do this across multiple different corpora.
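
To illustrate the co-occurrence idea in the simplest possible way, here is a sketch that computes pointwise mutual information (PMI) between a target phrase and candidate terms across a toy set of documents. Treating each whole document as the co-occurrence window, and the tiny hand-made corpus itself, are assumptions made for illustration; real systems work from vastly larger and more varied corpora (and likely more sophisticated measures).

```python
# Toy co-occurrence scoring: PMI between a target term and candidate terms,
# using whole documents as the co-occurrence window.
import math

docs = [
    {"super mario brothers", "mario", "luigi", "nintendo"},
    {"super mario brothers", "bowser", "nintendo", "princess peach"},
    {"mario", "luigi", "nintendo", "console"},
    {"cat food", "kitten", "pet", "nutrition"},
    {"dog food", "pet", "nutrition"},
]

def pmi(term_a: str, term_b: str, documents) -> float:
    n = len(documents)
    p_a = sum(term_a in d for d in documents) / n
    p_b = sum(term_b in d for d in documents) / n
    p_ab = sum(term_a in d and term_b in d for d in documents) / n
    if p_ab == 0:
        return float("-inf")  # the terms never co-occur in this corpus
    return math.log2(p_ab / (p_a * p_b))

target = "super mario brothers"
for candidate in ["mario", "nintendo", "cat food"]:
    print(candidate, round(pmi(target, candidate, docs), 2))
# "mario" and "nintendo" score positively; "cat food" never co-occurs at all.
```

Swap the document set for a different corpus (fresh web, news sites, a single trusted domain) and the scores shift, which is exactly why the choice of corpus matters so much in the paragraph above.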

They can look at queries from searchers, which is a really powerful thing that we unfortunately don’t have access to. So they might see searcher behavior saying that a lot of people who search for Mario, Luigi, Nintendo are also searching for Super Mario Brothers.

They might look at searcher clicks, visits, history, all of that browser data that they've got from Chrome and from Android and, of course, from Google itself, and they might say those are the corpora they use to connect up words and phrases.

Probably there’s a whole list of other places that they’re getting this from. So they can build a very robust data set to connect words and phrases. For us, as SEOs, this means a few things.

If you’re targeting a keyword for rankings, say “Super Mario Brothers,” those semantically connected and related terms and phrases can help with a number of things. So if you knew that these were the right words and phrases that search engines connect to Super Mario Brothers, you could do all sorts of stuff. Things like inclusion on the page itself, helping to tell the search engine my page is more relevant for Super Mario Brothers because I include words like Mario, Luigi, Princess Peach, Bowser, Nintendo, etc. as opposed to things like cat food, dog food, T-shirts, glasses, what have you.
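
Here is a small sketch of that on-page idea: given a hand-curated (or tool-generated) list of terms you believe are semantically connected to your target keyword, check how many of them your page actually mentions. The related-terms list and the page text below are illustrative assumptions, and this is a rough sanity check, not a replica of any engine's scoring.

```python
import re

# Hypothetical list of terms believed to be semantically connected to the target.
related_terms = ["mario", "luigi", "nintendo", "bowser", "princess peach"]

page_text = """Super Mario Brothers remains Nintendo's most iconic platformer.
Mario and Luigi race through castles to rescue Princess Peach from Bowser."""

def term_coverage(text: str, terms) -> float:
    """Fraction of related terms that appear at least once in the text."""
    text = text.lower()
    found = [t for t in terms if re.search(r"\b" + re.escape(t) + r"\b", text)]
    missing = [t for t in terms if t not in found]
    print("found:", found, "| missing:", missing)
    return len(found) / len(terms)

print(round(term_coverage(page_text, related_terms), 2))  # 1.0 for this snippet
```

The "missing" list is the useful part in practice: it tells you which connected terms your copy never gets around to mentioning.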

You can think about it in the links that you earn, the documents that are linking to you and whether they contain those words and phrases and are on those topics, the anchor text that points to you potentially. You can certainly be thinking about this from a naming convention and branding standpoint. So if you’re going to call a product something or call a page something or your unique version of it, you might think about including more of these words or biasing to have those words in the description of the product itself, the formal product description.

For an About page, you might think about the formal bio for a person or a company, including those kinds of words, so that as you're getting cited around the web or on your book cover jacket or in the presentation that you give at a conference, those words are included. They don't necessarily have to be links. This is a potentially powerful thing, to say a lot of people who mention Super Mario Brothers tend to point to this page, Nintendo8.com, where I think you can actually play the original "Super Mario Brothers" live on the web. It's kind of fun. Sorry to waste your afternoon with that.

Of course, these can also be additional keywords that you might consider targeting. This can be part of your keyword research in addition to your on-page and link building optimization.

What's unfortunate is right now there are not a lot of tools out there to help you with this process. There is a tool from Virante (Russ Jones, I think, did some funding internally to put it together), and it's quite cool: nTopic.org. Hopefully, this Whiteboard Friday won't bring that tool to its knees by sending tons of traffic over there. But if it does, maybe give it a few days and come back. It gives you a broad score, with a little more data if you register and log in. It's got a plugin for Chrome and for WordPress. It's fairly simplistic right now, but it might help you say, "Is this page on the topic of the term or phrase that I'm targeting?"

There are many, many downloadable tools and libraries. In fact, Code.google.com has an LDA topic modeling tool specifically, and that might have been something that Google used back in the day. We don’t know.

If you do a search for topic modeling tools, you can find these. Unfortunately, almost all of them are going to require some web development background at the very least. Many of them rely on a Python library or an API. Almost all of them also require a training corpus in order to model things on. So you can think about, “Well, maybe I can download Wikipedia’s content and use that as a training model or use the top 10 search results from Google as some sort of training model.”

This is tough stuff. This is one of the reasons why at Moz I’m particularly passionate about trying to make this something that we can help with in our on-page optimization and keyword difficulty tools, because I think this can be very powerful stuff.

What is true is that you can spot check this yourself right now. It is very possible to go look at things like related searches, look at the keyword terms and phrases that also appear on the pages that are ranking in the top 10 and extract these things out and use your own mental intelligence to say, “Are these terms and phrases relevant? Should they be included? Are these things that people would be looking for? Are they topically relevant?” Consider including them and using them for all of these things. Hopefully, over time, we’ll get more sophisticated in the SEO world with tools that can help with this.
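
As a rough way to run that spot check yourself, here is a hedged sketch that pulls the text of a few top-ranking URLs (which you would collect manually from the SERP) and counts the words they have in common. The URL list and stopword set are placeholders, and requests plus BeautifulSoup are assumed to be acceptable dependencies; in practice, respect robots.txt and rate limits, and apply your own judgment to the output.

```python
# Rough spot check: which terms keep showing up across pages that already rank?
# pip install requests beautifulsoup4
from collections import Counter
import re

import requests
from bs4 import BeautifulSoup

# Placeholder URLs -- paste in the actual top-ranking pages you want to study.
urls = ["https://example.com/page-1", "https://example.com/page-2"]

stopwords = {"the", "and", "a", "of", "to", "in", "is", "for", "on", "that", "with"}

def page_terms(url: str) -> Counter:
    html = requests.get(url, timeout=10).text
    text = BeautifulSoup(html, "html.parser").get_text(" ").lower()
    words = re.findall(r"[a-z]{3,}", text)
    return Counter(w for w in words if w not in stopwords)

totals = Counter()
for url in urls:
    totals.update(page_terms(url))

# Terms that are frequent across the ranking pages are candidates to review
# with the "is this topically relevant?" questions from the paragraph above.
for term, count in totals.most_common(25):
    print(term, count)
```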

All right, everyone, hope you've enjoyed this edition of Whiteboard Friday. Look forward to some great comments, and we'll see you again next week. Take care.

Video transcription by Speechpad.com
