Becoming Better SEO Scientists – Whiteboard Friday

Posted by MarkTraphagen

Editor’s note: Today we’re featuring back-to-back episodes of Whiteboard Friday from our friends at Stone Temple Consulting. Make sure to also check out the second episode, “UX, Content Quality, and SEO” from Eric Enge.

Like many other areas of marketing, SEO incorporates elements of science. It becomes problematic for everyone, though, when theories that haven’t been the subject of real scientific rigor are passed off as proven facts. In today’s Whiteboard Friday, Stone Temple Consulting’s Mark Traphagen is here to teach us a thing or two about the scientific method and how it can be applied to our day-to-day work.

For reference, here’s a still of this week’s whiteboard.

Video transcription

Howdy, Mozzers. Mark Traphagen from Stone Temple Consulting here today to share with you how to become a better SEO scientist. We know that SEO is a science in a lot of ways, and everything I’m going to say today applies not only to SEO but to testing things like your AdWords campaigns and quality scores. There are a lot of different applications you can make in marketing, but we’ll focus on the SEO world because that’s where we do a lot of testing. What I want to talk to you about today is how that really is a science and how we need to bring better science into it to get better results.

Here’s why. In astrophysics there’s something they talk about these days called dark matter. Dark matter is something we know is there. It’s pretty much accepted that it’s there. We can’t see it. We can’t measure it directly. We don’t even know what it is. We can’t even imagine what it is yet, and yet we know it’s there because we see its effect on things like gravity and mass. Its effects are everywhere. And that’s a lot like search engines, isn’t it? It’s like Google or Bing. We see the effects, but we don’t see inside the machine. We don’t know exactly what’s happening in there.

An artist’s depiction of how search engines work.

So what do we do? We do experiments. We do tests to try to figure that out, to see the effects, and from the effects outside we can make better guesses about what’s going on inside and do a better job of giving those search engines what they need to connect us with our customers and prospects. That’s the goal in the end.

Now, the problem is there’s a lot of testing going on out there, a lot of experiments that maybe aren’t being run very well. They’re not being run according to scientific principles that have been proven over centuries to get the best possible results.

Basic data science in 10 steps

So today I want to give you just very quickly 10 basic things that a real scientist goes through on their way to trying to give you better data. Let’s see what we can do with those in our SEO testing in the future.

So let’s start with number one. You’ve got to start with a hypothesis. Your hypothesis is the question that you want to solve. You always start with that, a good question in mind, and it’s got to be relatively narrow. You’ve got to narrow it down to something very specific. Something like how does time on page affect rankings, that’s pretty narrow. That’s very specific. That’s a good question. You might be able to test that. But something like how do social signals affect rankings, that’s too broad. You’ve got to narrow it down. Get it down to one simple question.

Then you choose a variable that you’re going to test. Out of all the things that you could do, that you could play with or you could tweak, you should choose one thing or at least a very few things that you’re going to tweak and say, “When we tweak this, when we change this, when we do this one thing, what happens? Does it change anything out there in the world that we are looking at?” That’s the variable.

The next step is to set a sample group. Where are you going to gather the data from? Where is it going to come from? That’s the world that you’re working in here. Out of all the possible data that’s out there, where are you going to gather your data and how much? That’s the small circle within the big circle. Now even though it’s smaller, you’re probably not going to get all the data in the world. You’re not going to scrape every search ranking that’s possible or visit every URL.

You’ve got to ask yourself, “Is it large enough that we’re at least going to get some validity?” If I wanted to find out what the typical person in Seattle is like and I just walked through one part of the Moz offices here, I’d get some kind of view. But is that a typical, average person from Seattle? I’ve been around here at Moz, so probably not. But suppose the sample was large enough.

Also, it should be randomized as much as possible. Again, going back to that example, if I just stayed here within the walls of Moz and did my research about Mozzers, I’d learn a lot about what Mozzers do, what Mozzers think, how they behave. But that may or may not be applicable to the larger world outside, so you randomize.

Next, we want a control. So we’ve got our sample group. If possible, it’s always good to have another sample group that you don’t do anything to. You do not manipulate the variable in that group. Now, why do you have that? So that you can say, to some extent, that if we saw a change when we manipulated our variable and we did not see it in the control group, then more likely the change isn’t just part of the natural fluctuation that happens in the world or in the search engine.
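To make that concrete, here’s a minimal sketch in Python of randomly splitting a sample into a test group and a control group. The URLs and the 50/50 split are purely illustrative assumptions, not part of Mark’s process.

```python
# Minimal sketch: randomly split a sample of URLs into a test group (where we
# manipulate the one variable) and a control group (where we change nothing).
# The URLs below are placeholders.
import random

urls = [f"https://example.com/page-{i}" for i in range(1, 101)]

random.seed(42)          # fixed seed so the split is reproducible
random.shuffle(urls)

midpoint = len(urls) // 2
test_group = urls[:midpoint]      # the variable gets manipulated here
control_group = urls[midpoint:]   # left untouched, for comparison

print(len(test_group), "test URLs,", len(control_group), "control URLs")
```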

If possible, even better, you want to make that what scientists call double blind, which means that even you, the experimenter, don’t know who the control group is out of all the SERPs that you’re looking at or whatever it is. As careful as you might be and as honest as you might be, you can end up manipulating the results if you know who is who within the test group. It’s not going to apply to every test that we do in SEO, but it’s a good thing to have in mind as you work on that.

Next, very quickly, duration. How long does the test have to run? Is there sufficient time? If you’re just testing something like, “If I share a URL to Google+, how quickly does it get indexed in the SERPs?”, you might only need a day, because typically it takes less than a day in that case. But if you’re looking at seasonality effects, you might need to go over several years to get a good test on that.

Let’s move to the second group here. The sixth thing is to keep a clean lab. What that means is to try as much as possible to keep out anything that might be dirtying your results, any kind of variables creeping in that you didn’t want in the test. That’s hard to do, especially in what we’re testing, but do the best you can to keep out the dirt.

Manipulate only one variable. Out of all the things that you could tweak or change, choose one thing or a very small set of things. That will give more accuracy to your test. The more variables you change, the more side effects and interaction effects are going to happen that you may not be accounting for and that are going to muddy your results.

Make sure you have statistical validity when you go to analyze those results. Now, that’s beyond the scope of this little talk, but you can read up on it. Or even better, if you’re able to, hire somebody or work with somebody who is a trained data scientist or has training in statistics, so they can look at your evaluation and tell you whether the correlations or whatever you’re seeing are statistically significant. Very important.
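Purely as an illustration, here’s a minimal sketch of the kind of check a statistician might run, assuming you’ve recorded the ranking change for each page in a test group and a control group. The numbers are made up, and the Mann-Whitney U test is just one reasonable choice, not something prescribed in the video.

```python
# Minimal sketch: is the ranking change in the test group different from the
# control group? Uses a Mann-Whitney U test, which doesn't assume the changes
# are normally distributed. All numbers below are made up for illustration.
from scipy.stats import mannwhitneyu

test_group_change = [+3, +1, 0, +4, +2, -1, +2, +5, +1, 0]      # positions gained
control_group_change = [0, -1, +1, 0, +2, -2, 0, +1, -1, 0]

stat, p_value = mannwhitneyu(test_group_change, control_group_change,
                             alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Unlikely to be chance alone (at the 5% level).")
else:
    print("Not enough evidence that the change did anything.")
```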

Transparency. As much as possible, share with the world your data set, your full results, your methodology. What did you do? How did you set up the study? That’s going to be important to our last step here, which is replication and falsification, one of the most important parts of any scientific process.

So what you want to invite is, hey we did this study. We did this test. Here’s what we found. Here’s how we did it. Here’s the data. If other people ask the same question again and run the same kind of test, do they get the same results? Somebody runs it again, do they get the same results? Even better, if you have some people out there who say, “I don’t think you’re right about that because I think you missed this, and I’m going to throw this in and see what happens,” aha they falsify. That might make you feel like you failed, but it’s success because in the end what are we after? We’re after the truth about what really works.

Think about your next test, your next experiment that you do. How can you apply these 10 principles to do better testing, get better results, and have better marketing? Thanks.

Video transcription by Speechpad.com



Misuses of 4 Google Analytics Metrics Debunked

Posted by Tom.Capper

In this post I’ll pull apart four of the most commonly used metrics in Google Analytics, how they are collected, and why they are so easily misinterpreted.

Average Time on Page

Average time on page should be a really useful metric, particularly if you’re interested in engagement with content that’s all on a single page. Unfortunately, this is actually its worst use case. To understand why, you need to understand how time on page is calculated in Google Analytics:

Time on Page: Total across all pageviews of time from pageview to last engagement hit on that page (where an engagement hit is any of: next pageview, interactive event, e-commerce transaction, e-commerce item hit, or social plugin). (Source)

If there is no subsequent engagement hit, or if there is a gap between the last engagement hit on a site and leaving the site, the assumption is that no further time was spent on the site. Below are some scenarios with an intuitive time on page of 20 seconds, and their Google Analytics time on page:

Scenario 1: 0s: Pageview → 10s: Social plugin → 20s: Click through to next page
    Intuitive time on page: 20s | GA time on page: 20s

Scenario 2: 0s: Pageview → 10s: Social plugin → 20s: Leave site
    Intuitive time on page: 20s | GA time on page: 10s

Scenario 3: 0s: Pageview → 20s: Leave site
    Intuitive time on page: 20s | GA time on page: 0s

Google doesn’t want exits to influence the average time on page, because of scenarios like the third example above, where they have a time on page of 0 seconds (source). To avoid this, they use the following formula (remember that Time on Page is a total):

Average Time on Page: (Time on Page) / (Pageviews – Exits)

However, as the second example above shows, this assumption doesn’t always hold. The second example feeds into the numerator of the average time on page fraction, but not the denominator:

Average Time on Page across the three scenarios: (20s + 10s + 0s) / (3 - 2) = 30s
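To make the calculation concrete, here’s a minimal sketch that reproduces the Google Analytics numbers for the three scenarios above. The hit types and timings mirror the table; everything else is illustrative, not GA’s actual implementation.

```python
# Minimal sketch of how Time on Page and Average Time on Page fall out of the
# hit streams in the three scenarios above.

def time_on_page(hits):
    """hits: list of (seconds_since_pageview, hit_type) for one pageview.
    Time on page = time of the last engagement hit, or 0 if there is none."""
    engagement = [t for t, kind in hits
                  if kind in ("pageview", "event", "transaction", "item", "social")]
    return max(engagement) if engagement else 0

scenarios = {
    # name: (hits after the initial pageview, did the session end on this page?)
    "1: social at 10s, next pageview at 20s": ([(10, "social"), (20, "pageview")], False),
    "2: social at 10s, leave site at 20s":    ([(10, "social")], True),
    "3: leave site at 20s":                   ([], True),
}

total_time = sum(time_on_page(hits) for hits, _ in scenarios.values())
pageviews = len(scenarios)
exits = sum(1 for _, is_exit in scenarios.values() if is_exit)

for name, (hits, _) in scenarios.items():
    print(f"Scenario {name}: GA time on page = {time_on_page(hits)}s")

print(f"Average time on page = {total_time}s / ({pageviews} - {exits}) = "
      f"{total_time / (pageviews - exits):.0f}s")   # 30s, longer than any single visit
```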

There are two issues here:

  1. Overestimation
    Excluding exits from the second half of the average time on page equation doesn’t have the desired effect when their time on page wasn’t 0 seconds—note that 30s is longer than any of the individual visits. This is why average time on page can often be longer than average visit duration. Nonetheless, 30 seconds doesn’t seem too far out in the above scenario (the intuitive average is 20s), but in the real world many pages have much higher exit rates than the 67% in this example, and/or much less engagement with events on page.
  2. Ignored visits
    Consider visitors who exit without an engagement hit. Whether they stayed for 2 seconds, 10 minutes, or anything in between, their time doesn’t influence average time on page in the slightest. On many sites, a 10 minute view of a single page without interaction (e.g. a blog post) would be considered a success, but it wouldn’t influence this metric.

Solution: Unfortunately, there isn’t an easy solution to this issue. If you want to use average time on page, you just need to keep in mind how it’s calculated. You could also consider setting up more engagement events on page (like a scroll event without the “nonInteraction” parameter)—this solves issue #2 above, but potentially worsens issue #1.

Site Speed

If you’ve used the Site Speed reports in Google Analytics in the past, you’ve probably noticed that the numbers can sometimes be pretty difficult to believe. This is because the way that Site Speed is tracked is extremely vulnerable to outliers—it starts with a 1% sample of your users and then takes a simple average for each metric. This means that a few extreme values (for example, the occasional user with a malware-infested computer or a questionable wifi connection) can create a very large swing in your data.

The use of an average as a metric is not in itself bad, but in an area so prone to outliers and working with such a small sample, it can lead to questionable results.

Fortunately, you can increase the sampling rate right up to 100% (or the cap of 10,000 hits per day). Depending on the size of your site, this may still only be useful for top-level data. For example, if your site gets 1,000,000 hits per day and you’re interested in the performance of a new page that’s receiving 100 hits per day, Google Analytics will throttle your sampling back to the 10,000 hits per day cap—1%. As such, you’ll only be looking at a sample of 1 hit per day for that page.
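As a rough illustration of both problems (the outlier-prone average and the sampling cap), here’s a minimal simulation with made-up load times. It isn’t how Google Analytics samples internally, just a sketch of the effect.

```python
# Rough simulation of why a 1% sample plus a simple average makes the Site
# Speed report jumpy. The load times below are made up for illustration.
import random
import statistics

random.seed(1)

# 100,000 pageviews: most load in ~2-4 seconds...
load_times = [random.uniform(2, 4) for _ in range(100_000)]
# ...plus a handful of broken clients (malware, terrible wifi) taking minutes
load_times += [random.uniform(200, 600) for _ in range(100)]

population_mean = statistics.mean(load_times)

# Re-draw a 1% sample a few times and compare the averages
sample_size = len(load_times) // 100
for draw in range(5):
    sample = random.sample(load_times, k=sample_size)
    print(f"1% sample {draw + 1}: avg = {statistics.mean(sample):.2f}s "
          f"(all pageviews: {population_mean:.2f}s)")

# With the 10,000 hits/day cap, a site doing 1,000,000 hits/day is effectively
# sampled at 1%, so a page getting 100 hits/day contributes ~1 sampled hit/day.
```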

Solution: Turn up the sampling rate. If you receive more than 10,000 hits per day, keep the sampling rate in mind when digging into less visited pages. You could also consider external tools and testing, such as Pingdom or WebPagetest.

Conversion Rate (by channel)

Obviously, conversion rate is not in itself a bad metric, but it can be rather misleading in certain reports if you don’t realise that, by default, conversions are attributed using a last non-direct click attribution model.

From Google Analytics Help:

“…if a person clicks over your site from google.com, then returns as “direct” traffic to convert, Google Analytics will report 1 conversion for “google.com / organic” in All Traffic.”

This means that when you’re looking at conversion numbers in your acquisition reports, it’s quite possible that every single number is different to what you’d expect under last click—every channel other than direct has a total that includes some conversions that occurred during direct sessions, and direct itself has conversion numbers that don’t include some conversions that occurred during direct sessions.
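A minimal sketch of the last non-direct click logic described above (ignoring the lookback window and other real-world details) looks like this:

```python
# Minimal sketch of last non-direct click attribution: a conversion is credited
# to the most recent non-direct channel in the visitor's session history,
# falling back to direct only if that's all there is.
def attribute_conversion(channel_history):
    """channel_history: the visitor's sessions in order, e.g.
    ["google / organic", "(direct)"], where the last session converted."""
    for channel in reversed(channel_history):
        if channel != "(direct)":
            return channel
    return "(direct)"

print(attribute_conversion(["google / organic", "(direct)"]))       # google / organic
print(attribute_conversion(["(direct)", "(direct)"]))               # (direct)
print(attribute_conversion(["email", "google / cpc", "(direct)"]))  # google / cpc
```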

Solution: This is just something to be aware of. If you do want to know your last-click numbers, there’s always the Multi-Channel Funnels and Attribution reports to help you out.

Exit Rate

Unlike some of the other metrics I’ve discussed here, the calculation behind exit rate is very intuitive—”for all pageviews to the page, Exit Rate is the percentage that were the last in the session.” The problem with exit rate is that it’s so often used as a negative metric: “Which pages had the highest exit rate? They’re the problem with our site!” Sometimes this might be true: Perhaps, for example, if those pages are in the middle of a checkout funnel.

Often, however, a user will exit a site when they’ve found what they want. This doesn’t just mean that a high exit rate is ok on informational pages like blog posts or about pages—it could also be true of product pages and other pages with a highly conversion-focused intent. Even on ecommerce sites, not every visitor has the intention of converting. They might be researching towards a later online purchase, or even planning to visit your physical store. This is particularly true if your site ranks well for long tail queries or is referenced elsewhere. In this case, an exit could be a sign that they found the information they wanted and are ready to purchase once they have the money, the need, the right device at hand or next time they’re passing by your shop.

Solution: When judging a page by its exit rate, think about the various possible user intents. It could be useful to take a segment of visitors who exited on a certain page (in the Advanced tab of the new segment menu), and investigate their journey in User Flow reports, or their landing page and acquisition data.
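If you’d rather work from raw session data than the GA interface, here’s a minimal sketch of the exit rate calculation itself, using made-up session paths:

```python
# Minimal sketch: exit rate per page from session-level pageview paths.
# Each list is the ordered pages viewed in one session (made-up data).
from collections import Counter

sessions = [
    ["/", "/product-a", "/checkout"],
    ["/blog/post-1"],
    ["/", "/product-a"],
    ["/product-a", "/product-b"],
]

pageviews = Counter(page for s in sessions for page in s)
exits = Counter(s[-1] for s in sessions)   # last pageview of each session is the exit

for page in sorted(pageviews):
    rate = exits[page] / pageviews[page]
    print(f"{page}: exit rate = {rate:.0%} ({exits[page]}/{pageviews[page]})")
```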

Discussion

If you know of any other similarly misunderstood metrics, you have any questions or you have something to add to my analysis, tweet me at @THCapper or leave a comment below.



Long Tail CTR Study: The Forgotten Traffic Beyond Top 10 Rankings

Posted by GaryMoyle

Search behavior is fundamentally changing, as users become more savvy and increasingly familiar with search technology. Google’s results have also changed significantly over the last decade, going from a simple page of 10 blue links to a much richer layout, including videos, images, shopping ads and the innovative Knowledge Graph.

We also know there is an increasing number of touchpoints in a customer journey involving different channels and devices. Google’s Zero Moment of Truth theory (ZMOT), which describes a revolution in the way consumers search for information online, supports this idea and predicts that we can expect the number of times natural search is involved on the path to a conversion to get higher and higher.

Understanding how people interact with Google and other search engines will always be important. Organic click curves show how many clicks you might expect from search engine results and are one way of evaluating the impact of our campaigns, forecasting performance and exploring changing search behavior.

Using search query data from Google UK for a wide range of leading brands, based on millions of impressions and clicks, we can gain insights into how CTR in natural search has evolved beyond those shown in previous studies by Catalyst, Slingshot and AOL.

Our methodology

The NetBooster study is based entirely on UK top search query data and has been refined by day in order to give us the most accurate sample size possible. This helped us reduce anomalies in the data in order to achieve the most reliable click curve possible, allowing us to extend it way beyond the traditional top 10 results.

We developed a method to extract data day by day to greatly increase the volume of keywords and to help improve the accuracy of the
average ranking position. It ensured that the average was taken across the shortest timescale possible, reducing rounding errors.
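As a rough illustration of that kind of day-by-day aggregation, here’s a minimal sketch. The row format and the example queries are assumptions, not the exact export schema used in the study.

```python
# Minimal sketch: aggregate daily top-query exports into a click curve by
# bucketing each day's data point by its (rounded) average position, then
# computing CTR per position. The rows below are made up for illustration.
from collections import defaultdict

daily_rows = [
    # (day, query, avg_position, impressions, clicks)
    ("2014-06-01", "blue widgets", 2.1, 1200, 190),
    ("2014-06-02", "blue widgets", 1.8, 1100, 205),
    ("2014-06-01", "cheap widgets", 7.4, 400, 18),
    ("2014-06-02", "cheap widgets", 6.9, 450, 21),
]

impressions_by_pos = defaultdict(int)
clicks_by_pos = defaultdict(int)

for _, _, position, impressions, clicks in daily_rows:
    pos = round(position)              # bucket each daily data point by position
    impressions_by_pos[pos] += impressions
    clicks_by_pos[pos] += clicks

for pos in sorted(impressions_by_pos):
    ctr = clicks_by_pos[pos] / impressions_by_pos[pos]
    print(f"Position {pos}: CTR = {ctr:.2%} ({impressions_by_pos[pos]} impressions)")
```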

The NetBooster study included:

  • 65,446,308 (65 million) clicks
  • 311,278,379 (311 million) impressions
  • 1,253,130 (1.2 million) unique search queries
  • 54 unique brands
  • 11 household brands (sites with a total of 1M+ branded keyword impressions)
  • Data covers several verticals including retail, travel and financial

We also looked at organic CTR for mobile, video and image results to better understand how people are discovering content in natural search across multiple devices and channels. 

We’ll explore some of the most important elements in this article.

How does our study compare against others?

Let’s start by looking at the top 10 results. In the graph below we have normalized the results in order to compare our curve, like-for-like, with previous studies from Catalyst and Slingshot. Straight away we can see that there is higher participation beyond the top four positions when compared to other studies. We can also see much higher CTR for positions lower on the pages, which highlights how searchers are becoming more comfortable with mining search results.
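For reference, here’s a minimal sketch of that like-for-like normalization, using the desktop CTRs for positions 1 to 10 from the table further down. Expressing each position as a share of total top-10 clicks is one simple way to do it.

```python
# Minimal sketch: normalize a click curve so it can be compared like-for-like
# with other studies, by expressing each of the top 10 positions as a share
# of all top-10 clicks.
def normalise(ctr_by_position):
    top10_total = sum(ctr_by_position[:10])
    return [ctr / top10_total for ctr in ctr_by_position[:10]]

# Desktop CTRs (%) for positions 1-10 from the table below
netbooster = [19.35, 15.09, 11.45, 8.68, 7.21, 5.85, 4.63, 3.93, 3.35, 2.82]

for position, share in enumerate(normalise(netbooster), start=1):
    print(f"Position {position}: {share:.1%} of top-10 clicks")
```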

A new click curve to rule them all

Our first click curve is the most useful, as it provides the click through rates for generic non-brand search queries across positions 1 to 30. Initially, we can see a significant amount of traffic going to the top three results with position No. 1 receiving 19% of total traffic, 15% at position No. 2 and 11.45% at position No. 3. The interesting thing to note, however, is our curve shows a relatively high CTR for positions typically below the fold. Positions 6-10 all received a higher CTR than shown in previous studies. It also demonstrates that searchers are frequently exploring pages two and three.

[Figure: organic CTR curve for positions 1-30]

When we look beyond the top 10, we can see that CTR is also higher than anticipated, with positions 11-20 accounting for 17% of total traffic. Positions 21-30 also show higher than anticipated results, with over 5% of total traffic coming from page three. This gives us a better understanding of the potential uplift in visits when improving rankings from positions 11-30.

This highlights that searchers are frequently going beyond the top 10 to find the exact result they want. The prominence of paid advertising, shopping ads, Knowledge Graph and the OneBox may also be pushing users below the fold more often as users attempt to find better qualified results. It may also indicate growing dissatisfaction with Google results, although this is a little harder to quantify.
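To put those figures to work, here’s a minimal sketch that estimates the uplift of moving a query up the curve, using the desktop CTRs from the table below. The impression volume is made up.

```python
# Minimal sketch: estimate the traffic uplift of improving a ranking, using
# selected desktop CTRs (%) from the table below. Impressions are hypothetical.
desktop_ctr = {1: 19.35, 2: 15.09, 3: 11.45, 4: 8.68, 5: 7.21,
               10: 2.82, 15: 1.79, 20: 1.05, 25: 0.56, 30: 0.36}

monthly_impressions = 10_000          # hypothetical search volume for one query

for current, target in [(15, 5), (20, 10), (30, 15)]:
    extra_clicks = monthly_impressions * (desktop_ctr[target] - desktop_ctr[current]) / 100
    print(f"Moving from position {current} to {target}: "
          f"~{extra_clicks:.0f} extra visits per month")
```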

Of course, it’s important we don’t just rely on one single click curve. Not all searches are equal. What about the influence of brand, mobile and long-tail searches?

Brand bias has a significant influence on CTR

One thing we particularly wanted to explore was how the size of your brand influences the curve. To explore this, we banded each of the domains in our study into small, medium and large categories based on the sum of brand query impressions across the entire duration of the study.
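A minimal sketch of that banding logic might look like the following. The thresholds are assumptions (the study only tells us that household brands had 1M+ branded keyword impressions), not the exact cut-offs used.

```python
# Minimal sketch: band domains by total branded-query impressions over the
# study period. Thresholds and domains are made up for illustration.
def brand_band(branded_impressions):
    if branded_impressions >= 1_000_000:
        return "large"
    if branded_impressions >= 100_000:
        return "medium"
    return "small"

domains = {"bigbrand.co.uk": 4_200_000,
           "midbrand.co.uk": 310_000,
           "nichebrand.co.uk": 45_000}

for domain, impressions in domains.items():
    print(domain, "->", brand_band(impressions))
```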

[Figure: organic CTR curves for small, medium and large brands]

When we look at how brand bias is influencing CTR for non-branded search queries, we can see that better known brands get a sizable increase in CTR. More importantly, small- to medium-size brands are actually losing out to results from these better-known brands and experience a much lower CTR in comparison.

What is clear is that keyphrase strategy will be important for smaller brands in order to gain traction in natural search. Identifying and targeting valuable search queries that aren’t already dominated by major brands will minimize the cannibalization of CTR and ensure higher traffic levels as a result.

How does mobile CTR reflect changing search behavior?

Mobile search has become a huge part of our daily lives, and our clients are seeing a substantial shift in natural search traffic from desktop to mobile devices. According to Google, 30% of all searches made in 2013 were on a mobile device; they also predict mobile searches will constitute over 50% of all searches in 2014.

Understanding CTR from mobile devices will be vital as the mobile search revolution continues. It was interesting to see that the click curve remained very similar to our desktop curve. Despite the lack of screen real estate, searchers are clearly motivated to scroll below the fold and beyond the top 10.

[Figure: NetBooster mobile organic CTR curve for positions 1-30]

NetBooster CTR curves for top 30 organic positions


Position  Desktop CTR  Mobile CTR  Large Brand CTR  Medium Brand CTR  Small Brand CTR
1 19.35% 20.28% 20.84% 13.32% 8.59%
2 15.09% 16.59% 16.25% 9.77% 8.92%
3 11.45% 13.36% 12.61% 7.64% 7.17%
4 8.68% 10.70% 9.91% 5.50% 6.19%
5 7.21% 7.97% 8.08% 4.69% 5.37%
6 5.85% 6.38% 6.55% 4.07% 4.17%
7 4.63% 4.85% 5.20% 3.33% 3.70%
8 3.93% 3.90% 4.40% 2.96% 3.22%
9 3.35% 3.15% 3.76% 2.62% 3.05%
10 2.82% 2.59% 3.13% 2.25% 2.82%
11 3.06% 3.18% 3.59% 2.72% 1.94%
12 2.36% 3.62% 2.93% 1.96% 1.31%
13 2.16% 4.13% 2.78% 1.96% 1.26%
14 1.87% 3.37% 2.52% 1.68% 0.92%
15 1.79% 3.26% 2.43% 1.51% 1.04%
16 1.52% 2.68% 2.02% 1.26% 0.89%
17 1.30% 2.79% 1.67% 1.20% 0.71%
18 1.26% 2.13% 1.59% 1.16% 0.86%
19 1.16% 1.80% 1.43% 1.12% 0.82%
20 1.05% 1.51% 1.36% 0.86% 0.73%
21 0.86% 2.04% 1.15% 0.74% 0.70%
22 0.75% 2.25% 1.02% 0.68% 0.46%
23 0.68% 2.13% 0.91% 0.62% 0.42%
24 0.63% 1.84% 0.81% 0.63% 0.45%
25 0.56% 2.05% 0.71% 0.61% 0.35%
26 0.51% 1.85% 0.59% 0.63% 0.34%
27 0.49% 1.08% 0.74% 0.42% 0.24%
28 0.45% 1.55% 0.58% 0.49% 0.24%
29 0.44% 1.07% 0.51% 0.53% 0.28%
30 0.36% 1.21% 0.47% 0.38% 0.26%

Creating your own click curve

This study will give you a set of benchmarks for both non-branded and branded click-through rates against which you can confidently compare your own click curve data. Using this data as a comparison will let you understand whether the appearance of your content is working for or against you.

We have made things a little easier for you by creating an Excel spreadsheet: simply drop your own top search query data in and it’ll automatically create a click curve for your website.

Simply visit the NetBooster website and download our tool to start making your own click curve.

In conclusion

It’s been both a fascinating and rewarding study, and we can clearly see a change in search habits. Whatever the reasons for this evolving search behavior, we need to start thinking beyond the top 10, as pages two and three are likely to get more traffic in future. We also need to maximize the traffic created from existing rankings and not just think about position.

Most importantly, we can see practical applications of this data for anyone looking to understand and maximize their content’s performance in natural search. Having the ability to quickly and easily create your own click curve and compare this against a set of benchmarks means you can now understand whether you have an optimal CTR.

What could be the next steps?

There is, however, plenty of scope for improvement. We are looking forward to continuing our investigation, tracking the evolution of search behavior. If you’d like to explore this subject further, here are a few ideas:

  • Segment search queries by intent (How does CTR vary depending on whether a search query is commercial or informational?)
  • Understand CTR by industry or niche
  • Monitor the effect of new Knowledge Graph formats on CTR across both desktop and mobile search
  • Conduct an annual analysis of search behavior (Are people’s search habits changing? Are they clicking on more results? Are they mining further into Google’s results?)

Ultimately, click curves like this will change as the underlying search behavior continues to evolve. We are now seeing a massive shift in the underlying search technology, with Google in particular heavily investing in entity-based search (i.e., the Knowledge Graph). We can expect other search engines, such as Bing, Yandex and Baidu, to follow suit and use a similar approach.

The rise of smartphone adoption and constant connectivity also means natural search is becoming more focused on mobile devices. Voice-activated search is also a game-changer, as people start to converse with search engines in a more natural way. This has huge implications for how we monitor search activity.

What is clear is no other industry is changing as rapidly as search. Understanding how we all interact with new forms of search results will be a crucial part of measuring and creating success.



Majestic is heading to Chicago

We have recently rebranded and will be visiting ClickZ Chicago as Majestic. We will be there for the whole duration of the conference, from Monday 3rd November until Thursday 6th November. Come visit us at our stand, be the first to check out our new look, and tell us what you think!

The post Majestic is heading to Chicago appeared first on Majestic Blog.


Experiment: We Removed a Major Website from Google Search, for Science!

Posted by Cyrus-Shepard

The folks at Groupon surprised us earlier this summer when they reported the
results of an experiment that showed that up to 60% of direct traffic is organic.

In order to accomplish this, Groupon de-indexed their site, effectively removing themselves from Google search results. That’s crazy talk!

Of course, we knew we had to try this ourselves.

We rolled up our sleeves and chose to de-index
Followerwonk, both for its consistent Google traffic and its good analytics setup—that way we could properly measure everything. We were also confident we could quickly bring the site back into Google’s results, which minimized the business risks.

(We discussed de-indexing our main site moz.com, but… no soup for you!)

We wanted to measure and test several things:

  1. How quickly will Google remove a site from its index?
  2. How much of our organic traffic is actually attributed as direct traffic?
  3. How quickly can you bring a site back into search results using the URL removal tool?

Here’s what happened.

How to completely remove a site from Google

The fastest, simplest, and most direct method to completely remove an entire site from Google search results is by using the URL removal tool.

We also understood, via statements from Google engineers, that using this method gave us the biggest chance of bringing the site back, with little risk. Other methods of de-indexing, such as using meta robots NOINDEX, might have taken weeks and caused recovery to take months.

CAUTION: Removing any URLs from a search index is potentially very dangerous, and should be taken very seriously. Do not try this at home; you will not pass go, and will not collect $200!


After submitting the request, Followerwonk URLs started
disappearing from Google search results in 2-3 hours.

The information needs to propagate across different data centers across the globe, so the effect can be delayed in some areas. In fact, for the entire duration of the test, organic Google traffic continued to trickle in and never dropped to zero.

The effect on direct vs. organic traffic

In the Groupon experiment, they found that when they lost organic traffic, they
actually lost a bunch of direct traffic as well. The Groupon conclusion was that a large amount of their direct traffic was actually organic—up to 60% on “long URLs”.

At first glance, the overall amount of direct traffic to Followerwonk didn’t change significantly, even when organic traffic dropped.

In fact, we could find no discrepancy in direct traffic outside the expected range.
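For what it’s worth, a minimal sketch of that kind of “expected range” check might look like the following. The daily figures are made up for illustration, not Followerwonk’s actual traffic.

```python
# Minimal sketch: compare direct sessions during the test window to a baseline
# period and flag anything more than two standard deviations out. Numbers are
# made up for illustration.
import statistics

baseline_direct = [1450, 1502, 1488, 1530, 1475, 1460, 1510,
                   1495, 1470, 1520, 1485, 1500, 1465, 1490]  # two weeks before the test
test_window_direct = [1480, 1455]                             # days the site was de-indexed

mean = statistics.mean(baseline_direct)
stdev = statistics.stdev(baseline_direct)

for day, sessions in enumerate(test_window_direct, start=1):
    z = (sessions - mean) / stdev
    flag = "outside" if abs(z) > 2 else "within"
    print(f"Test day {day}: {sessions} direct sessions, z = {z:+.1f} ({flag} the expected range)")
```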

I ran this by our contacts at Groupon, who said this wasn’t totally unexpected. You see, in their experiment they saw the biggest drop in direct traffic on
long URLs, defined as URLs that are at least long enough to be in a subfolder, like https://followerwonk.com/bio/?q=content+marketer.

For Followerwonk, the vast majority of traffic goes to the homepage and a handful of other URLs. This means we didn’t have a statistically significant sample size of long URLs to judge the effect. For the long URLs we were able to measure, the results were nebulous. 

Conclusion: While we can’t confirm the Groupon results with our outcome, we can’t discount them either.

It’s quite likely that a portion of your organic traffic is attributed as direct. This is because different browsers, operating systems, and user privacy settings can potentially block referral information from reaching your website.

Bringing your site back from death

After waiting 2 hours,
we deleted the request. Within a few hours all traffic returned to normal. Whew!

Does Google need to recrawl the pages?

If the time period is short enough, and you used the URL removal tool, apparently not.

In the case of Followerwonk, Google removed over
300,000 URLs from its search results, and made them all reappear in mere hours. This suggests that the domain wasn’t completely removed from Google’s index, but only “masked” from appearing for a short period of time.

What about longer periods of de-indexation?

In both the Groupon and Followerwonk experiments, the sites were only de-indexed for a short period of time, and bounced back quickly.

We wanted to find out what would happen if you de-indexed a site for a longer period, like
two and a half days?

I couldn’t convince the team to remove any of our sites from Google search results for a few days, so I chose a smaller personal site that I often subject to merciless SEO experiments.

In this case, I de-indexed the site and didn’t remove the request until three days later. Even with this longer period, all URLs returned within just
a few hours of cancelling the URL removal request.

In the chart below, we revoked the URL removal request on Friday the 25th. The next two days were Saturday and Sunday, both lower traffic days.

Test #2: De-index a personal site for 3 days

Likely, the URLs were still in Google’s index, so we didn’t have to wait for them to be recrawled. 

Here’s another shot of organic traffic before and after the second experiment.

For longer removal periods, a few weeks for example, I speculate Google might drop these semi-permanently from the index, and re-inclusion would take much longer.

What we learned

  1. While a portion of your organic traffic may be attributed as direct (due to browsers, privacy settings, etc.), in our case the effect on direct traffic was negligible.
  2. If you accidentally de-index your site using Google Webmaster Tools, in most cases you can quickly bring it back to life by deleting the request.
  3. Re-inclusion happened quickly even after we removed a site for over two days. Longer than this, the result is unknown, and you could have problems getting all the pages of your site indexed again.

Further reading

Moz community member Adina Toma wrote an excellent YouMoz post on the re-inclusion process using the same technique, with some excellent tips for other, more extreme situations.

Big thanks to
Peter Bray for volunteering Followerwonk for testing. You are a brave man!

