Speaker announcement for dotdigital Summit 2019

Let’s take a look at this year’s first speaker for the dotdigital Summit 2019.

About Tim

Tim is a senior columnist for the Financial Times. His long-running column, ‘The Undercover Economist’, reveals the economic ideas behind everyday experiences. He also writes op-eds, interviews, and long feature articles for the daily newspaper. He’s an evangelist for the power of economics and has spoken at TED, PopTech, and Sydney Opera House.

He is the author of ‘Fifty Things That Made the Modern Economy’, ‘Messy’, and the million-selling ‘The Undercover Economist’. As well as being a senior columnist at the Financial Times, Tim is the presenter of Radio 4’s ‘More or Less’ and the iTunes-topping series ‘Fifty Things That Made the Modern Economy’.


The keynote

During his presentation, Tim will investigate how certain kinds of complexity and obstacles can actually improve our performance. He will cover cognitive psychology, complexity science, and social psychology, and, of course, throw ‘Rock n Roll’ into the mix for good measure.

Can disruption, mess, and crazy moves actually help solve some of our most complex problems? Tim examines this and talks about how we shouldn’t be scared of complexity, as it can in fact make us more creative! Just because you don’t like it doesn’t mean it isn’t helping you!

Don’t miss out!

Join us on the 20th March 2019: don’t miss out on this opportunity to hear a great speaker with exceptional insight into the complexities of everyday experiences, and how we can thrive in them!

Register for the Summit here.

The post Speaker announcement for dotdigital Summit 2019 appeared first on The Marketing Automation Blog.

Reblogged 2 months ago from blog.dotmailer.com

We have an announcement!

Last night, at the dotties 2018, we unveiled our new brand identity to a glittering room of clients, partners and employees. We’re thrilled to be able to share our rationale with you today.

What’s new?

We’ve brought our brand up to speed with a new name and look. From 16th January 2019, we’ll be known as dotdigital. This is not a drastic shift in direction, but to us it signifies our strong commitment to the future of our business and the products we offer our customers.

We’re also changing the name of our platform to Engagement Cloud. In the 20 years we’ve been empowering you, our technology has evolved to meet the ever-advancing needs of your customers. It was becoming clear we needed a name that was more accurate to how you’re using the platform today. Engagement Cloud encompasses all of the same top-notch technology, intelligence and service you rely on to give your customers truly memorable experiences.


What inspired us?

You did! Since 1999, we’ve been evolving our platform to empower your customer engagement. We’ve seen the rise in consumer expectation – and we’ve watched our customers blaze a trail of excellence with data-driven marketing automation. We know that, today, email stands as one component of a much bigger, holistic engagement strategy.

We’ve always been forward-facing, and for us the future brings further evolution, inspiration and empowerment. Our new brand identity is part of this vision.


Want to find out more?

Take a closer look into the future of dotdigital »

The post We have an announcement! appeared first on The Marketing Automation Blog.

Reblogged 4 months ago from blog.dotmailer.com

How to Use Server Log Analysis for Technical SEO

Posted by SamuelScott

It’s ten o’clock. Do you know where your logs are?

I’m introducing this guide with a pun on a common public-service announcement that has run on late-night TV news broadcasts in the United States because log analysis is something that is extremely newsworthy and important.

If your technical and on-page SEO is poor, then nothing else that you do will matter. Technical SEO is the key to helping search engines to crawl, parse, and index websites, and thereby rank them appropriately long before any marketing work begins.

The important thing to remember: Your log files contain the only data that is 100% accurate in terms of how search engines are crawling your website. By helping Google to do its job, you will set the stage for your future SEO work and make your job easier. Log analysis is one facet of technical SEO, and correcting the problems found in your logs will help to lead to higher rankings, more traffic, and more conversions and sales.

Here are just a few reasons why:

  • Too many response code errors may cause Google to reduce its crawling of your website and perhaps even your rankings.
  • You want to make sure that search engines are crawling everything, new and old, that you want to appear and rank in the SERPs (and nothing else).
  • It’s crucial to ensure that all URL redirections will pass along any incoming “link juice.”

However, log analysis is something that is unfortunately discussed all too rarely in SEO circles. So, here, I wanted to give the Moz community an introductory guide to log analytics that I hope will help. If you have any questions, feel free to ask in the comments!

What is a log file?

Computer servers, operating systems, network devices, and computer applications automatically generate something called a log entry whenever they perform an action. In an SEO and digital marketing context, one type of action is whenever a page is requested by a visiting bot or human.

Server log entries are typically output in the standardized Common Log Format. Here is one example from Wikipedia with my accompanying explanations:

127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
  • 127.0.0.1 — The remote hostname. An IP address is shown, like in this example, whenever the DNS hostname is not available or DNSLookup is turned off.
  • user-identifier — The remote logname / RFC 1413 identity of the user. (It’s not that important.)
  • frank — The user ID of the person requesting the page. Based on what I see in my Moz profile, Moz’s log entries would probably show either “SamuelScott” or “392388” whenever I visit a page after having logged in.
  • [10/Oct/2000:13:55:36 -0700] — The date, time, and timezone of the action in question in strftime format.
  • GET /apache_pb.gif HTTP/1.0 — “GET” is one of the most common HTTP methods (another is “POST”). “GET” fetches a URL while “POST” submits something (such as a forum comment). The second part is the URL that is being accessed, and the last part is the version of HTTP that is being used.
  • 200 — The status code of the document that was returned.
  • 2326 — The size, in bytes, of the document that was returned.

Note: A hyphen is shown in a field when that information is unavailable.
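
Putting the pieces above together, a log entry can be parsed programmatically. As a rough sketch (the regex and function name here are illustrative, not from any particular tool), this is how you might pull the fields out of a Common Log Format line in Python:

```python
import re

# One capture group per Common Log Format field described above.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_clf_line(line):
    """Return a dict of the CLF fields, or None if the line is malformed."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None

entry = parse_clf_line(
    '127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
print(entry['status'])   # prints: 200
print(entry['request'])  # prints: GET /apache_pb.gif HTTP/1.0
```

Note that the size field is matched with `\S+` rather than `\d+`, so a hyphen standing in for an unavailable value (per the note above) still parses.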

Every single time that you — or the Googlebot — visit a page on a website, a line with this information is output, recorded, and stored by the server.

Log entries are generated continuously and anywhere from several to thousands can be created every second — depending on the level of a given server, network, or application’s activity. A collection of log entries is called a log file (or often in slang, “the log” or “the logs”), and it is displayed with the most-recent log entry at the bottom. Individual log files often contain a calendar day’s worth of log entries.

Accessing your log files

Different types of servers store and manage their log files differently. Here are the general guides to finding and managing log data on three of the most-popular types of servers:

What is log analysis?

Log analysis (or log analytics) is the process of going through log files to learn something from the data. Some common reasons include:

  • Development and quality assurance (QA) — Creating a program or application and checking for problematic bugs to make sure that it functions properly
  • Network troubleshooting — Responding to and fixing system errors in a network
  • Customer service — Determining what happened when a customer had a problem with a technical product
  • Security issues — Investigating incidents of hacking and other intrusions
  • Compliance matters — Gathering information in response to corporate or government policies
  • Technical SEO — This is my favorite! More on that in a bit.

Log analysis is rarely performed regularly. Usually, people go into log files only in response to something — a bug, a hack, a subpoena, an error, or a malfunction. It’s not something that anyone wants to do on an ongoing basis.

Why? This is a screenshot of just a very small part of one of our original (unstructured) log files:

Ouch. If a website gets 10,000 visitors who each go to ten pages per day, then the server will create a log file every day that will consist of 100,000 log entries. No one has the time to go through all of that manually.

How to do log analysis

There are three general ways to make log analysis easier in SEO or any other context:

  • Do-it-yourself in Excel
  • Proprietary software such as Splunk or Sumo Logic
  • The ELK Stack open-source software

Tim Resnik’s Moz essay from a few years ago walks you through the process of exporting a batch of log files into Excel. This is a (relatively) quick and easy way to do simple log analysis, but the downside is that one will see only a snapshot in time and not any overall trends. To obtain the best data, it’s crucial to use either proprietary tools or the ELK Stack.

Splunk and Sumo Logic are proprietary log analysis tools that are primarily used by enterprise companies. The ELK Stack is a free and open-source batch of three platforms (Elasticsearch, Logstash, and Kibana) that is owned by Elastic and used more often by smaller businesses. (Disclosure: We at Logz.io use the ELK Stack to monitor our own internal systems as well as for the basis of our own log management software.)

For those who are interested in using this process to do technical SEO analysis, monitor system or application performance, or for any other reason, our CEO, Tomer Levy, has written a guide to deploying the ELK Stack.

Technical SEO insights in log data

However you choose to access and understand your log data, there are many important technical SEO issues to address as needed. I’ve included screenshots of our technical SEO dashboard with our own website’s data to demonstrate what to examine in your logs.

Bot crawl volume

It’s important to know the number of requests made by Baidu, BingBot, GoogleBot, Yahoo, Yandex, and others over a given period of time. If, for example, you want to get found in search in Russia but Yandex is not crawling your website, that is a problem. (You’d want to consult Yandex Webmaster and see this article on Search Engine Land.)
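
As a sketch of how you could tally this yourself from raw logs (the signature strings below are common user-agent substrings, but treat them as assumptions, and note that user agents can be spoofed, so a serious audit would also verify crawler IPs via reverse DNS):

```python
from collections import Counter

# Substrings commonly seen in crawler user agents (assumed, not exhaustive).
BOT_SIGNATURES = {
    'Googlebot': 'GoogleBot',
    'bingbot': 'BingBot',
    'Baiduspider': 'Baidu',
    'YandexBot': 'Yandex',
    'Slurp': 'Yahoo',
}

def bot_crawl_volume(log_lines):
    """Count requests per search-engine crawler. Assumes Combined Log
    Format, where the user agent appears in the final quoted field."""
    volume = Counter()
    for line in log_lines:
        for signature, name in BOT_SIGNATURES.items():
            if signature in line:
                volume[name] += 1
                break
    return volume
```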

Response code errors

Moz has a great primer on the meanings of the different status codes. I have an alert system set up that tells me about 4XX and 5XX errors immediately because those are very significant.
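
A simple filter over the logs can feed that kind of alert. A sketch (again assuming whitespace-split Common Log Format lines, with the status code as the ninth field):

```python
def error_entries(log_lines):
    """Yield (status, url) for every 4XX or 5XX response."""
    for line in log_lines:
        fields = line.split()
        if len(fields) < 9 or not fields[8].isdigit():
            continue  # skip malformed entries
        if fields[8][0] in ('4', '5'):
            yield fields[8], fields[6]
```

Piping the output of something like this into an email or chat notification is one way to hear about 404s and 500s before your rankings do.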

Temporary redirects

Temporary 302 redirects do not pass along the “link juice” of external links from the old URL to the new one. Almost all of the time, they should be changed to permanent 301 redirects.

Crawl budget waste

Google assigns a crawl budget to each website based on numerous factors. If your crawl budget is, say, 100 pages per day (or the equivalent amount of data), then you want to be sure that all 100 are things that you want to appear in the SERPs. No matter what you write in your robots.txt file and meta-robots tags, you might still be wasting your crawl budget on advertising landing pages, internal scripts, and more. The logs will tell you — I’ve outlined two script-based examples in red above.

If you hit your crawl limit but still have new content that should be indexed to appear in search results, Google may abandon your site before finding it.

Duplicate URL crawling

The addition of URL parameters — typically used in tracking for marketing purposes — often results in search engines wasting crawl budgets by crawling different URLs with the same content. To learn how to address this issue, I recommend reading the resources on Google and Search Engine Land here, here, here, and here.
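
One quick way to spot this in your own logs is to strip the query string from each crawled URL and look for paths that were fetched under many different parameter variations. A rough sketch (the function name is illustrative):

```python
from collections import Counter
from urllib.parse import urlsplit

def duplicate_crawl_report(requested_urls):
    """Count crawls per path, ignoring query strings; any path crawled
    more than once is a candidate for crawl-budget waste."""
    by_path = Counter(urlsplit(url).path for url in requested_urls)
    return {path: n for path, n in by_path.items() if n > 1}

hits = [
    '/shoes?utm_source=email',
    '/shoes?utm_source=ppc',
    '/shoes',
    '/about',
]
print(duplicate_crawl_report(hits))  # prints: {'/shoes': 3}
```

Paths that show up with dozens of parameter variants are the ones to address with canonical tags or the URL parameter settings in Google Search Console.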

Crawl priority

Google might be ignoring (and not crawling or indexing) a crucial page or section of your website. The logs will reveal what URLs and/or directories are getting the most and least attention. If, for example, you have published an e-book that attempts to rank for targeted search queries but it sits in a directory that Google only visits once every six months, then you won’t get any organic search traffic from the e-book for up to six months.

If a part of your website is not being crawled very often — and it is updated often enough that it should be — then you might need to check your internal-linking structure and the crawl-priority settings in your XML sitemap.
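
The logs can quantify that attention directly. A sketch that buckets Googlebot requests by top-level directory (assuming Combined Log Format lines, with the request path as the seventh whitespace-separated field):

```python
from collections import Counter

def crawl_frequency_by_directory(log_lines):
    """Count Googlebot requests per top-level directory of your site."""
    hits = Counter()
    for line in log_lines:
        if 'Googlebot' not in line:
            continue
        fields = line.split()
        if len(fields) < 7:
            continue  # skip malformed entries
        path = fields[6]  # e.g. /blog/post-1
        top = '/' + path.lstrip('/').split('/')[0].split('?')[0]
        hits[top] += 1
    return hits
```

A directory that holds your most important content but barely registers here is a signal to revisit internal linking and sitemap priorities.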

Last crawl date

Have you uploaded something that you hope will be indexed quickly? The log files will tell you when Google has crawled it.
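
Pulling that date out of the logs is mostly a matter of parsing the CLF timestamp. A sketch (the function name is illustrative; assumes the user agent appears somewhere in the line, as in Combined Log Format):

```python
from datetime import datetime

def last_crawl(log_lines, url):
    """Return the most recent Googlebot request time for a URL,
    or None if it has never been crawled."""
    latest = None
    for line in log_lines:
        fields = line.split()
        if 'Googlebot' not in line or len(fields) < 7 or fields[6] != url:
            continue
        # CLF timestamp, e.g. [10/Oct/2000:13:55:36 -0700]
        stamp = datetime.strptime(
            fields[3].lstrip('[') + ' ' + fields[4].rstrip(']'),
            '%d/%b/%Y:%H:%M:%S %z',
        )
        if latest is None or stamp > latest:
            latest = stamp
    return latest
```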

Crawl budget

One thing I personally like to check and see is Googlebot’s real-time activity on our site because the crawl budget that the search engine assigns to a website is a rough indicator — a very rough one — of how much it “likes” your site. Google ideally does not want to waste valuable crawling time on a bad website. Here, I had seen that Googlebot had made 154 requests of our new startup’s website over the prior twenty-four hours. Hopefully, that number will go up!

As I hope you can see, log analysis is critically important in technical SEO. It’s eleven o’clock — do you know where your logs are now?

Additional resources

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Reblogged 3 years ago from tracking.feedpress.it

Why the Majestic name change?

Everyone we’ve spoken to on the road over the last couple of weeks has welcomed the announcement about us dropping the SEO from our name. For those of you who we haven’t had the chance to explain to in person, here’s the lowdown on why we’re now Majestic.com. It’s always a risk for any company…

The post Why the Majestic name change? appeared first on Majestic Blog.

Reblogged 4 years ago from blog.majestic.com

The Month Google Shook the SERPs

Posted by Dr-Pete

As a group, we SEOs still tend to focus most of our attention on just one place – traditional, organic results. In the past two years, I’ve spent a lot of time studying these results and how they change over time. The more I experience the reality of SERPs in the wild, though, the more I’ve become interested in situations like this one (a search for “diabetes symptoms”)…

See the single blue link and half-snippet on the bottom-left? That’s the only thing about this above-the-fold page that most SEOs in 2014 would call “organic”. Of course, it’s easy to find fringe cases, but the deeper I dig into the feature landscape that surrounds and fundamentally alters SERPs, the more I find that the exceptions are inching gradually closer to the rule.

Monday, July 28th was my 44th birthday, and I think Google must have decided to celebrate by giving me extra work (hooray for job security?). In the month between June 28th and July 28th, there were four major shake-ups to the SERPs, all of them happening beyond traditional, organic results. This post is a recap of our data on each of those shake-ups.

Authorship photos disappear (June 28)

On June 25th, Google’s John Mueller made a surprise announcement via Google+:

We had seen authorship shake-ups in the past, but the largest recent drop had measured around 15%. It was clear that Google was rethinking the prevalence of author photos and their impact on perceived quality, but most of us assumed this would be a process of small tweaks. Given Google’s push toward Google+ and its inherent tie-in with authorship, not a single SEO I know had predicted a complete loss of authorship photos.

Yet, over the next few days, culminating on the morning of June 28th, a total loss of authorship photos is exactly what happened:

While some authorship photos still appeared in personalized results, the profile photos completely disappeared from general results, after previously being present on about 21% of the SERPs that MozCast tracks. It’s important to note that the concept of authorship remains, and author bylines are still being shown (we track that at about 24%, as of this writing), but the overall visual impact was dramatic for many SERPs.

In-depth gets deeper (July 2nd)

Most SEOs still don’t pay much attention to Google’s “In-depth Articles,” but they’ve been slowly gaining SERP share. When we first started tracking them, they popped up on about 3.5% of the searches MozCast covers. This data seems to only get updated periodically, and the number had grown to roughly 6.0% by the end of June 2014. On the morning of July 2nd, I (and, seemingly, everyone else) missed a major change:

Overnight, the presence of in-depth articles jumped from 6.0% to 12.7%, more than doubling (a +112% increase, to be precise). Some examples of queries that gained in-depth articles include:

  • xbox 360
  • hotels
  • raspberry pi
  • samsung galaxy tab
  • job search
  • pilates
  • payday loans
  • apartments
  • car sales
  • web design

Here’s an example set of in-depth for a term SEOs know all too well, “payday loans”:

The motivation for this change is unclear, and it comes even as Google continues to test designs with pared down in-depth results (almost all of their tests seem to take up less space than the current design). Doubling this feature hardly indicates a lack of confidence, though, and many competitive terms are now showing in-depth results.

Video looks more like radio (July 16th)

Just a couple of weeks after the authorship drop, we saw a smaller but still significant shake-up in video results, with about 28% of results MozCast tracks losing video thumbnails:

As you can see, the presence of thumbnails does vary day-to-day, but the two plateaus, before and after July 16th, are clear here. At this point, the new number seems to be holding.

Since our data doesn’t connect the video thumbnails to specific results, it’s tough to say if this change indicates a removal of thumbnails or a drop in rankings for video results overall. Considering how smaller drops in authorship signaled a much larger change down the road, I think this shift deserves more attention. It could be that Google is generally questioning the value and prevalence of rich snippets, especially when quality concerns come into play.

I originally hypothesized that this might not be a true loss, but could be a sign that some video snippets were switching to the new “mega-video” format (or video answer box, if you prefer). This does not appear to be the case, as the larger video format is still fairly uncommon, and the numbers don’t match up.

For reference, here’s a mega-video format (for the query “bartender”):

Mega-videos are appearing on such seemingly generic queries as “partition”, “headlights”, and “california king bed”. If you have the budget and really want to dominate the SERPs, try writing a pop song.

Pigeons attack local results (July 24th)

By now, many of you have heard of Google’s “Pigeon” update. The Pigeon update hit local SERPs hard and seems to have dramatically changed how Google determines and uses a searcher’s location. Local search is more than an algorithmic layer, though – it’s also a feature set. When Pigeon hit, we saw a sharp decline in local “pack” results (the groups of 2-7 pinned local results):

We initially reported that pack results dropped more than 60% after the Pigeon update. We now are convinced that this was a mistake (indicated by the “?” zone) – essentially, Pigeon changed localization so much that it broke the method we were using. We’ve found a new method that seems to match manually setting your location, and the numbers for July 29-30 are, to the best of my knowledge, accurate.

According to these new numbers, local pack results have fallen 23.4% (in our data set) after the Pigeon update. This is the exact same number Darren Shaw of WhiteSpark found, using a completely different data set and methodology. The perfect match between those two numbers is probably a bit of luck, but they suggest that we’re at least on the right track. While I over-reported the initial drop, and I apologize for any confusion that may have caused, the corrected reality still shows a substantial change in pack results.

It’s important to note that this 23.4% drop is a net change – among queries, there were both losers and winners. Here are 10 searches that lost pack results (and have been manually verified):

  • jobs
  • cars for sale
  • apartments
  • cruises
  • train tickets
  • sofa
  • wheels
  • liposuction
  • social security card
  • motorcycle helmets

A couple of important notes – first, some searches that lost packs only lost packs in certain regions. Second, Pigeon is a very recent update and may still be rolling out or being tweaked. This is only the state of the data as we know it today.

Here are 10 searches that gained pack results (in our data set):

  • skechers
  • mortgage
  • apartments for rent
  • web designer
  • long john silvers
  • lamps
  • mystic
  • make a wish foundation
  • va hospital
  • internet service

The search for “mystic” is an interesting example – no matter what your location (if you’re in the US), Google is showing a pack result for Mystic, CT. This pattern seems to be popping up across the Pigeon update. For example, a search for “California Pizza Kitchen” automatically targets California, regardless of your location (h/t Tony Verre), and a search for “Buffalo Wild Wings” sends you to Buffalo, NY (h/t Andrew Mitschke).

Of course, local search is complex, and it seems like Google is trying to do a lot in one update. The simple fact that a search for “apartments” lost pack results in our data, while “apartments for rent” gained them, shows that the Pigeon update isn’t based on a few simplistic rules.

Some local SEOs have commented that Pigeon seemed to increase the number of smaller packs (2-3 results). Looking at the data for pack size before and after Pigeon, this is what we’re seeing:

Both before and after Pigeon, there are no 1-packs, and 4-, 5-, and 6-packs are relatively rare. After Pigeon, the distribution of 2-packs is similar, but there is a notable jump in 3-packs and a corresponding decrease in 7-packs. The total number of 3-packs actually increased after the Pigeon update. While our data set (once we restrict it to just searches with pack results) is fairly small, this data does seem to match the observations of local SEOs.

Sleep with one eye open

Ok, maybe that’s a bit melodramatic. All of these changes go to show, though, that if you’re laser-focused on ranking alone, you may be missing a lot. We as SEOs not only need to look beyond our own tunnel vision, we need to start paying more attention to post-ranking data, like CTR and search traffic. SERPs are getting richer and more dynamic, and Google can change the rules overnight.


Reblogged 4 years ago from feedproxy.google.com

Your Google Algorithm Cheat Sheet: Panda, Penguin, and Hummingbird

Posted by MarieHaynes

If you’re reading the Moz blog, then you probably have a decent understanding of Google and its algorithm changes. However, there is probably a good percentage of the Moz audience that is still confused about the effects that Panda, Penguin, and Hummingbird can have on your site. I did write a post last year about the main differences between Penguin and a Manual Unnatural Links Penalty, and if you haven’t read that, it’ll give you a good primer.

The point of this article is to explain very simply what each of these algorithms are meant to do. It is hopefully a good reference that you can point your clients to if you want to explain an algorithm change and not overwhelm them with technical details about 301s, canonicals, crawl errors, and other confusing SEO terminologies.

What is an algorithm change?

First of all, let’s start by discussing the Google algorithm. It’s immensely complicated and continues to get more complicated as Google tries its best to provide searchers with the information that they need. When search engines were first created, early search marketers were able to easily find ways to make the search engine think that their client’s site was the one that should rank well. In some cases it was as simple as putting in some code on the website called a meta keywords tag. The meta keywords tag would tell search engines what the page was about.

As Google evolved, its engineers, who were primarily focused on making the search engine results as relevant to users as possible, continued to work on ways to stop people from cheating, and looked at other ways to show the most relevant pages at the top of their searches. The algorithm now looks at hundreds of different factors. There are some that we know are significant, such as having a good descriptive title (between the <title></title> tags in the code), and there are many that are the subject of speculation, such as whether or not Google +1’s contribute to a site’s rankings.

In the past, the Google algorithm would change very infrequently. If your site was sitting at #1 for a certain keyword, it was guaranteed to stay there until the next update which might not happen for weeks or months. Then, they would push out another update and things would change. They would stay that way until the next update happened. If you’re interested in reading about how Google used to push updates out of its index, you may find this Webmaster World forum thread from 2002 interesting. (Many thanks to Paul Macnamara for explaining to me how algo changes used to work on Google in the past and pointing me to the Webmaster World thread.)

This all changed with the launch of “Caffeine” in 2010. Since Caffeine launched, the search engine results have been changing several times a day rather than every few weeks. Google makes over 600 changes to its algorithm in a year, and the vast majority of these are not announced. But, when Google makes a really big change, they give it a name, usually make an announcement, and everyone in the SEO world goes crazy trying to figure out how to understand the changes and use them to their advantage.

Three of the biggest changes that have happened in the last few years are the Panda algorithm, the Penguin algorithm and Hummingbird.

What is the Panda algorithm?

Panda first launched on February 23, 2011. It was a big deal. The purpose of Panda was to try to show high-quality sites higher in search results and demote sites that may be of lower quality. This algorithm change was unnamed when it first came out, and many of us called it the “Farmer” update as it seemed to affect content farms. (Content farms are sites that aggregate information from many sources, often stealing that information from other sites, in order to create large numbers of pages with the sole purpose of ranking well in Google for many different keywords.) However, it affected a very large number of sites. The algorithm change was eventually officially named after one of its creators, Navneet Panda.

When Panda first happened, a lot of SEOs in forums thought that this algorithm was targeting sites with unnatural backlink patterns. However, it turns out that links are most likely not a part of the Panda algorithm. It is all about on-site quality.

In most cases, sites that were affected by Panda were hit quite hard. But, I have also seen sites that have taken a slight loss on the date of a Panda update. Panda tends to be a site-wide issue, which means that it doesn’t just demote certain pages of your site in the search engine results; instead, Google considers the entire site to be of lower quality. In some cases, though, Panda can affect just a section of a site, such as a news blog or one particular subdomain.

Whenever a Google employee is asked about what needs to be done to recover from Panda, they refer to a blog post by Google employee Amit Singhal that gives a checklist that you can use on your site to determine if your site really is high quality or not. Here is the list:

  • Would you trust the information presented in this article?
  • Is this article written by an expert or enthusiast who knows the topic well, or is it more shallow in nature?
  • Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?
  • Would you be comfortable giving your credit card information to this site?
  • Does this article have spelling, stylistic, or factual errors?
  • Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?
  • Does the article provide original content or information, original reporting, original research, or original analysis?
  • Does the page provide substantial value when compared to other pages in search results?
  • How much quality control is done on content?
  • Does the article describe both sides of a story?
  • Is the site a recognized authority on its topic?
  • Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care?
  • Was the article edited well, or does it appear sloppy or hastily produced?
  • For a health related query, would you trust information from this site?
  • Would you recognize this site as an authoritative source when mentioned by name?
  • Does this article provide a complete or comprehensive description of the topic?
  • Does this article contain insightful analysis or interesting information that is beyond obvious?
  • Is this the sort of page you’d want to bookmark, share with a friend, or recommend?
  • Does this article have an excessive amount of ads that distract from or interfere with the main content?
  • Would you expect to see this article in a printed magazine, encyclopedia or book?
  • Are the articles short, unsubstantial, or otherwise lacking in helpful specifics?
  • Are the pages produced with great care and attention to detail vs. less attention to detail?
  • Would users complain when they see pages from this site?

Phew! That list is pretty overwhelming! These questions do not necessarily mean that Google tries to algorithmically figure out whether your articles are interesting or whether you have told both sides of a story. Rather, the questions are there because all of these factors can contribute to how real-life users would rate the quality of your site. No one really knows all of the factors that Google uses in determining the quality of your site through the eyes of Panda. Ultimately though, the focus is on creating the best site possible for your users.  It is also important that only your best stuff is given to Google to have in its index. There are a few factors that are widely accepted as important things to look at in regards to Panda:

Thin content

A “thin” page is a page that adds little or no value to someone who is reading it. It doesn’t necessarily mean that a page has to be a certain number of words, but quite often, pages with very few words are not super-helpful. If you have a large number of pages on your site that contain just one or two sentences and those pages are all included in the Google index, then the Panda algorithm may determine that the majority of your indexed pages are of low quality.

Having the odd thin page is not going to cause you to run into Panda problems. But, if a big enough portion of your site contains pages that are not helpful to users, then that is not good.

Duplicate content

There are several ways that duplicate content can cause your site to be viewed as a low-quality site by the Panda algorithm. The first is when a site has a large amount of content that is copied from other sources on the web. Let’s say that you have a blog on your site and you populate that blog with articles that are taken from other sources. Google is pretty good at figuring out that you are not the creator of this content. If the algorithm can see that a large portion of your site is made up of content that exists on other sites then this can cause Panda to look at you unfavorably.

You can also run into problems with duplicated content on your own site. One example would be a site that has a large number of products for sale. Perhaps each product has a separate page for each color variation and size, but all of these pages are essentially the same. If one product comes in 20 different colors and each of those comes in 6 different sizes, then that means that you have 120 pages for the same product, all of which are almost identical. Now, imagine that you sell 4,000 products. This means that you’ve got almost half a million pages in the Google index when really 4,000 pages would suffice. In this type of situation, the fix for this problem is to use something called a canonical tag. Moz has a really good guide on using canonical tags here, and Dr. Pete has also written a great article on canonical tag use.
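As a minimal sketch of the idea (the product URLs here are made up for illustration), each color or size variation page would carry a canonical tag pointing at the one main product page:

```html
<!-- On a variation page such as /products/widget?color=red&size=m -->
<!-- (hypothetical URLs, for illustration only) -->
<head>
  <link rel="canonical" href="https://www.example.com/products/widget" />
</head>
```

With this in place, Google treats the variations as one page and consolidates them into the single canonical URL rather than indexing 120 near-duplicates per product.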

Low-quality content

When I write an article and publish it on one of my websites, the only type of information that I want to present to Google is information that is the absolute best of its kind. In the past, many SEOs have given advice to site owners saying that it was important to blog every day and make sure that you are always adding content for Google to index. But, if what you are producing is not high quality content, then you could be doing more harm than good. A lot of Amit Singhal’s questions listed above are asking whether the content on your site is valuable to readers.

Let’s say that I have an SEO blog and every day I take a short blurb from each of the interesting SEO articles that I have read online and publish it as a blog post on my site. Is Google going to want to show searchers my summary of these articles, or would they rather show them the actual articles? Of course my summary is not going to be as valuable as the real thing! Now, let’s say that I have done this every day for 4 years. Now my site has over 4,000 pages that contain information that is not unique and not as valuable as other sites on the same topics.

Here is another example. Let’s say that I am a plumber. I’ve been told that I should blog regularly, so several times a week I write a 2-3 paragraph article on things like, “How to fix a leaky faucet” or “How to unclog a toilet.” But, I’m busy and don’t have much time to put into my website so each article I’ve written contains keywords in the title and a few times in the content, but the content is not in depth and is not that helpful to readers. If the majority of the pages on my site contain information that no one is engaging with, then this can be a sign of low quality in the eyes of the Panda algorithm.

There are other factors that probably play a role in the Panda algorithm. Glenn Gabe recently wrote an excellent article on his evaluation of sites affected by the most recent Panda update. His bullet-point list of things to improve upon when affected by Panda is extremely thorough.

How to recover from a Panda hit

Google refreshes the Panda algorithm approximately monthly. They used to announce whenever they were refreshing the algorithm, but now they only do this if there is a really big change to the Panda algorithm. What happens when the Panda algorithm refreshes is that Google takes a new look at each site on the web and determines whether or not it looks like a quality site in regards to the criteria that the Panda algorithm looks at. If your site was adversely affected by Panda and you have made changes such as removing thin and duplicate content then, when Panda refreshes, you should see that things improve. However, for some sites it can take a couple of Panda refreshes to see the full extent of the improvements. This is because it can sometimes take several months for Google to revisit all of your pages and recognize the changes that you have made.

Every now and then, instead of just refreshing the algorithm, Google does what they call an update. When an update happens, this means that Google has changed the criteria that they use to determine what is and isn’t considered high quality. On May 20, 2014, Google did a major update which they called Panda 4.0. This caused a lot of sites to see significant changes in regards to Panda.

Not every Panda recovery is dramatic. But, if you have been affected by Panda and you work hard to make changes to your site, you really should see some improvement.

What is the Penguin algorithm?

Penguin

The Penguin algorithm initially rolled out on April 24, 2012. The goal of Penguin is to reduce the trust that Google has in sites that have cheated by creating unnatural backlinks in order to gain an advantage in the Google results. While the primary focus of Penguin is on unnatural links, other factors can affect a site in the eyes of Penguin as well. Links, though, are known to be by far the most important thing to look at.

Why are links important?

A link is like a vote for your site. If a well-respected site links to your site, then this is a recommendation for your site. If a small, unknown site links to you, then this vote is not going to count for as much as a vote from an authoritative site. Still, if you can get a large number of these small votes, they really can make a difference. This is why, in the past, SEOs would try to get as many links as they could from any possible source.

Another thing that is important in the Google algorithms is anchor text. Anchor text is the clickable text of a link. So, in this link to a great SEO blog, the anchor text would be “SEO blog.” If Moz.com gets a number of sites linking to them using the anchor text “SEO blog,” that is a hint to Google that people searching for “SEO blog” probably want to see sites like Moz in their search results.
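In HTML, a link like that looks something like this (the URL is shown only as an example):

```html
<!-- The text between the opening and closing tags is the anchor text -->
<a href="https://moz.com/blog">SEO blog</a>
```

Every site linking with that same anchor text reinforces the hint to Google about what the target page is relevant for.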

It’s not hard to see how people could manipulate this part of the algorithm. Let’s say that I am doing SEO for a landscaping company in Orlando. In the past, one of the ways that I could cheat the algorithm into thinking that my company should be ranked highly would be to create a bunch of self-made links using anchor text that contains phrases like Orlando Landscaping Company, Landscapers in Orlando, and Orlando Landscaping. While an authoritative link from a well-respected site is good, what people discovered is that creating a large number of links from low-quality sites was quite effective. As such, SEOs would create links from easy-to-get places like directory listings, self-made articles, and links in comments and forum posts.

While we don’t know exactly what factors the Penguin algorithm looks at, what we do know is that this type of low quality, self made link is what the algorithm is trying to detect. In my mind, the Penguin algorithm is sort of like Google putting a “trust factor” on your links. I used to tell people that Penguin could affect a site on a page or even a keyword level, but Google employee John Mueller has said several times now that Penguin is a sitewide algorithm. This means that if the Penguin algorithm determines that a large number of the links to your site are untrustworthy, then this reduces Google’s trust in your entire site. As such, the whole site will see a reduction in rankings.  

While Penguin affected a lot of sites drastically, I have seen many sites that saw a small reduction in rankings.  The difference, of course, depends on the amount of link manipulation that has been done.

How to recover from a Penguin hit?

Penguin is a filter just like Panda. What that means is that the algorithm is re-run periodically and sites are re-evaluated with each re-run. At this point it is not run very often at all. The last update was October 4, 2013, which means that we have currently been waiting eight months for a new Penguin update. In order to recover from Penguin, you need to identify the unnatural links pointing to your site and either remove them or, if you can’t remove them, ask Google to no longer count them by using the disavow tool. Then, the next time that Penguin refreshes or updates, if you have done a good enough job of cleaning up your unnatural links, you will once again regain trust in Google’s eyes. In some cases, it can take a couple of refreshes for a site to completely escape Penguin, because it can take up to six months for a site’s disavow file to be completely processed.
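For reference, a disavow file is a plain text file uploaded through Google Webmaster Tools; it lists whole domains or individual URLs, with `#` lines as comments. The domains and URLs below are placeholders:

```text
# Removal requested 2014-05-01; site owner never responded
domain:spammy-directory.example
# Individual link that could not be removed
http://forum.example.net/thread?id=123#comment-9
```

A `domain:` line disavows every link from that domain, which is usually safer than listing URLs one by one when an entire site is low quality.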

If you are not certain how to identify which links to your site are unnatural, it is worth reading up on this before you start, as there are some good resources available on the topic.

The disavow tool is something that you should probably only use if you really understand how it works. It is possible to do more harm than good to your site if you disavow the wrong links, so make sure you understand the tool before you begin.

It’s important to note that when sites “recover” from Penguin, they often don’t skyrocket back up to top rankings, as those previously high rankings were probably based on the power of links that are now considered unnatural. Here is some information on what to expect when you have recovered from a link-based penalty or algorithmic issue.

Also, the Penguin algorithm is not the same thing as a manual unnatural links penalty. You do not need to file a reconsideration request to recover from Penguin. You also do not need to document the work that you have done to get links removed, as no Google employee will be manually reviewing your work. As mentioned previously, here is more information on the difference between the Penguin algorithm and a manual unnatural links penalty.

What is Hummingbird?

Hummingbird is a completely different animal than Penguin or Panda. (Yeah, I know…that was a bad pun.) I will commonly get people emailing me telling me that Hummingbird destroyed their rankings. I would say that in almost every case that I have evaluated, this was not true. Google made their announcement about Hummingbird on September 26, 2013. However, at that time, they announced that Hummingbird had already been live for about a month. If the Hummingbird algorithm was truly responsible for catastrophic ranking fluctuations, then we really should have seen an outcry from the SEO world about something drastic happening in August of 2013, and this did not happen. There did seem to be some type of fluctuation around August 21, as reported on Search Engine Roundtable, but not many sites reported huge ranking changes on that day.

If you think that Hummingbird affected you, it’s not a bad idea to look at your traffic to see if you noticed a drop on October 4, 2013 which was actually a refresh of the Penguin algorithm. I believe that a lot of people who thought that they were affected by Hummingbird were actually affected by Penguin which happened just a week after Google made their announcement about Hummingbird.

There are some excellent articles on Hummingbird here and here. Hummingbird was a complete overhaul of the entire Google algorithm. As Danny Sullivan put it, if you consider the Google algorithm as an engine, Panda and Penguin are algorithm changes that were like putting a new part in the engine such as a filter or a fuel pump. But, Hummingbird wasn’t just a new part; it was a completely new engine. That new engine still makes use of many of the old parts (such as Panda and Penguin) but a good amount of the engine is completely original.

The goal of the Hummingbird algorithm is for Google to better understand a user’s query. Bill Slawski, who writes about Google patents, has a great example of this in his post here. He explains that when someone searches for “What is the best place to find and eat Chicago deep dish style pizza?”, Hummingbird is able to discern that by “place” the user is likely interested in results that show “restaurants.” There is speculation that these changes were necessary in order for Google’s voice search to be more effective. When we’re typing a search query, we might type “best Seattle SEO company,” but when we’re speaking a query (i.e. via Google Glass or Google Now) we’re more likely to say something like, “Which firm in Seattle offers the best SEO services?” The point of Hummingbird is to better understand what users mean when they have queries like this.

So how do I recover or improve in the eyes of Hummingbird?

If you read the posts referenced above, the answer to this question is essentially to create content that answers users’ queries rather than just trying to rank for a particular keyword. But really, this is what you should already be doing!

It appears that Google’s goal with all of these algorithm changes (Panda, Penguin and Hummingbird) is to encourage webmasters to publish content that is the best of its kind. Google’s goal is to deliver answers to people who are searching. If you can produce content that answers people’s questions, then you’re on the right track.

I know that that is a really vague answer when it comes to “recovering” from Hummingbird. Hummingbird really is different than Panda and Penguin. When a site has been demoted by the Panda or Penguin algorithm, it’s because Google has lost some trust in the site’s quality, whether it is on-site quality or the legitimacy of its backlinks. If you fix those quality issues you can regain the algorithm’s trust and subsequently see improvements. But, if your site seems to be doing poorly since the launch of Hummingbird, then there really isn’t a way to recover those keyword rankings that you once held. You can, however, get new traffic by finding ways to be more thorough and complete in what your website offers.

Do you have more questions?

My goal in writing this article was to have a resource to point people to when they had basic questions about Panda, Penguin and Hummingbird. Recently, when I published my penalty newsletter, I had a small business owner comment that it was very interesting but that most of it went over their head. I realized that many people outside of the SEO world are greatly affected by these algorithm changes, but don’t have much information on why they have affected their website.

Do you have more questions about Panda, Penguin or Hummingbird? If so, I’d be happy to address them in the comments. I also would love for those of you who are experienced with dealing with websites affected by these issues to comment as well.

