Spam Score: Moz’s New Metric to Measure Penalization Risk

Posted by randfish

Today, I’m very excited to announce that Moz’s Spam Score, an R&D project we’ve worked on for nearly a year, is finally going live. In this post, you can learn more about how we’re calculating spam score, what it means, and how you can potentially use it in your SEO work.

How does Spam Score work?

Over the last year, our data science team, led by Dr. Matt Peters, examined a great number of potential factors that predicted that a site might be penalized or banned by Google. We found strong correlations with 17 unique factors we call “spam flags,” and turned them into a score.

Almost every subdomain in Mozscape (our web index) now has a Spam Score attached to it, and this score is viewable inside Open Site Explorer (and soon, the MozBar and other tools). The score is simple; it just records the quantity of spam flags the subdomain triggers. Our correlations showed that no particular flag was more likely than others to mean a domain was penalized/banned in Google, but firing many flags had a very strong correlation (you can see the math below).

Spam Score currently operates only on the subdomain level—we don’t have it for pages or root domains. It’s been my experience and the experience of many other SEOs in the field that a great deal of link spam is tied to the subdomain-level. There are plenty of exceptions—manipulative links can and do live on plenty of high-quality sites—but as we’ve tested, we found that subdomain-level Spam Score was the best solution we could create at web scale. It does a solid job with the most obvious, nastiest spam, and a decent job highlighting risk in other areas, too.

How to access Spam Score

Right now, you can find Spam Score inside Open Site Explorer, both in the top metrics (just below domain/page authority) and in its own tab labeled “Spam Analysis.” Spam Score is only available for Pro subscribers right now, though in the future, we may make the score in the metrics section available to everyone (if you’re not a subscriber, you can check it out with a free trial).

The current Spam Analysis page includes a list of subdomains or pages linking to your site. You can toggle the target to look at all links to a given subdomain on your site, given pages, or the entire root domain. You can further toggle source tier to look at the Spam Score for incoming linking pages or subdomains (but in the case of pages, we’re still showing the Spam Score for the subdomain on which that page is hosted).

You can click on any Spam Score row and see the details about which flags were triggered. We’ll bring you to a page like this:

Back on the original Spam Analysis page, at the very bottom of the rows, you’ll find an option to export a disavow file, which is compatible with Google Webmaster Tools. You can choose to filter the file to contain only those sites with a given spam flag count or higher:

Disavow exports usually take less than 3 hours to finish. We can send you an email when it’s ready, too.

WARNING: Please do not export this file and simply upload it to Google! You can really, really hurt your site’s ranking and there may be no way to recover. Instead, carefully sort through the links therein and make sure you really do want to disavow what’s in there. You can easily remove/edit the file to take out links you feel are not spam. When Moz’s Cyrus Shepard disavowed every link to his own site, it took more than a year for his rankings to return!

We’ve actually made the file not-wholly-ready for upload to Google in order to be sure folks aren’t too cavalier with this particular step. You’ll need to open it up and make some edits (specifically to lines at the top of the file) in order to ready it for Webmaster Tools.
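If you'd rather build your own shortlist for manual review, here is a minimal sketch of the threshold filter described above. It assumes you already have a flag count per linking subdomain (the `flag_counts` dictionary and its values are hypothetical), and it uses the plain-text disavow format Google accepts: one `domain:` entry per line, with `#` marking comments. It is not the format of Moz's own export, which needs the edits mentioned above.

```python
def build_disavow_lines(flag_counts, min_flags):
    """Return candidate disavow lines for subdomains with at least min_flags flags."""
    # The leading comment is a reminder that this is a review list, not a final file.
    lines = ["# Candidate list only: review every entry by hand before uploading"]
    for subdomain, flags in sorted(flag_counts.items()):
        if flags >= min_flags:
            lines.append(f"domain:{subdomain}")
    return lines


# Hypothetical flag counts for three linking subdomains.
flag_counts = {
    "spammy.example.com": 11,
    "blog.example.org": 2,
    "links.example.net": 8,
}

candidates = build_disavow_lines(flag_counts, min_flags=8)
print("\n".join(candidates))
```

The output is only a starting point: the warning above still applies, and every line should be checked by a person before anything goes to Google.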

In the near future, we hope to have Spam Score in the MozBar as well, which might look like this:

Sweet, right? 🙂

Potential use cases for Spam Analysis

This list probably isn’t exhaustive, but these are a few of the ways we’ve been playing around with the data:

  1. Checking for spammy links to your own site: Almost every site has at least a few bad links pointing to it, but it’s been hard to know how much or how many potentially harmful links you might have until now. Run a quick spam analysis and see if there’s enough there to cause concern.
  2. Evaluating potential links: This is a big one where we think Spam Score can be helpful. It’s not going to catch every potentially bad link, and you should certainly still use your brain for evaluation too, but as you’re scanning a list of link opportunities or surfing to various sites, having the ability to see if they fire a lot of flags is a great warning sign.
  3. Link cleanup: Link cleanup projects can be messy, involved, precarious, and massively tedious. Spam Score might not catch everything, but sorting links by it can be hugely helpful in identifying potentially nasty stuff, and filtering out the more probably clean links.
  4. Disavow Files: Again, because Spam Score won’t perfectly catch everything, you will likely need to do some additional work here (especially if the site you’re working on has done some link buying on more generally trustworthy domains), but it can save you a heap of time evaluating and listing the worst and most obvious junk.

Over time, we’re also excited about using Spam Score to help improve the PA and DA calculations (it’s not currently in there), as well as adding it to other tools and data sources. We’d love your feedback and insight about where you’d most want to see Spam Score get involved.

Details about Spam Score’s calculation

This section comes courtesy of Moz’s head of data science, Dr. Matt Peters, who created the metric and deserves (at least in my humble opinion) a big round of applause. – Rand

Definition of “spam”

Before diving into the details of the individual spam flags and their calculation, it’s important to first describe our data gathering process and “spam” definition.

For our purposes, we followed Google’s definition of spam and gathered labels for a large number of sites as follows.

  • First, we randomly selected a large number of subdomains from the Mozscape index stratified by mozRank.
  • Then we crawled the subdomains and threw out any that didn’t return a “200 OK” (redirects, errors, etc).
  • Finally, we collected the top 10 de-personalized, geo-agnostic Google-US search results using the full subdomain name as the keyword and checked whether any of those results matched the original keyword. If they did not, we called the subdomain “spam,” otherwise we called it “ham.”
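The labeling rule in the last step can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `label_subdomain` is made up, the search results are passed in as a plain list of URLs, and "matched" is taken to mean that a result is hosted on exactly the subdomain that was searched for. The post does not spell out the matching rule in more detail than that.

```python
from urllib.parse import urlparse


def label_subdomain(subdomain, result_urls):
    """Label a subdomain 'ham' if any of the top 10 results is hosted on it, else 'spam'."""
    target = subdomain.lower()
    for url in result_urls[:10]:
        # hostname is None for malformed URLs, so fall back to an empty string.
        host = (urlparse(url).hostname or "").lower()
        if host == target:
            return "ham"
    return "spam"


# Hypothetical search results for two subdomains.
print(label_subdomain("blog.example.com", ["http://blog.example.com/post", "http://other.example/"]))
print(label_subdomain("shop.example.com", ["http://unrelated.example/"]))
```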

We performed the most recent data collection in November 2014 (after the Penguin 3.0 update) for about 500,000 subdomains.

Relationship between number of flags and spam

The overall Spam Score is currently an aggregate of 17 different “flags.” You can think of each flag as a potential “warning sign” that signals that a site may be spammy. The overall likelihood of spam increases as a site accumulates more and more flags, so that the total number of flags is a strong predictor of spam. Accordingly, the flags are designed to be used together: no single flag, or even a few flags, is cause for concern (and indeed most sites will trigger at least a few flags).

The following table shows the relationship between the number of flags and percent of sites with those flags that we found Google had penalized or banned:

ABOVE: The overall probability of spam vs. the number of spam flags. Data collected in Nov. 2014 for approximately 500K subdomains. The table also highlights the three overall danger levels: low/green (<10%), moderate/yellow (10-50%), and high/red (>50%).
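The three danger levels in the caption translate into a simple threshold check. The sketch below takes a spam probability as input rather than a flag count, since the per-count percentages live in the table itself; the function name `danger_level` is an assumption for illustration.

```python
def danger_level(spam_probability):
    """Map a spam probability (0.0 to 1.0) to the caption's three danger levels."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")
    if spam_probability < 0.10:
        return "low/green"
    if spam_probability <= 0.50:
        return "moderate/yellow"
    return "high/red"


for p in (0.02, 0.30, 0.75):
    print(p, danger_level(p))
```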

The overall spam percent averaged across a large number of sites increases in lockstep with the number of flags; however, there are outliers in every category. For example, there are a small number of sites with very few flags that are tagged as spam by Google and, conversely, a small number of sites with many flags that are not spam.

Spam flag details

The individual spam flags capture a wide range of spam signals: link profiles, anchor text, on-page signals, and properties of the domain name. At a high level, the process to determine the spam flags for each subdomain is:

  • Collect link metrics from Mozscape (mozRank, mozTrust, number of linking domains, etc).
  • Collect anchor text metrics from Mozscape (top anchor text phrases sorted by number of links)
  • Collect the top five pages by Page Authority on the subdomain from Mozscape
  • Crawl the top five pages plus the home page and process to extract on page signals
  • Provide the output for Mozscape to include in the next index release cycle

Since the spam flags are incorporated into the Mozscape index, fresh data is released with each new index. Right now, we crawl and process the spam flags for each subdomain every two to three months, although this may change in the future.

Link flags

The following table lists the link and anchor text related flags with the odds ratio for each flag. For each flag, we can compute two percents: the percent of sites with that flag that are penalized by Google and the percent of sites with that flag that were not penalized. The odds ratio is the ratio of these percents and gives the increase in likelihood that a site is spam if it has the flag. For example, the first row says that a site with this flag is 12.4 times more likely to be spam than one without the flag.

ABOVE: Description and odds ratio of link and anchor text related spam flags. In addition to a description, it lists the odds ratio for each flag, which gives the overall increase in spam likelihood if the flag is present.
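For readers who want to see the arithmetic, here is the textbook odds ratio computed from a 2x2 table of counts (spam vs. ham, flag present vs. absent). The description in this post is brief, so this may not be exactly how the published figures were computed, and the counts below are invented purely for illustration.

```python
def odds_ratio(spam_with, ham_with, spam_without, ham_without):
    """Textbook odds ratio: odds of spam with the flag divided by odds of spam without it."""
    odds_with = spam_with / ham_with
    odds_without = spam_without / ham_without
    return odds_with / odds_without


# Hypothetical counts, not Moz data:
# 200 of 1,000 flagged sites are spam; 50 of 1,000 unflagged sites are spam.
ratio = odds_ratio(spam_with=200, ham_with=800, spam_without=50, ham_without=950)
print(round(ratio, 2))
```

A ratio above 1 means the flag is associated with higher odds of spam; a ratio near 1 means the flag carries little information on its own.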

Working down the table, the flags are:

  • Low mozTrust to mozRank ratio: Sites with low mozTrust compared to mozRank are likely to be spam.
  • Large site with few links: Large sites with many pages tend to also have many links and large sites without a corresponding large number of links are likely to be spam.
  • Site link diversity is low: If a large percentage of links to a site are from a few domains it is likely to be spam.
  • Ratio of followed to nofollowed subdomains/domains (two separate flags): Sites with a large number of followed links relative to nofollowed are likely to be spam.
  • Small proportion of branded links (anchor text): Organically occurring links tend to contain a disproportionate amount of branded keywords. If a site does not have a lot of branded anchor text, it’s a signal the links are not organic.

On-page flags

Similar to the link flags, the following table lists the on page and domain name related flags:

ABOVE: Description and odds ratio of on page and domain name related spam flags. In addition to a description, it lists the odds ratio for each flag, which gives the overall increase in spam likelihood if the flag is present.

  • Thin content: If a site has a relatively small ratio of content to navigation chrome it’s likely to be spam.
  • Site mark-up is abnormally small: Non-spam sites tend to invest in rich user experiences with CSS, Javascript and extensive mark-up. Accordingly, a large ratio of text to mark-up is a spam signal.
  • Large number of external links: A site with a large number of external links may look spammy.
  • Low number of internal links: Real sites tend to link heavily to themselves via internal navigation and a relative lack of internal links is a spam signal.
  • Anchor text-heavy page: Sites with a lot of anchor text are more likely to be spam than those with more content and fewer links.
  • External links in navigation: Spam sites may hide external links in the sidebar or footer.
  • No contact info: Real sites prominently display their social and other contact information.
  • Low number of pages found: A site with only one or a few pages is more likely to be spam than one with many pages.
  • TLD correlated with spam domains: Certain TLDs are more spammy than others (e.g. pw).
  • Domain name length: A long subdomain name like “bycheapviagra.freeshipping.onlinepharmacy.com” may indicate keyword stuffing.
  • Domain name contains numerals: Domain names with numerals may be automatically generated and therefore spam.

If you’d like some more details on the technical aspects of the spam score, check out the video of Matt’s 2012 MozCon talk about Algorithmic Spam Detection or the slides (many of the details have evolved, but the overall ideas are the same):

We’d love your feedback

As with all metrics, Spam Score won’t be perfect. We’d love to hear your feedback and ideas for improving the score as well as what you’d like to see from its in-product application in the future. Feel free to leave comments on this post, or to email Matt (matt at moz dot com) and me (rand at moz dot com) privately with any suggestions.

Good luck cleaning up and preventing link spam!



Not a Pro Subscriber? No problem!



Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Announcing the New & Improved Link Intersect Tool

Posted by randfish

Y’all remember how last October, we launched a new section in Open Site Explorer called “Link Opportunities”? While I was proud of that work, there was one section that really disappointed me at the time (and I said as much in my comments on the post).

Well, today, that disappointment is over, because we’re stepping up the Link Intersect tool inside OSE big time:

Literally thousands of sweet, sweet link opportunities are now yours at the click of a button

In the initial launch, Link Intersect used Freshscape (which powers Fresh Web Explorer). Freshscape is great for certain kinds of data – links and mentions that come from newly published pages that are in news sources, blogs, and feeds. But it’s not great for non-news/blogs/feed sources because it’s intentionally avoiding those!

For example, in the screenshot above, I wanted to see all the pages that link to SeriousEats.com and SplendidTable.org but don’t link to SmittenKitchen.com.

That’s 671 more juicy link opportunities, thanks to the hard work of the Moz Big Data and Research Tools teams.

How does the new Link Intersect work?

The tool looks at the top 250,000 links our index has pointing to each of the intersecting targets you enter, and the top 1 million links in our index pointing to the excluded URL.

Link Intersect then runs a differential comparison to determine which of the 250K links to each of the intersecting targets come from the same URL or root domain, and removes any that also appear among the top million links to the excluded URL, root domain, or subdomain.

This means that for sites and pages with massive quantities of links, we may not show every intersecting link we know about, but since the sorting is in Page Authority order, you’ll get the highest quality/most important ones at the top.
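Conceptually, the comparison is a set intersection followed by a set difference. The sketch below is a simplified illustration, not Moz's implementation: the function name, the input lists, and the `page_authority` lookup are all hypothetical, and it ignores the 250K and 1 million caps described above.

```python
def link_intersect(links_a, links_b, links_excluded, page_authority):
    """Sources that link to both targets but not to the excluded one, best Page Authority first."""
    shared = set(links_a) & set(links_b)      # link to both intersecting targets
    remaining = shared - set(links_excluded)  # drop sources that already link to the excluded site
    return sorted(remaining, key=lambda src: page_authority.get(src, 0), reverse=True)


# Hypothetical linking pages for two targets and one excluded site.
links_a = ["news.example/a", "blog.example/b", "forum.example/c", "review.example/r"]
links_b = ["blog.example/b", "forum.example/c", "wiki.example/d", "review.example/r"]
links_excluded = ["forum.example/c"]
page_authority = {"blog.example/b": 40, "review.example/r": 55}

opportunities = link_intersect(links_a, links_b, links_excluded, page_authority)
print(opportunities)
```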

You can use Link Intersect to see three unique views on the data:

  • Pages that link to subdomains (particularly useful if you’re interested in shared links to sites on hosted subdomains like blogspot, wordpress, etc or to a specific subdomain section of a competitor’s site)
  • Pages that link to root domains (my personal favorite, as I find the results the most comprehensive)
  • Root domains that link to the root domains (great if you’re trying to get a broad sense of domain-level outreach/marketing targets)

Note that it’s possible the root domains will actually expose more links than pages because the domain-level link graph is easier and faster to sort through, so the 250K limit is less of a barrier.

Like most of the reports in Open Site Explorer, Link Intersect comes with a handy CSV Export option:

When it finishes (my most recent one took just under 3 minutes to run and email me), you’ll get a nice email like this one:

Please ignore the grammatical errors. I’m sure our team will fix those up soon 🙂

Why are these such good link/outreach/marketing targets?

Generally speaking, this type of data is invaluable for link outreach because these sites and pages are ones that clearly care about the shared topics or content of the intersecting targets. If you enter two of your primary competitors, you’ll often get news media, blog posts, reference resources, events, trade publications, and more that produce content in your topical niche.

They’re also good targets because they actually link out! This means you can avoid sifting through sites whose policies or practices mean they’re unlikely to ever link to you – if they’ve linked to those other two chaps, why not you, too?!

Basically, you can check the trifecta of link opportunity goodness boxes (which I’ve helpfully illustrated above, because that’s just the kind of SEO dork I am).

Link Intersect is limited only by your own creativity – so long as you can keep finding sites and pages on the web whose links might also be a match for your own site, we can keep digging through trillions of links, finding the intersects, and giving them back to you.

3 examples of Link Intersect in action

Let’s look at some ways we might put this to use in the real world:

#1: I’m trying to figure out who links to my two big competitors in the world of book reviews

First off, remember that Link Intersect works on a root domain or subdomain level, so we wouldn’t want to use something like the NYTimes’ review of books, because we’d be finding all the intersections to NYTimes.com. Instead, we want to pick more topically-focused domains, like these two:

You’ll also note that I’ve used a fake website as my excluded URL – this is a great trick for when you’re simply interested in any sites/pages that link to two domains and don’t need to remove a particular target.

#2: I’ve got a locally-focused website doing plumbing and need a few link sources to help boost my potential to rank in local and organic SERPs

In this instance, I’ll certainly look at pages linking to combinations of the top ranking sites in the local results, e.g. the 15 results for this query:

This is a solid starting point, especially considering how few links local sites often need to perform well. But we can get creative by branching outside of plumbing and exploring related fields like construction:

Focusing on better-linked-to industries and websites will give more results, so we want to try to broaden rather than narrow our categories and look for the most-linked-to sites in given verticals for comparisons.

#3: I’m planning some new content around weather patterns for my air conditioning website and want to know what news and blog sites cover extreme weather content

First, I’m going to start by browsing some search results for content in this field that’s received some serious link activity. By turning on my MozBar’s SERPs overlay, I can see the sites and pages that have generated loads of links:

Now I can run a few combinations of these through the Link Intersect Tool:

While those domain names make me fear for humanity’s intelligence and future survival, they also expose a great link opportunity tactic I hadn’t previously considered – climate science deniers and the more politically charged universe of climate science overall.


I hope you enjoy the new Link Intersect tool as much as I have been – I think it’s one of the best things we’ve put in Open Site Explorer in the last few months, though what we’re releasing in March might beat even that, so stay tuned!

And, as always, please do give us feedback and feel free to ask questions in the comments below or through the Moz Community Q+A.

Technical Site Audit Checklist: 2015 Edition

Posted by GeoffKenyon

Back in 2011, I wrote a technical site audit checklist, and while it was thorough, there have been a lot of additions to what is encompassed in a site audit. I have gone through and updated that old checklist for 2015. Some of the biggest changes were the addition of sections for mobile, international, and site speed.

This checklist should help you put together a thorough site audit and determine what is holding back the organic performance of your site. At the end of your audit, don’t write a document that says what’s wrong with the website. Instead, create a document that says what needs to be done. Then explain why these actions need to be taken and why they are important. What I’ve found to be really helpful is to provide a prioritized list along with your document of all the actions that you would like them to implement. This list can be handed off to a dev or content team to be implemented easily. These teams can refer to your more thorough document as needed.


Quick overview

Check indexed pages  
  • Do a site: search.
  • How many pages are returned? (This can be way off so don’t put too much stock in this).
  • Is the homepage showing up as the first result? 
  • If the homepage isn’t showing up as the first result, there could be issues, like a penalty or poor site architecture/internal linking, affecting the site. This may be less of a concern as Google’s John Mueller recently said that your homepage doesn’t need to be listed first.

Review the number of organic landing pages in Google Analytics

  • Does this match with the number of results in a site: search?
  • This is often the best view of how many of the pages in a search engine’s index the engine actually finds valuable.

Search for the brand and branded terms

  • Is the homepage showing up at the top, or are correct pages showing up?
  • If the proper pages aren’t showing up as the first result, there could be issues, like a penalty, in play.
Check Google’s cache for key pages
  • Is the content showing up?
  • Are navigation links present?
  • Are there links that aren’t visible on the site?
PRO Tip:
Don’t forget to check the text-only version of the cached page. Here is a bookmarklet to help you do that.

Do a mobile search for your brand and key landing pages

  • Does your listing have the “mobile friendly” label?
  • Are your landing pages mobile friendly?
  • If the answer is no to either of these, it may be costing you organic visits.

On-page optimization

Title tags are optimized
  • Title tags should be optimized and unique.
  • Your brand name should be included in your title tag to improve click-through rates.
  • Title tags should be about 55-60 characters (512 pixels) so that they are fully displayed. You can test here or review title pixel widths in Screaming Frog.
Important pages have click-through rate optimized titles and meta descriptions
  • This will help improve your organic traffic independent of your rankings.
  • You can use SERP Turkey for this.
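The title checks above are easy to script once you have a crawl export. This is a rough sketch under two assumptions: titles are supplied as a list of strings (the sample titles are hypothetical), and character count stands in for pixel width, which it only approximates.

```python
def check_title(title, all_titles, max_chars=60):
    """Return a list of problems for one title: over the length guideline or not unique."""
    problems = []
    if len(title) > max_chars:
        problems.append("too long")
    if all_titles.count(title) > 1:
        problems.append("duplicate")
    return problems


# Hypothetical titles from a crawl.
titles = [
    "Red Running Shoes | Example Store",
    "Red Running Shoes | Example Store",
    "Everything You Ever Wanted to Know About Choosing Running Shoes for Every Season | Example Store",
]

for t in sorted(set(titles)):
    print(check_title(t, titles), t)
```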

Check for pages missing page titles and meta descriptions
  
The on-page content includes the primary keyword phrase multiple times as well as variations and alternate keyword phrases
  
There is a significant amount of optimized, unique content on key pages
 
The primary keyword phrase is contained in the H1 tag
  

Images’ file names and alt text are optimized to include the primary keyword phrase associated with the page.
 
URLs are descriptive and optimized
  • While it is beneficial to include your keyword phrase in URLs, changing your URLs can negatively impact traffic when you do a 301. As such, I typically recommend optimizing URLs when the current ones are really bad or when you don’t have to change URLs with existing external links.
Clean URLs
  • No excessive parameters or session IDs.
  • URLs exposed to search engines should be static.
Short URLs
  • 115 characters or shorter – this character limit isn’t set in stone, but shorter URLs are better for usability.
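Here is a small sketch that applies the clean-URL and short-URL checks to a single URL. The 115-character limit comes from the checklist; the parameter threshold and the list of session ID parameter names are assumptions you would tune for the site being audited.

```python
from urllib.parse import urlparse, parse_qs

# Assumed names of common session ID parameters; adjust for the site in question.
SESSION_KEYS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def url_issues(url, max_length=115, max_params=2):
    """Flag URLs that are long, carry many query parameters, or expose a session ID."""
    issues = []
    if len(url) > max_length:
        issues.append("long")
    params = parse_qs(urlparse(url).query, keep_blank_values=True)
    if len(params) > max_params:
        issues.append("excessive parameters")
    if any(key.lower() in SESSION_KEYS for key in params):
        issues.append("session id")
    return issues


print(url_issues("http://example.com/shoes/red"))
print(url_issues("http://example.com/p?id=1&sort=a&page=2&PHPSESSID=abc"))
```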

Content

Homepage content is optimized
  • Does the homepage have at least one paragraph?
  • There has to be enough content on the page to give search engines an understanding of what a page is about. Based on my experience, I typically recommend at least 150 words.
Landing pages are optimized
  • Do these pages have at least a few paragraphs of content? Is it enough to give search engines an understanding of what the page is about?
  • Is it template text or is it completely unique?
Site contains real and substantial content
  • Is there real content on the site or is the “content” simply a list of links?
Proper keyword targeting
  • Does the intent behind the keyword match the intent of the landing page?
  • Are there pages targeting head terms, mid-tail, and long-tail keywords?
Keyword cannibalization
  • Do a site: search in Google for important keyword phrases.
  • Check for duplicate content/page titles using the Moz Pro Crawl Test.
Content to help users convert exists and is easily accessible to users
  • In addition to search engine driven content, there should be content to help educate users about the product or service.
Content formatting
  • Is the content formatted well and easy to read quickly?
  • Are H tags used?
  • Are images used?
  • Is the text broken down into easy to read paragraphs?
Good headlines on blog posts
  • Good headlines go a long way. Make sure the headlines are well written and draw users in.
Amount of content versus ads
  • Since the implementation of Panda, the amount of ad-space on a page has become important to evaluate.
  • Make sure there is significant unique content above the fold.
  • If you have more ads than unique content, you are probably going to have a problem.

Duplicate content

There should be one URL for each piece of content
  • Do URLs include parameters or tracking code? This will result in multiple URLs for a piece of content.
  • Does the same content reside on completely different URLs? This is often due to products/content being replicated across different categories.
Pro Tip:
Exclude common parameters, such as those used to designate tracking code, in Google Webmaster Tools. Read more at Search Engine Land.
Do a search to check for duplicate content
  • Take a content snippet, put it in quotes and search for it.
  • Does the content show up elsewhere on the domain?
  • Has it been scraped? If the content has been scraped, you should file a content removal request with Google.
Sub-domain duplicate content
  • Does the same content exist on different sub-domains?
Check for a secure version of the site
  • Does the content exist on a secure version of the site?
Check other sites owned by the company
  • Is the content replicated on other domains owned by the company?
Check for “print” pages
  • If there are “printer friendly” versions of pages, they may be causing duplicate content.
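For the parameter and tracking-code case at the top of this section, a crawl export can be grouped by "URL with tracking parameters removed" to see which pages exist under several addresses. This is a minimal sketch: which parameters count as tracking is an assumption (here, anything starting with `utm_` plus a couple of hypothetical extras), and fragments are dropped.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"gclid", "sessionid"}  # assumed extras; extend as needed


def strip_tracking(url):
    """Return the URL without tracking parameters and without its fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES) and key.lower() not in TRACKING_KEYS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_duplicates(urls):
    """Group URLs that collapse to the same address once tracking parameters are removed."""
    groups = {}
    for url in urls:
        groups.setdefault(strip_tracking(url), []).append(url)
    return {clean: found for clean, found in groups.items() if len(found) > 1}


# Hypothetical crawl output.
crawled = [
    "http://example.com/p?id=7&utm_source=news",
    "http://example.com/p?id=7",
    "http://example.com/q",
]
print(group_duplicates(crawled))
```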

Accessibility & Indexation

Check the robots.txt

  • Has the entire site, or important content been blocked? Is link equity being orphaned due to pages being blocked via the robots.txt?
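A quick way to spot-check this is to run your key URLs through a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser` on robots.txt text you have already downloaded; the sample rules and URLs are hypothetical. Note that this parser does not replicate every detail of how Google interprets robots.txt (wildcard handling in particular), so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and key pages.
robots_txt = "User-agent: *\nDisallow: /private/\n"
key_pages = [
    "http://example.com/",
    "http://example.com/private/pricing",
]
print(blocked_urls(robots_txt, key_pages))
```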

Turn off JavaScript, cookies, and CSS

Now change your user agent to Googlebot

PRO Tip:
Use SEO Browser to do a quick spot check.

Check the Moz Pro campaign

  • Check for 4xx errors and 5xx errors.

XML sitemaps are listed in the robots.txt file

XML sitemaps are submitted to Google/Bing Webmaster Tools

Check pages for meta robots noindex tag

  • Are pages accidentally being tagged with the meta robots noindex command?
  • Are there pages that should have the noindex command applied?
  • You can check the site quickly via a crawl tool such as Moz or Screaming Frog.
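If you want to check a handful of pages without running a full crawl, the meta robots tag can be read with Python's built-in HTML parser. This sketch only looks at `<meta name="robots">` and `<meta name="googlebot">` tags in HTML you have already fetched; it does not check the `X-Robots-Tag` HTTP header, which can also carry a noindex directive.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html):
    """Return True if the page's meta robots directives include noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


print(has_noindex('<head><meta name="robots" content="NOINDEX, follow"></head>'))
print(has_noindex('<head><meta name="description" content="Our noindex guide"></head>'))
```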

Do goal pages have the noindex command applied?

  • This is important to prevent direct organic visits from showing up as goals in analytics

Site architecture and internal linking

Number of links on a page
Vertical linking structures are in place
  • Homepage links to category pages.
  • Category pages link to sub-category and product pages as appropriate.
  • Product pages link to relevant category pages.
Horizontal linking structures are in place
  • Category pages link to other relevant category pages.
  • Product pages link to other relevant product pages.
Links are in content
  • Does not utilize massive blocks of links stuck in the content to do internal linking.
Footer links
  • Does not use a block of footer links instead of proper navigation.
  • Does not link to landing pages with optimized anchors.
Good internal anchor text
 
Check for broken links
  • Link Checker and Xenu are good tools for this.

Technical issues

Proper use of 301s
  • Are 301s being used for all redirects?
  • If the root is being directed to a landing page, are they using a 301 instead of a 302?
  • Use Live HTTP Headers Firefox plugin to check 301s.
“Bad” redirects are avoided
  • These include 302s, 307s, meta refresh, and JavaScript redirects as they pass little to no value.
  • These redirects can easily be identified with a tool like Screaming Frog.
Redirects point directly to the final URL and do not leverage redirect chains
  • Redirect chains significantly diminish the amount of link equity associated with the final URL.
  • Google has said that they will stop following a redirect chain after several redirects.
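Once a crawler has given you the sequence of responses for a URL, the redirect checks above can be automated. The sketch below assumes each hop is a `(status_code, url)` pair, which is a made-up input shape, and it only covers HTTP-level redirects; meta refresh and JavaScript redirects need a different kind of check.

```python
# Statuses the checklist treats as "bad" redirects at the HTTP level.
BAD_STATUSES = {302, 307}


def audit_redirect_chain(hops):
    """hops: list of (status_code, url) pairs from the first request to the final response."""
    redirects = [(status, url) for status, url in hops if 300 <= status < 400]
    issues = []
    for status, url in redirects:
        if status in BAD_STATUSES:
            issues.append(f"{status} at {url}")
    if len(redirects) > 1:
        issues.append(f"chain of {len(redirects)} redirects")
    return issues


# Hypothetical chain: a 301 followed by a 302 before the final page.
hops = [
    (301, "http://example.com"),
    (302, "http://www.example.com"),
    (200, "http://www.example.com/home"),
]
print(audit_redirect_chain(hops))
```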
Use of JavaScript
  • Is content being served in JavaScript?
  • Are links being served in JavaScript? Is this to do PR sculpting or is it accidental?
Use of iFrames
  • Is content being pulled in via iFrames?
Use of Flash
  • Is the entire site done in Flash, or is Flash used sparingly in a way that doesn’t hinder crawling?
Check for errors in Google Webmaster Tools
  • Google WMT will give you a good list of technical problems that they are encountering on your site (such as 4xx and 5xx errors, inaccessible pages in the XML sitemap, and soft 404s).
XML Sitemaps  
  • Are XML sitemaps in place?
  • Are XML sitemaps covering for poor site architecture?
  • Are XML sitemaps structured to show indexation problems?
  • Do the sitemaps follow proper XML protocols?
Canonical version of the site established through 301s
 
Canonical version of site is specified in Google Webmaster Tools
 
Rel canonical link tag is properly implemented across the site
Uses absolute URLs instead of relative URLs
  • Relative URLs can cause a lot of problems if you have a root domain with secure sections.

Site speed


Review page load time for key pages 

Make sure compression is enabled


Enable caching


Optimize your images for the web


Minify your CSS/JS/HTML

Use a good, fast host
  • Consider using a CDN for your images.
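The compression and caching items can be spot-checked from response headers alone. This sketch takes a dictionary of headers you have already captured (from your browser's developer tools, for example) and looks for the standard `Content-Encoding`, `Cache-Control`, and `Expires` headers. The function name and sample headers are hypothetical, and a real audit would also look at the actual cache lifetimes.

```python
def speed_header_issues(headers):
    """Report missing compression or caching headers on a single response."""
    normalized = {key.lower(): value.lower() for key, value in headers.items()}
    issues = []
    encodings = {token.strip() for token in normalized.get("content-encoding", "").split(",")}
    if not encodings & {"gzip", "br", "deflate"}:
        issues.append("no compression")
    if "cache-control" not in normalized and "expires" not in normalized:
        issues.append("no caching headers")
    return issues


print(speed_header_issues({"Content-Encoding": "gzip", "Cache-Control": "max-age=3600"}))
print(speed_header_issues({"Content-Type": "text/html"}))
```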

Mobile

Review the mobile experience
  • Is there a mobile site set up?
  • If there is, is it a mobile site, responsive design, or dynamic serving?


Make sure analytics are set up if separate mobile content exists


If dynamic serving is being used, make sure the Vary HTTP header is being used

Review how the mobile experience matches up with the intent of mobile visitors
  • Do your mobile visitors have a different intent than desktop based visitors?
Ensure faulty mobile redirects do not exist
  • If your site redirects mobile visitors away from their intended URL (typically to the homepage), you’re likely going to run into issues impacting your mobile organic performance.
Ensure that the relationship between the mobile site and desktop site is established with proper markup
  • If a mobile site (m.) exists, does the desktop equivalent URL point to the mobile version with rel=”alternate”?
  • Does the mobile version canonical to the desktop version?
  • Official documentation.

International

Review international versions indicated in the URL
  • ex: site.com/uk/ or uk.site.com
Enable country based targeting in webmaster tools
  • If the site is targeted to one specific country, is this specified in webmaster tools? 
  • If the site has international sections, are they targeted in webmaster tools?
Implement hreflang / rel alternate if relevant
If there are multiple versions of a site in the same language (such as /us/ and /uk/, both in English), update the copy so that they are both unique
 

Make sure the currency reflects the country targeted
 
Ensure the URL structure is in the native language 
  • Try to avoid having all URLs in the default language

Analytics

Analytics tracking code is on every page
  • You can check this using the “custom” filter in a Screaming Frog Crawl or by looking for self referrals.
  • Are there pages that should be blocked?
There is only one instance of a GA property on a page
  • Having the same Google Analytics property on a page more than once will create problems with pageview-related metrics, such as inflating page views and pages per visit and reducing the bounce rate.
  • It is OK to have multiple different GA properties listed; this won’t cause a problem.
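A rough way to find candidates for this problem is to count how often each property ID appears in a page's source. The sketch assumes classic `UA-XXXXX-Y` style IDs and HTML you have already downloaded; the sample markup is invented. An ID can legitimately appear more than once in the source (in a comment, say), so treat any hit as a prompt to look at the page by hand.

```python
import re
from collections import Counter

# Classic Google Analytics property IDs look like UA-12345-1.
GA_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")


def duplicated_ga_properties(html):
    """Return property IDs that appear more than once in the page source."""
    counts = Counter(GA_ID.findall(html))
    return sorted(pid for pid, seen in counts.items() if seen > 1)


# Hypothetical page source with one property included twice.
page = """
<script>ga('create', 'UA-12345-1', 'auto');</script>
<script>ga('create', 'UA-67890-2', 'auto', 'second');</script>
<script>ga('create', 'UA-12345-1', 'auto');</script>
"""
print(duplicated_ga_properties(page))
```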
Analytics is properly tracking and capturing internal searches
 

Demographics tracking is set up

AdWords and AdSense are properly linked if you are using these platforms
Internal IP addresses are excluded
UTM Campaign Parameters are used for other marketing efforts
Meta refresh and JavaScript redirects are avoided
  • These can artificially lower bounce rates.
Event tracking is set up for key user interactions

This audit covers the main technical elements of a site and should help you uncover any issues that are holding a site back. As with any project, the deliverable is critical. I’ve found focusing on the solution and impact (business case) is the best approach for site audit reports. While it is important to outline the problems, too much detail here can take away from the recommendations. If you’re looking for more resources on site audits, I recommend the following:

Helpful tools for doing a site audit:

Annie Cushing’s Site Audit
Web Developer Toolbar
User Agent Add-on
Firebug
Link Checker
SEObook Toolbar
MozBar (Moz’s SEO toolbar)
Xenu
Screaming Frog
Your own scraper
Inflow’s technical mobile best practices

Should you Track Competitors by Subdomain, Root Domain (or Home Page)?

Gabrielle Benedetti asked a brilliant question the other day: Why is Gabriele Asking? When using Majestic, there is a drop down (or if you have personalized your settings, radio buttons) after you hit the search button: This drop down does not appear by default on the home page, as we interpret what we THINK you…

The post Should you Track Competitors by Subdomain, Root Domain (or Home Page)? appeared first on Majestic Blog.
