Stop Ghost Spam in Google Analytics with One Filter

Posted by CarloSeo

The spam in Google Analytics (GA) is becoming a serious issue. Due to a deluge of referral spam from social buttons, adult sites, and many, many other sources, people are starting to become overwhelmed by all the filters they are setting up to manage the useless data they are receiving.

The good news is, there is no need to panic. In this post, I’m going to focus on the most common mistakes people make when fighting spam in GA, and explain an efficient way to prevent it.

But first, let’s make sure we understand how spam works. A couple of months ago, Jared Gardner wrote an excellent article explaining what referral spam is, including its intended purpose. He also pointed out some great examples of referral spam.

Types of spam

Spam in Google Analytics falls into two types: ghosts and crawlers.

Ghosts

The vast majority of spam is this type. They are called ghosts because they never access your site. It is important to keep this in mind, as it’s key to creating a more efficient solution for managing spam.

As unusual as it sounds, this type of spam doesn’t have any interaction with your site at all. You may wonder how that is possible since one of the main purposes of GA is to track visits to our sites.

They do it by using the Measurement Protocol, which allows people to send data directly to Google Analytics’ servers. Using this method, and probably randomly generated tracking codes (UA-XXXXX-1) as well, the spammers leave a “visit” with fake data, without even knowing who they are hitting.
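To make the mechanism concrete, here is a minimal sketch of what a Measurement Protocol pageview payload looks like. It only builds the query string and sends nothing. The parameter names (v, tid, cid, t, dh, dp, dr) come from the Universal Analytics Measurement Protocol; the helper name build_hit and every value passed to it are made up for illustration. The point to notice is that the hostname (dh) is just another field the sender fills in, which is why ghost hits can carry a fake or empty one.

```python
from urllib.parse import urlencode, parse_qs

def build_hit(tracking_id, hostname, page, referrer):
    """Build (but do not send) a Universal Analytics pageview payload."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # tracking ID, e.g. UA-XXXXX-1
        "cid": "555",        # anonymous client ID (arbitrary placeholder)
        "t": "pageview",     # hit type
        "dh": hostname,      # document hostname: whatever the sender types in
        "dp": page,          # document path
        "dr": referrer,      # document referrer
    }
    return urlencode(params)

# Hypothetical values: nothing here has to correspond to a real site.
payload = build_hit("UA-XXXXX-1", "fake-host.example", "/", "http://spam.example/")
print(payload)
```

Because none of these fields require a visit to your pages, a filter based on the hostname value is a natural place to separate real hits from ghost hits.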

Crawlers

This type of spam, the opposite of ghost spam, does access your site. As the name implies, these spam bots crawl your pages, ignoring rules like those found in robots.txt that are supposed to stop them from reading your site. When they exit your site, they leave a record on your reports that appears similar to a legitimate visit.

Crawlers are harder to identify because they know their targets and use real data. But it is also true that new ones seldom appear. So if you detect a referral in your analytics that looks suspicious, researching it on Google or checking it against this list might help you answer the question of whether or not it is spammy.

Most common mistakes made when dealing with spam in GA

I’ve been following this issue closely for the last few months. According to the comments people have made on my articles and conversations I’ve found in discussion forums, there are primarily three mistakes people make when dealing with spam in Google Analytics.

Mistake #1. Blocking ghost spam from the .htaccess file

One of the biggest mistakes people make is trying to block ghost spam from the .htaccess file.

For those who are not familiar with this file, one of its main functions is to allow/block access to your site. Now we know that ghosts never reach your site, so adding them here won’t have any effect and will only add useless lines to your .htaccess file.

Ghost spam usually shows up for a few days and then disappears. As a result, sometimes people think that they successfully blocked it from here when really it’s just a coincidence of timing.

Then when the spammers later return, they get worried because the solution is not working anymore, and they think the spammer somehow bypassed the barriers they set up.

The truth is, the .htaccess file can only effectively block crawlers such as buttons-for-website.com and a few others since these access your site. Most of the spam can’t be blocked using this method, so there is no other option than using filters to exclude them.

Mistake #2. Using the referral exclusion list to stop spam

Another error is trying to use the referral exclusion list to stop the spam. The name may confuse you, but this list is not intended to exclude referrals in the way we want to for the spam. It has other purposes.

For example, when a customer buys something, sometimes they get redirected to a third-party page for payment. After making a payment, they’re redirected back to your website, and GA records that as a new referral. It is appropriate to use the referral exclusion list to prevent this from happening.

If you try to use the referral exclusion list to manage spam, however, the referral part will be stripped since there is no preexisting record. As a result, a direct visit will be recorded, and you will have a bigger problem than the one you started with: you will still have spam, and direct visits are harder to track.

Mistake #3. Worrying that bounce rate changes will affect rankings

When people see that the bounce rate changes drastically because of the spam, they start worrying about the impact that it will have on their rankings in the SERPs.

This is another mistake commonly made. With or without spam, Google doesn’t take into consideration Google Analytics metrics as a ranking factor. Here is an explanation about this from Matt Cutts, the former head of Google’s web spam team.

And if you think about it, Cutts’ explanation makes sense: although many people use GA, not every site has it installed.

Assuming your site has been hacked

Another common concern when people see strange landing pages coming from spam on their reports is that they have been hacked.

The page that the spam shows on the reports doesn’t exist, and if you try to open it, you will get a 404 page. Your site hasn’t been compromised.

But you have to make sure the page doesn’t exist, because there are cases (not spam) where a site has a security breach and gets injected with pages full of bad keywords to defame the website.

What should you worry about?

Now that we’ve discarded security issues and their effects on rankings, the only thing left to worry about is your data. The fake trail that the spam leaves behind pollutes your reports.

It might have greater or lesser impact depending on your site traffic, but everyone is susceptible to the spam.

Small and midsize sites are the most easily impacted – not only because a big part of their traffic can be spam, but also because usually these sites are self-managed and sometimes don’t have the support of an analyst or a webmaster.

Big sites with a lot of traffic can also be impacted by spam, and although the impact can be insignificant, invalid traffic means inaccurate reports no matter the size of the website. As an analyst, you should be able to explain what’s going on even in the most granular reports.

You only need one filter to deal with ghost spam

Usually it is recommended to add the referral to an exclusion filter after it is spotted. Although this is useful for a quick action against the spam, it has three big disadvantages.

  • Making filters every week for every new spam detected is tedious and time-consuming, especially if you manage many sites.
  • By the time you apply the filter and it starts working, you already have some affected data.
  • Some of the spammers use direct visits along with the referrals. These direct hits won’t be stopped by the filter, so even if you are excluding the referral you will still be receiving invalid traffic, which explains why some people have seen an unusual spike in direct traffic.

Luckily, there is a good way to prevent all these problems. Most of the spam (ghost) works by hitting random GA tracking IDs, meaning the offender doesn’t really know who the target is. For that reason, either the hostname is not set or it uses a fake one. (See report below.)

Ghost-Spam.png

You can see that they use some weird names or don’t even bother to set one. Although there are some known names in the list, these can be easily added by the spammer.

On the other hand, valid traffic will always use a real hostname. In most cases, this will be your domain. But it can also come from paid services, translation services, or any other place where you’ve inserted your GA tracking code.

Valid-Referral.png

Based on this, we can make a filter that will include only hits that use real hostnames. This will automatically exclude all hits from ghost spam, whether they show up as a referral, keyword, pageview, or even as a direct visit.

To create this filter, you will need to find the report of hostnames. Here’s how:

  1. Go to the Reporting tab in GA
  2. Click on Audience in the lefthand panel
  3. Expand Technology and select Network
  4. At the top of the report, click on Hostname

You will see a list of all hostnames, including the ones that the spam uses. Make a list of all the valid hostnames you find, as follows:

  • yourmaindomain.com
  • blog.yourmaindomain.com
  • es.yourmaindomain.com
  • payingservice.com
  • translatetool.com
  • anotheruseddomain.com

For small to medium sites, this list of hostnames will likely consist of the main domain and a couple of subdomains. After you are sure you got all of them, create a regular expression similar to this one:

yourmaindomain\.com|anotheruseddomain\.com|payingservice\.com|translatetool\.com

You don’t need to put all of your subdomains in the regular expression. The main domain will match all of them. If you don’t have a view set up without filters, create one now.
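As a rough sketch, the expression can be assembled and sanity-checked before you paste it into GA. The domain list below reuses the placeholder names from this post; the function names are hypothetical, and the check assumes the filter pattern is applied as a partial match, which is why re.search is used rather than a full match.

```python
import re

# Placeholder hostnames from this post; replace them with the ones in your own report.
valid_domains = [
    "yourmaindomain.com",
    "anotheruseddomain.com",
    "payingservice.com",
    "translatetool.com",
]

def build_hostname_pattern(domains):
    """Escape each domain and join them with '|' to form the filter pattern."""
    return "|".join(re.escape(d) for d in domains)

pattern = build_hostname_pattern(valid_domains)

def is_valid_hostname(hostname):
    """True if the hostname would pass an include filter using `pattern`."""
    # Partial match (assumption): the pattern may appear anywhere in the hostname.
    return re.search(pattern, hostname) is not None

print(pattern)
for host in ["blog.yourmaindomain.com", "es.yourmaindomain.com", "(not set)", "fake-host.example"]:
    print(host, is_valid_hostname(host))
```

Because the match is partial, subdomains pass without being listed. The same property means a hostname that merely contains one of your domains as a substring would pass too, so keep the list as specific as you can.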

Then create a Custom Filter.

Make sure you select INCLUDE, then select “Hostname” on the filter field, and copy your expression into the Filter Pattern box.

You might want to verify the filter before saving to check that everything is okay. Once you’re ready, save it and apply the filter to all the views you want (except the view without filters).

This single filter will get rid of future occurrences of ghost spam that use invalid hostnames, and it doesn’t require much maintenance. But it’s important that every time you add your tracking code to a new service, you add that service’s hostname to the end of the filter expression.

Now you should only need to take care of the crawler spam. Since crawlers access your site, you can block them by adding these lines to the .htaccess file:

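## STOP REFERRER SPAM
# Enable mod_rewrite (harmless if it is already turned on earlier in the file)
RewriteEngine On
# Match requests whose referrer contains a known crawler domain (case-insensitive)
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]
# Answer those requests with 403 Forbidden
RewriteRule .* - [F]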

It is important to note that this file is very sensitive, and misplacing a single character in it can bring down your entire site. Therefore, make sure you create a backup copy of your .htaccess file prior to editing it.

If you don’t feel comfortable messing around with your .htaccess file, you can alternatively make an expression with all the crawlers, and then add it to an exclude filter by Campaign Source.
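Here is a small sketch of that alternative, using only the two crawler domains named in this post. The variable and function names are hypothetical, and the sample source values are invented; the code simply simulates what an exclude filter on Campaign Source would keep, assuming a case-insensitive partial match.

```python
import re

# Crawler referrers named in this post; extend the list as you confirm new ones.
crawler_sources = ["semalt.com", "buttons-for-website.com"]

# Escape only the dots so the expression stays easy to read in the filter box.
exclude_pattern = "|".join(s.replace(".", r"\.") for s in crawler_sources)

def keep_session(campaign_source):
    """True if an exclude filter on Campaign Source would keep this session."""
    return re.search(exclude_pattern, campaign_source, re.IGNORECASE) is None

# Invented sample values for the Campaign Source dimension.
sources = ["google", "semalt.com", "buttons-for-website.com", "(direct)"]
kept = [s for s in sources if keep_session(s)]

print(exclude_pattern)
print(kept)
```

Unlike the hostname filter, this list needs a new entry whenever you confirm a new crawler, so it is worth keeping the source list somewhere you can regenerate the expression from.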

Implement these combined solutions, and you will worry much less about spam contaminating your analytics data. This will have the added benefit of freeing up more time for you to spend actually analyzing your valid data.

After stopping spam, you can also get clean reports from the historical data by using the same expressions in an Advanced Segment to exclude all the spam.

Bonus resources to help you manage spam

If you still need more information to help you understand and deal with the spam on your GA reports, you can read my main article on the subject here: http://www.ohow.co/what-is-referrer-spam-how-stop-it-guide/.

Additional information on how to stop spam can be found at these URLs:

In closing, I am eager to hear your ideas on this serious issue. Please share them in the comments below.

(Editor’s Note: All images featured in this post were created by the author.)

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Measure Your Mobile Rankings and Search Visibility in Moz Analytics

Posted by jon.white

We have launched a couple of new things in Moz Pro that we are excited to share with you all: Mobile Rankings and a Search Visibility score. If you want, you can jump right in by heading to a campaign and adding a mobile engine, or keep reading for more details!

Track your mobile vs. desktop rankings in Moz Analytics

Mobilegeddon came and went with slightly less fanfare than expected, somewhat due to the vast ‘Mobile Friendly’ updates we all did at super short notice (nice work everyone!). Nevertheless, mobile rankings visibility is now firmly on everyone’s radar, and will only become more important over time.

Now you can track your campaigns’ mobile rankings for all of the same keywords and locations you are tracking on desktop.

For this campaign my mobile visibility is almost 20% lower than my desktop visibility and falling; I can drill down to find out why.

Clicking on this will take you into a new Engines tab within your Keyword Rankings page where you can find a more detailed version of this chart as well as a tabular view by keyword for both desktop and mobile. Here you can also filter by label and location.

Here I can see Search Visibility across engines including mobile; in this case, for my branded keywords.

We have given an extra engine to all campaigns

We’ve given customers an extra engine for each campaign, increasing the number from 3 to 4. Use the extra slot to add the mobile engine and unlock your mobile data!

We will begin tracking mobile rankings within 24 hours of your adding the mobile engine to a campaign. Once you are set up, you will notice a new chart on your dashboard comparing Desktop and Mobile Search Visibility.

Measure your Search Visibility score vs. competitors

The overall Search Visibility for my campaign

Along with this change we have also added a Search Visibility score to your rankings data. Use your visibility score to track and report on your overall campaign ranking performance, compare to your competitors, and look for any large shifts that might indicate penalties or algorithm changes. For a deeper drill-down into your data you can also segment your visibility score by keyword labels or locations. Visit the rankings summary page on any campaign to get started.

How is Search Visibility calculated?

Good question!

The Search Visibility score is the percentage of clicks we estimate you receive based on your rankings positions, across all of your keywords.

We take each ranking position for each keyword, multiply by an estimated click-thru-rate, and then take the average of all of your keywords. You can think of it as the percentage of your SERPs that you own. The score is expressed as a percentage, though scores of 100% would be almost impossible unless you are tracking keywords using the “site:” modifier. It is probably more useful to measure yourself vs. your competitors rather than focus on the actual score, but, as a rule of thumb, mid-40s is probably the realistic maximum for non-branded keywords.
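A minimal sketch of that calculation follows. The click-through rates in ESTIMATED_CTR are invented for illustration (the actual curve is not given in this post), and the function name, keywords, and positions are hypothetical as well.

```python
# Hypothetical click-through rates by ranking position; not Moz's real figures.
ESTIMATED_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}

def search_visibility(rankings):
    """Average estimated CTR across all tracked keywords, as a percentage.

    `rankings` maps keyword -> ranking position (None if not ranking).
    Positions missing from ESTIMATED_CTR count as zero estimated clicks.
    """
    if not rankings:
        return 0.0
    total = sum(ESTIMATED_CTR.get(pos, 0.0) for pos in rankings.values())
    return 100.0 * total / len(rankings)

# Invented keyword set: two ranking well, one not ranking, one far down the page.
rankings = {
    "blue widgets": 1,
    "cheap widgets": 3,
    "widget repair": None,
    "widgets near me": 12,
}
score = search_visibility(rankings)
print(f"{score:.1f}%")
```

Since the score is an average over every tracked keyword, adding keywords you do not rank for lowers it even when nothing about your existing rankings has changed.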

Jeremy, our Moz Analytics TPM, came up with this metaphor:

Think of the SERPs for your keywords as villages. Each position on the SERP is a plot of land in SERP-village. The Search Visibility score is the average amount of plots you own in each SERP-village. Prime real estate plots (i.e., better ranking positions, like #1) are worth more. A complete monopoly of real estate in SERP-village would equate to a score of 100%. The Search Visibility score equates to how much total land you own in all SERP-villages.

Some neat ways to use this feature

  • Label and group your keywords, particularly when you add them – As visibility score is an average of all of your keywords, when you add or remove keywords from your campaign you will likely see fluctuations in the score that are unrelated to performance. Solve this by getting in the habit of labeling keywords when you add them. Then segment your data by these labels to track performance of specific keyword groups over time.
  • See how location affects your mobile rankings – Using the Engines tab in Keyword Rankings, use the filters to select just local keywords. Look for big differences between Mobile and Desktop where Google might be assuming local intent for mobile searches but not for desktop. Check out how your competitors perform for these keywords. Can you use this data?

From Editorial Calendars to SEO: Setting Yourself Up to Create Fabulous Content

Posted by Isla_McKetta

Quick note: This article is meant to apply to teams of all sizes, from the sole proprietor who spends all night writing their copy (because they’re doing business during the day) to the copy team who occupies an entire floor and produces thousands of pieces of content per week. So if you run into a section that you feel requires more resources than you can devote just now, that’s okay. Bookmark it and revisit when you can, or scale the step down to a more appropriate size for your team. We believe all the information here is important, but that does not mean you have to do everything right now.

If you thought ideation was fun, get ready for content creation. Sure, we’ve all written some things before, but the creation phase of content marketing is where you get to watch that beloved idea start to take shape.

Before you start creating, though, you want to get (at least a little) organized, and an editorial calendar is the perfect first step.

Editorial calendars

Creativity and organization are not mutually exclusive. In fact, they can feed each other. A solid schedule gives you and your writers the time and space to be wild and creative. If you’re just starting out, this document may be sparse, but it’s no less important. Starting early with your editorial calendar also saves you from creating content willy-nilly and then finding out months later that no one ever finished that pesky (but crucial) “About” page.

There’s no wrong way to set up your editorial calendar, as long as it’s meeting your needs. Remember that an editorial calendar is a living document, and it will need to change as a hot topic comes up or an author drops out.

There are a lot of different types of documents that pass for editorial calendars. You get to pick the one that’s right for your team. The simplest version is a straight-up calendar with post titles written out on each day. You could even use a wall calendar and a Sharpie.

Day        Title                                                 Author
Monday     The Five Colors of Oscar Fashion                      Ellie
Tuesday    12 Fabrics We’re Watching for Fall                    James
Wednesday  Is Charmeuse the New Corduroy?                        Marta
Thursday   Hot Right Now: Matching Your Handbag to Your Hatpin   Laila
Friday     Tea-length and Other Fab Vocab You Need to Know       Alex

Teams who are balancing content for different brands at agencies or other more complex content environments will want to add categories, author information, content type, social promo, and more to their calendars.

Truly complex editorial calendars are more like hybrid content creation/editorial calendars, where each of the steps to create and publish the content is indicated and someone has planned for how long all of that takes. These can be very helpful if the content you’re responsible for crosses a lot of teams and can take a long time to complete. It doesn’t matter if you’re using Excel or a Google Doc, as long as the people who need the calendar can easily access it. Gantt charts can be excellent for this. Here’s a favorite template for creating a Gantt chart in Google Docs (and they only get more sophisticated).

Complex calendars can encompass everything from ideation through writing, legal review, and publishing. You might even add content localization if your empire spans more than one continent to make sure you have the currency, date formatting, and even slang right.

Content governance

Governance outlines who is taking responsibility for your content. Who evaluates your content performance? What about freshness? Who decides to update (or kill) an older post? Who designs and optimizes workflows for your team or chooses and manages your CMS?

All these individual concerns fall into two overarching components to governance: daily maintenance and overall strategy. In the long run it helps if one person has oversight of the whole process, but the smaller steps can easily be split among many team members. Read this to take your governance to the next level.

Finding authors

The scale of your writing enterprise doesn’t have to be limited to the number of authors you have on your team. It’s also important to consider the possibility of working with freelancers and guest authors. Here’s a look at the pros and cons of outsourced versus in-house talent.

                           In-house authors                Guest authors and freelancers
Responsible to             You                             Themselves
Paid by                    You (as part of their salary)   You (on a per-piece basis)
Subject matter expertise   Broad but shallow               Deep but narrow
Capacity for extra work    As you wish                     Show me the Benjamins
Turnaround time            On a dime                       Varies
Communication investment   Less                            More
Devoted audience           Smaller                         Potentially huge

From that table, it might look like in-house authors have a lot more advantages. That’s somewhat true, but do not underestimate the value of occasionally working with a true industry expert who has name recognition and a huge following. Whichever route you take (and there are plenty of hybrid options), it’s always okay to ask that the writers you are working with be professional about communication, payment, and deadlines. In some industries, guest writers will write for links. Consider yourself lucky if that’s true. Remember, though, that the final paycheck can be great leverage for getting a writer to do exactly what you need them to (such as making their deadlines).

Tools to help with content creation

So those are some things you need to have in place before you create content. Now’s the fun part: getting started. One of the beautiful things about the Internet is that new and exciting tools crop up every day to help make our jobs easier and more efficient. Here are a few of our favorites.

Calendars

You can always use Excel or a Google Doc to set up your editorial calendar, but we really like Trello for the ability to gather a lot of information in one card and then drag and drop it into place. Once there are actual dates attached to your content, you might be happier with something like a Google Calendar.

Ideation and research

If you need a quick fix for ideation, turn your keywords into wacky ideas with Portent’s Title Maker. You probably won’t want to write to the exact title you’re given (although “True Facts about Justin Bieber’s Love of Pickles” does sound pretty fascinating…), but it’s a good way to get loose and look at your topic from a new angle.

Once you’ve got that idea solidified, find out what your audience thinks about it by gathering information with Survey Monkey or your favorite survey tool. Or, use Storify to listen to what people are saying about your topic across a wide variety of platforms. You can also use Storify to save those references and turn them into a piece of content or an illustration for one. Don’t forget that a simple social ask can also do wonders.

Format

Content doesn’t have to be all about the words. Screencasts, Google+ Hangouts, and presentations are all interesting ways to approach content. Remember that not everyone’s a reader. Some of your audience will be more interested in visual or interactive content. Make something for everyone.

Illustration

Don’t forget to make your content pretty. It’s not that hard to find free stock images online (just make sure you aren’t violating someone’s copyright). We like Morgue File, Free Images, and Flickr’s Creative Commons. If you aren’t into stock images and don’t have access to in-house graphic design, it’s still relatively easy to add images to your content. Pull a screenshot with Skitch or dress up an existing image with Pixlr. You can also use something like Canva to create custom graphics.

Don’t stop with static graphics, though. There are so many tools out there to help you create gifs, quizzes and polls, maps, and even interactive timelines. Dream it, then search for it. Chances are whatever you’re thinking of is doable.

Quality, not quantity

Mediocre content will hurt your cause

Less is more. That’s not an excuse to pare your blog down to one post per month (check out our publishing cadence experiment), but it is an important reminder that if you’re writing “How to Properly Install a Toilet Seat” two days after publishing “Toilet Seat Installation for Dummies,” you might want to rethink your strategy.

The thing is, and I’m going to use another cliché here to drive home the point, you never get a second chance to make a first impression. Potential customers are roving the Internet right now looking for exactly what you’re selling. And if what they find is an only somewhat informative article stuffed with keywords and awful spelling and grammar mistakes… well, you don’t want that. Oh, and search engines think it’s spammy too…

A word about copyright

We’re not copyright lawyers, so we can’t give you the ins and outs on all the technicalities. What we can tell you (and you already know this) is that it’s not okay to steal someone else’s work. You wouldn’t want them to do it to you. This includes images. So whenever you can, make your own images or find images that you can either purchase the rights to (stock imagery) or license under Creative Commons.

It’s usually okay to quote short portions of text, as long as you attribute the original source (and a link is nice). In general, titles and ideas can’t be copyrighted (though they might be trademarked or patented). When in doubt, asking for permission is smart.

That said, part of the fun of the Internet is the remixing culture which includes using things like memes and gifs. Just know that if you go that route, there is a certain amount of risk involved.

Editing

Your content needs to go through at least one editing cycle by someone other than the original author. There are two types of editing: developmental editing (which looks at the underlying structure of a piece and happens earlier in the writing cycle) and copy editing (which makes sure all the words are there and spelled right in the final draft).

If you have a very small team or are in a rush (and are working with writers that have some skill), you can often skip the developmental editing phase. But know that an investment in that close read of an early draft is often beneficial to the piece and to the writer’s overall growth.

Many content teams peer-edit work, which can be great. Other organizations prefer to run their work by a dedicated editor. There’s no wrong answer, as long as the work gets edited.

Ensuring proper basic SEO

The good news is that search engines are doing their best to get closer and closer to understanding and processing natural language. So good writing (including the natural use of synonyms rather than repeating those keywords over and over and…) will take you a long way towards SEO mastery.

For that reason (and because it’s easy to get trapped in keyword thinking and veer into keyword stuffing), it’s often nice to think of your SEO check as a further edit of the post rather than something you should think about as you’re writing.

But there are still a few things you can do to help cover those SEO bets. Once you have that draft, do a pass for SEO to make sure you’ve covered the following:

  • Use your keyword in your title
  • Use your keyword (or long-tail keyword phrase) in an H2
  • Make sure the keyword appears at least once (though not more than four times, especially if it’s a phrase) in the body of the post
  • Use image alt text (including the keyword when appropriate)
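The checklist above can be sketched as a small script that runs over a plain-text draft. Everything here is an assumption for illustration: the function name, the way the draft is passed in (title, H2 headings, body, and alt texts as separate values), and the sample content. It is a quick sanity check, not a substitute for reading the post.

```python
import re

def seo_check(keyword, title, h2_headings, body, alt_texts):
    """Run the four checklist items on a plain-text draft."""
    kw = keyword.lower()
    # Count case-insensitive occurrences of the keyword phrase in the body.
    body_hits = len(re.findall(re.escape(kw), body.lower()))
    return {
        "keyword_in_title": kw in title.lower(),
        "keyword_in_h2": any(kw in h.lower() for h in h2_headings),
        "body_count_ok": 1 <= body_hits <= 4,
        # Every image needs non-empty alt text (vacuously true with no images).
        "alt_text_present": all(t.strip() for t in alt_texts),
    }

# Invented draft content for demonstration.
report = seo_check(
    keyword="toilet seat",
    title="How to Properly Install a Toilet Seat",
    h2_headings=["Tools you need", "Removing the old toilet seat"],
    body="A loose toilet seat is annoying. Replacing a toilet seat takes ten minutes.",
    alt_texts=["New toilet seat with hinges", ""],
)
for check, passed in report.items():
    print(check, "ok" if passed else "needs attention")
```

In this sample the only item flagged is the alt text, because the second image was left without any.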

Finding time to write when you don’t have any

Writing (assuming you’re the one doing the writing) can require a lot of energy—especially if you want to do it well. The best way to find time to write is to break each project down into little tasks. For example, writing a blog post actually breaks down into these steps (though not always in this order):

  • Research
  • Outline
  • Fill in outline
  • Rewrite and finish post
  • Write headline
  • SEO check
  • Final edit
  • Select hero image (optional)

So if you only have random chunks of time, set aside 15-30 minutes one day (when your research is complete) to write a really great outline. Then find an hour the next day to fill that outline in. After an additional hour the following day, you should have a solid draft by the end of day three (unless you’re dealing with a research-heavy post).

The magic of working this way is that you engage your brain and then give it time to work in the background while you accomplish other tasks. Hemingway used to stop mid-sentence at the end of his writing days for the same reason.

Once you have that draft nailed, the rest of the steps are relatively easy (even the headline, which often takes longer to write than any other sentence, is easier after you’ve immersed yourself in the post over a few days).

Working with design/development

Every designer and developer is a little different, so we can’t give you any blanket cure-alls for inter-departmental workarounds (aka “smashing silos”). But here are some suggestions to help you convey your vision while capitalizing on the expertise of your coworkers to make your content truly excellent.

Ask for feedback

From the initial brainstorm to general questions about how to work together, asking your team members what they think and prefer can go a long way. Communicate all the details you have (especially the unspoken expectations) and then listen.

If your designer tells you up front that your color scheme is years out of date, you’re saving time. And if your developer tells you that the interactive version of that timeline will require four times the resources, you have the info you need to fight for more budget (or reassess the project).

Check in

Things change in the design and development process. If you have interim check-ins already set up with everyone who’s working on the project, you’ll avoid the potential for nasty surprises at the end. Like finding out that no one has experience working with that hot new coding language you just read about and they’re trying to do a workaround that isn’t working.

Proofread

Your job isn’t done when you hand over the copy to your designer or developer. Not only might they need help rewriting some of your text so that it fits in certain areas, they will also need you to proofread the final version. Accidents happen in the copy-and-paste process and there’s nothing sadder than a really beautiful (and expensive) piece of content that wraps up with a typo.

Know when to fight for an idea

Conflict isn’t fun, but sometimes it’s necessary. The more people involved in your content, the more watered down the original idea can get and the more roadblocks and conflicting ideas you’ll run into. Some of that is very useful. But sometimes you’ll get pulled off track. Always remember who owns the final product (this may not be you) and be ready to stand up for the idea if it’s starting to get off track.

We’re confident this list will set you on the right path to creating some really awesome content, but is there more you’d like to know? Ask us your questions in the comments.

The Inbound Marketing Economy

Posted by KelseyLibert

When it comes to job availability and security, the future looks bright for inbound marketers.

The Bureau of Labor Statistics (BLS) projects that employment for marketing managers will grow by 13% between 2012 and 2022. Job security for marketing managers also looks positive according to the BLS, which notes that marketing employees are less likely to be laid off since marketing drives revenue for most businesses.

While the BLS provides growth estimates for managerial-level marketing roles, these projections don’t give much insight into the growth of digital marketing, specifically the disciplines within digital marketing. As we know, “marketing” can refer to a variety of different specializations and methodologies. Since digital marketing is still relatively new compared to other fields, there is not much comprehensive research on job growth and trends in our industry.

To gain a better understanding of the current state of digital marketing careers, Fractl teamed up with Moz to identify which skills and roles are the most in demand and which states have the greatest concentration of jobs.

Methodology

We analyzed 75,315 job listings posted on Indeed.com during June 2015 based on data gathered from job ads containing the following terms:

  • “content marketing” or “content strategy”
  • “SEO” or “search engine marketing”
  • “social media marketing” or “social media management”
  • “inbound marketing” or “digital marketing”
  • “PPC” (pay-per-click)
  • “Google Analytics”

We chose the above keywords based on their likelihood to return results that were marketing-focused roles (for example, just searching for “social media” may return a lot of jobs that are not primarily marketing focused, such as customer service). The occurrence of each of these terms in job listings was quantified and segmented by state. We then combined the job listing data with U.S. Census Bureau population estimates to calculate the jobs per capita for each keyword, giving us the states with the greatest concentration of jobs for a given search query.
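The per-capita step can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used for the study: the state names, listing counts, and populations below are made-up placeholders, and the per-100,000-residents scaling is assumed from the way results are quoted later in the post.

```python
def jobs_per_100k(listings, population):
    """Return the number of job listings per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return listings * 100_000 / population


# Placeholder inputs for illustration only (not figures from the study).
example_listings = {"State A": 500, "State B": 120}
example_population = {"State A": 5_000_000, "State B": 600_000}

rates = {
    state: round(jobs_per_100k(example_listings[state], example_population[state]), 1)
    for state in example_listings
}

# Rank states from highest to lowest concentration of listings.
ranking = sorted(rates, key=rates.get, reverse=True)
```

Normalizing by population is what lets a small state outrank a large one: in the placeholder data, State B has fewer listings overall but a higher concentration.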

Using the same data, we identified which job titles appeared most frequently. We used existing data from Indeed to determine job trends and average salaries. LinkedIn search results were also used to identify keyword growth in user profiles.

Marketing skills are in high demand, but talent is hard to find

As the marketing industry continues to evolve due to emerging technology and marketing platforms, marketers are expected to pick up new skills and broaden their knowledge more quickly than ever before. Many believe this rapid rate of change has caused a marketing skills gap, making it difficult to find candidates with the technical, creative, and business proficiencies needed to succeed in digital marketing.

The ability to combine analytical thinking with creative execution is highly desirable and necessary in today’s marketing landscape. According to an article in The Guardian, “Companies will increasingly look for rounded individuals who can combine analytical rigor with the ability to apply this knowledge in a practical and creative context.” Being both detail-oriented and a big picture thinker is also a sought-after combination of attributes. A report by The Economist and Marketo found that “CMOs want people with the ability to grasp and manage the details (in data, technology, and marketing operations) combined with a view of the strategic big picture.”

But well-rounded marketers are hard to come by. In a study conducted by Bullhorn, 64% of recruiters reported a shortage of skilled candidates for available marketing roles. Wanted Analytics recently found that one of the biggest national talent shortages is for marketing manager roles, with only two available candidates per job opening.

Increase in marketers listing skills in content marketing, inbound marketing, and social media on LinkedIn profiles

While recruiter frustrations may indicate a shallow talent pool, LinkedIn tells a different story—the number of U.S.-based marketers who identify themselves as having digital marketing skills is on the rise. Using data tracked by Rand and LinkedIn, we found the following increases of marketing keywords within user profiles.

growth of marketing keywords in linkedin profiles

The number of profiles containing “content marketing” has seen the largest growth, with a 168% increase since 2013. “Social media” has also seen significant growth with a 137% increase. “Social media” appears on a significantly higher volume of profiles than the other keywords, with more than 2.2 million profiles containing some mention of social media. Although “SEO” has not seen as much growth as the other keywords, it still has the second-highest volume, appearing in 630,717 profiles.

Why is there a growing number of people self-identifying as having the marketing skills recruiters want, yet recruiters think there is a lack of talent?

While there may be a lot of specialists out there, perhaps recruiters are struggling to fill marketing roles due to a lack of generalists or even a lack of specialists with surface-level knowledge of other areas of digital marketing (also known as a T-shaped marketer).

Popular job listings show a need for marketers to diversify their skill set

The data we gathered from LinkedIn confirm this, as the 20 most common digital marketing-related job titles being advertised call for a broad mix of skills.

20 most common marketing job titles

It’s no wonder that marketing manager roles are hard to fill, considering the job ads are looking for proficiency in a wide range of marketing disciplines including social media marketing, SEO, PPC, content marketing, Google Analytics, and digital marketing. Even job descriptions for specialist roles tend to call for skills in other disciplines. A particular role such as SEO Specialist may call for several skills other than SEO, such as PPC, content marketing, and Google Analytics.

Taking a more granular look at job titles, the chart below shows the five most common titles for each search query. One might expect mostly specialist roles to appear here, but there is a high occurrence of generalist positions, such as Digital Marketing Manager and Marketing Manager.

5 most common job titles by search query

Only one job title containing “SEO” cracked the top five. This indicates that SEO knowledge is a desirable skill within other roles, such as general digital marketing and development.

Recruiter was the third most common job title among job listings containing social media keywords, which suggests a need for social media skills in non-marketing roles.

Similar to what we saw with SEO job titles, only one job title specific to PPC (Paid Search Specialist) made it into the top job titles. PPC skills are becoming necessary for more general marketing roles, such as Marketing Manager and Digital Marketing Specialist.

Across all search queries, the most common jobs advertised call for a broad mix of skills. This tells us hiring managers are on the hunt for well-rounded candidates with a diverse range of marketing skills, as opposed to candidates with expertise in one area.

Marketers who cultivate diverse skill sets are better poised to gain an advantage over other job seekers, excel in their job role, and accelerate career growth. Jason Miller says it best in his piece about the new breed hybrid marketer:

future of marketing quote linkedin

Inbound job demand and growth: Most-wanted skills and fastest-growing jobs

Using data from Indeed, we identified which inbound skills have the highest demand and which jobs are seeing the most growth. Social media keywords claim the largest volume of results out of the terms we searched for during June 2015.

number of marketing job listings by keyword

“Social media marketing” or “social media management” appeared the most frequently in the job postings we analyzed, with 46.7% containing these keywords. “PPC” returned the smallest number of results, with only 3.8% of listings containing this term.

Perhaps this is due to social media becoming a more necessary skill across many industries and not only a necessity for marketers (for example, social media’s role in customer service and recruitment). On the other hand, job roles calling for PPC or SEO skills are most likely marketing-focused. The prevalence of social media jobs also may indicate that social media has gained wide acceptance as a necessary part of a marketing strategy. Additionally, social media skills are less valuable compared to other marketing skills, making it cheaper to hire for these positions (we will explore this further in the average salaries section below).

Our search results also included a high volume of jobs containing “digital marketing” and “SEO” keywords, which made up 19.5% and 15.5% respectively. At 5.8%, “content marketing” had the lowest search volume after “PPC.”

Digital marketing, social media, and content marketing experienced the most job growth

While the number of job listings tells us which skills are most in demand today, looking at which jobs are seeing the most growth can give insight into shifting demands.

digital marketing growth on indeed.com

Digital marketing job listings have seen substantial growth since 2009, when they accounted for less than 0.1% of Indeed.com search results. By January 2015, this number had climbed to nearly 0.3%.

social media job growth on indeed.com

While social media marketing jobs have seen some uneven growth, as of January 2015 more than 0.1% of all job listings on Indeed.com contained the term “social media marketing” or “social media management.” This shows a significant upward trend considering this number was around 0.05% for most of 2014. It’s also worth noting that “social media” is currently ranked No. 10 on Indeed’s list of top job trends.

content marketing job growth on indeed.com

Despite its growth from 0.02% to nearly 0.09% of search volume in the last four years, “content marketing” does not make up a large volume of job postings compared to “digital marketing” or “social media.” In fact, “SEO” has seen a decrease in growth but still constitutes a higher percentage of job listings than content marketing.

SEO, PPC, and Google Analytics job growth has slowed down

On the other hand, search volume on Indeed has either decreased or plateaued for “SEO,” “PPC,” and “Google Analytics.”

seo job growth on indeed.com

As we see in the graph, the volume of “SEO” job listings peaked between 2011 and 2012. This is also around the time content marketing began gaining popularity, thanks to the Panda and Penguin updates. The decrease may be explained by companies moving their marketing budgets away from SEO and toward content or social media positions. However, “SEO” still accounts for a significant number of job listings, appearing in more than 0.2% of job listings on Indeed as of 2015.

ppc job growth on indeed.com

“PPC” has seen the most staggered growth among all the search terms we analyzed, with its peak of nearly 0.1% happening between 2012 and 2013. As of January of this year, search volume was below 0.05% for “PPC.”

google analytics job growth on indeed.com

Despite a lack of growth, the need for this skill remains steady. Between 2008 and 2009, “Google Analytics” job ads saw a huge spike on Indeed. Since then, the search volume has tapered off and plateaued through January 2015.

Most valuable skills are SEO, digital marketing, and Google Analytics

So we know the number of social media, digital marketing, and content marketing jobs is on the rise. But which skills are worth the most? We looked at the average salaries based on keywords and estimates from Indeed and salaries listed in job ads.

national average marketing salaries

Job titles containing “SEO” had an average salary of $102,000. Meanwhile, job titles containing “social media marketing” had an average salary of $51,000. Considering such a large percentage of the job listings we analyzed contained “social media” keywords, there is a much larger pool of jobs; therefore, a lot of entry level social media jobs or internships are probably bringing down the average salary.

Job titles containing “Google Analytics” had the second-highest average salary at $82,000, but this should be taken with a grain of salt considering “Google Analytics” will rarely appear as part of a job title. The chart below, which shows average salaries for jobs containing keywords anywhere in the listing as opposed to only in the title, gives a more accurate idea of how much “Google Analytics” job roles earn on average.

national salary averages marketing keywords

Looking at the average salaries based on keywords that appeared anywhere within the job listing (job title, job description, etc.) shows a slightly different picture. Based on this, jobs containing “digital marketing” or “inbound marketing” had the highest average salary of $84,000. “SEO” and “Google Analytics” are tied for second with $76,000 as the average salary.

“Social media marketing” takes the bottom spot with an average salary of $57,000. However, notice that there is a higher average salary for jobs that contain “social media” within the job listing as opposed to jobs that contain “social media” within the title. This suggests that social media skills may be more valuable when combined with other responsibilities and skills, whereas a strictly social media job, such as Social Media Manager or Social Media Specialist, does not earn as much.

Massachusetts, New York, and California have the most career opportunities for inbound marketers

Looking for a new job? Maybe it’s time to pack your bags for Boston.

Massachusetts led the U.S. with the most jobs per capita for digital marketing, content marketing, SEO, and Google Analytics. New York took the top spot for social media jobs per capita, while Utah had the highest concentration of PPC jobs. California ranked in the top three for digital marketing, content marketing, social media, and Google Analytics. Illinois appeared in the top 10 for every term and usually ranked within the top five. Most of the states with the highest job concentrations are in the Northeast, West, and East Coast, with a few exceptions such as Illinois and Minnesota.

But you don’t necessarily have to move to a new state to increase the odds of landing an inbound marketing job. Some unexpected states also made the cut, with Connecticut and Vermont ranking within the top 10 for several keywords.

concentration of digital marketing jobs

marketing jobs per capita

Job listings containing “digital marketing” or “inbound marketing” were most prevalent in Massachusetts, New York, Illinois, and California, which is most likely due to these states being home to major cities where marketing agencies and large brands are headquartered or have a presence. You will notice these four states make an appearance in the top 10 for every other search query and usually rank close to the top of the list.

More surprising to find in the top 10 were smaller states such as Connecticut and Vermont. Many major organizations are headquartered in Connecticut, which may be driving the state’s need for digital marketing talent. Vermont’s high-tech industry growth may explain its high concentration of digital marketing jobs.

content marketing job concentration

per capita content marketing jobs

Although content marketing jobs are growing, there is still a low overall volume of available jobs, as shown by the low jobs per capita compared to most of the other search queries. With more than three jobs per capita, Massachusetts and New York topped the list for the highest concentration of job listings containing “content marketing” or “content strategy.” California and Illinois rank third and fourth with 2.8 and 2.1 jobs per capita respectively.

seo job concentration

seo jobs per capita

Again, Massachusetts and New York took the top spots, each with more than eight SEO jobs per capita. Utah took third place for the highest concentration of SEO jobs. Surprised to see Utah rank in the top 10? Its inclusion on this list and others may be due to its booming tech startup scene, which has earned the metropolitan areas of Salt Lake City, Provo, and Park City the nickname Silicon Slopes.

social media job concentration

social media jobs per capita

Compared to the other keywords, “social media” sees a much higher concentration of jobs. New York dominates the rankings with nearly 24 social media jobs per capita. The other top contenders of California, Massachusetts, and Illinois all have more than 15 social media jobs per capita.

The numbers at the bottom of this list can give you an idea of how prevalent social media jobs were compared to any other keyword we analyzed. Minnesota’s 12.1 jobs per capita, the lowest ranking state in the top 10 for social media, trumps even the highest ranking state for any other keyword (11.5 digital marketing jobs per capita in Massachusetts).

ppc job concentration

ppc jobs per capita

Due to its low overall number of available jobs, “PPC” sees the lowest jobs per capita out of all the search queries. Utah has the highest concentration of jobs with just two PPC jobs per 100,000 residents. It is also the only state in the top 10 to crack two jobs per capita.

google analytics job concentration

google analytics jobs per capita

Regionally, the Northeast and West dominate the rankings, with the exception of Illinois. Massachusetts and New York are tied for the most Google Analytics job postings, each with nearly five jobs per capita. At more than three jobs per 100,000 residents, California, Illinois, and Colorado round out the top five.

Overall, our findings indicate that none of the marketing disciplines we analyzed are dying career choices, but there is a need to become more than a one-trick pony—or else you’ll risk getting passed up for job opportunities. As the marketing industry evolves, there is a greater need for marketers who “wear many hats” and have competencies across different marketing disciplines. Marketers who develop diverse skill sets can gain a competitive advantage in the job market and achieve greater career growth.


Controlling Search Engine Crawlers for Better Indexation and Rankings – Whiteboard Friday

Posted by randfish

When should you disallow search engines in your robots.txt file, and when should you use meta robots tags in a page header? What about nofollowing links? In today’s Whiteboard Friday, Rand covers these tools and their appropriate use in four situations that SEOs commonly find themselves facing.

Video transcription

Howdy Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re going to talk about controlling search engine crawlers, blocking bots, sending bots where we want, restricting them from where we don’t want them to go. We’re going to talk a little bit about crawl budget and what you should and shouldn’t have indexed.

As a start, what I want to do is discuss the ways in which we can control robots. Those include the three primary ones: robots.txt, meta robots, and—well, the nofollow tag is a little bit less about controlling bots.

There are a few others that we’re going to discuss as well, including Webmaster Tools (Search Console) and URL status codes. But let’s dive into those first few first.

Robots.txt lives at yoursite.com/robots.txt. It tells crawlers what they should and shouldn’t access, but it doesn’t always get respected by Google and Bing. So a lot of folks say, “hey, disallow this,” then suddenly see those URLs popping up and wonder what’s going on. Look: Google and Bing oftentimes think that they just know better. They think that maybe you’ve made a mistake; they think, “hey, there’s a lot of links pointing to this content, there’s a lot of people who are visiting and caring about this content, maybe you didn’t intend for us to block it.” The more specific you get about an individual URL, the better they usually are about respecting it. The less specific, meaning the more you use wildcards or say “everything behind this entire big directory,” the worse they are about necessarily believing you.

Meta robots—a little different—that lives in the headers of individual pages, so you can only control a single page with a meta robots tag. That tells the engines whether or not they should keep a page in the index, and whether they should follow the links on that page, and it’s usually a lot more respected, because it’s at an individual-page level; Google and Bing tend to believe you about the meta robots tag.

And then the nofollow tag, that lives on an individual link on a page. It doesn’t tell engines where to crawl or not to crawl. All it’s saying is whether you editorially vouch for a page that is being linked to, and whether you want to pass the PageRank and link equity metrics to that page.

Interesting point about meta robots and robots.txt working together (or not working together so well)—many, many folks in the SEO world do this and then get frustrated.

What if, for example, we take a page like “blogtest.html” on our domain and we say “all user agents, you are not allowed to crawl blogtest.html.” Okay, that’s a good way to keep that page away from being crawled, but just because something is not crawled doesn’t necessarily mean it won’t be in the search results.
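To make that disallow concrete, here is a rough sketch using Python’s standard-library robots.txt parser. The robots.txt body and the example.com URLs are assumptions standing in for “our domain”; the parser only reports what a rule-following crawler is allowed to fetch and says nothing about what ends up in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt matching the whiteboard example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blogtest.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# example.com is a placeholder for "our domain".
blocked = not parser.can_fetch("*", "https://example.com/blogtest.html")
allowed = parser.can_fetch("*", "https://example.com/blog.html")

print("blogtest.html blocked from crawling:", blocked)
print("blog.html crawlable:", allowed)
```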

So then we have our SEO folks go, “you know what, let’s make doubly sure that doesn’t show up in search results; we’ll put in the meta robots tag:”

<meta name="robots" content="noindex, follow">

So, “noindex, follow” tells the search engine crawler they can follow the links on the page, but they shouldn’t index this particular one.

Then, you go and run a search for “blog test” in this case, and everybody on the team’s like “What the heck!? WTF? Why am I seeing this page show up in search results?”

The answer is, you told the engines that they couldn’t crawl the page, so they didn’t. But they are still putting it in the results. They’re actually probably not going to include a meta description; they might have something like “we can’t include a meta description because of this site’s robots.txt file.” The reason it’s showing up is because they can’t see the noindex; all they see is the disallow.

So, if you want something truly removed, unable to be seen in search results, you can’t just disallow a crawler. You have to say meta “noindex” and you have to let them crawl it.
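The conflict described above can be checked mechanically. The following is a sketch only, using the Python standard library; the function name noindex_is_hidden, the robots.txt bodies, and the example.com URL are all made up for illustration. It flags a page whose meta robots tag says noindex while robots.txt stops crawlers from ever reading that tag.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collect directives from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(d.strip().lower() for d in content.split(","))


def noindex_is_hidden(robots_txt, url, html, user_agent="*"):
    """True when the page asks for noindex but crawlers may not fetch it."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    meta = MetaRobotsParser()
    meta.feed(html)
    return "noindex" in meta.directives and not rules.can_fetch(user_agent, url)


# Placeholder inputs for illustration.
PAGE_URL = "https://example.com/blogtest.html"
PAGE_HTML = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
BLOCKING_ROBOTS = "User-agent: *\nDisallow: /blogtest.html\n"
OPEN_ROBOTS = "User-agent: *\nDisallow:\n"

conflict = noindex_is_hidden(BLOCKING_ROBOTS, PAGE_URL, PAGE_HTML)
no_conflict = noindex_is_hidden(OPEN_ROBOTS, PAGE_URL, PAGE_HTML)
```

With the blocking robots.txt the noindex can never be seen, so the check reports a conflict; with the open one, crawlers can read the tag and act on it.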

So this creates some complications. Robots.txt can be great if we’re trying to save crawl bandwidth, but it isn’t necessarily ideal for preventing a page from being shown in the search results. I would not recommend, by the way, that you do what we think Twitter recently tried to do, where they tried to canonicalize www and non-www by saying “Google, don’t crawl the www version of twitter.com.” What you should be doing is rel canonical-ing or using a 301.

Meta robots—that can allow crawling and link-following while disallowing indexation, which is great, but it requires crawl budget and you can still conserve indexing.

The nofollow tag, generally speaking, is not particularly useful for controlling bots or conserving indexation.

Webmaster Tools (now Google Search Console) has some special things that allow you to restrict access or remove a result from the search results. For example, if you have 404’d something or if you’ve told them not to crawl something but it’s still showing up in there, you can manually say “don’t do that.” There are a few other crawl protocol things that you can do.

And then URL status codes—these are a valid way to do things, but they’re going to obviously change what’s going on on your pages, too.

If you’re not having a lot of luck using a 404 to remove something, you can use a 410 to permanently remove something from the index. Just be aware that once you use a 410, it can take a long time if you want to get that page re-crawled or re-indexed, and you want to tell the search engines “it’s back!” 410 is permanent removal.

301—permanent redirect, we’ve talked about those here—and 302, temporary redirect.

Now let’s jump into a few specific use cases of “what kinds of content should and shouldn’t I allow engines to crawl and index” in this next version…

[Rand moves at superhuman speed to erase the board and draw part two of this Whiteboard Friday. Seriously, we showed Roger how fast it was, and even he was impressed.]

Four crawling/indexing problems to solve

So we’ve got these four big problems that I want to talk about as they relate to crawling and indexing.

1. Content that isn’t ready yet

The first one here is around, “If I have content of quality I’m still trying to improve—it’s not yet ready for primetime, it’s not ready for Google, maybe I have a bunch of products and I only have the descriptions from the manufacturer and I need people to be able to access them, so I’m rewriting the content and creating unique value on those pages… they’re just not ready yet—what should I do with those?”

My options around crawling and indexing? If I have a large quantity of those—maybe thousands, tens of thousands, hundreds of thousands—I would probably go the robots.txt route. I’d disallow those pages from being crawled, and then eventually as I get (folder by folder) those sets of URLs ready, I can then allow crawling and maybe even submit them to Google via an XML sitemap.
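A folder-by-folder rollout like this could be sketched as below. The folder names, the example.com URLs, and the build_robots_txt helper are hypothetical; the standard-library parser is used only to confirm what a rule-following crawler would be allowed to fetch after one folder is released.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_folders):
    """Build a robots.txt body that disallows each folder for all crawlers."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {folder}" for folder in sorted(blocked_folders)]
    return "\n".join(lines) + "\n"


# Hypothetical folders whose product pages still use manufacturer copy.
not_ready = {"/products/shirts/", "/products/mugs/"}

# Later, the shirts folder has been rewritten and can be opened to crawlers.
not_ready_after_release = not_ready - {"/products/shirts/"}
robots_txt = build_robots_txt(not_ready_after_release)

rules = RobotFileParser()
rules.parse(robots_txt.splitlines())
shirts_crawlable = rules.can_fetch("*", "https://example.com/products/shirts/r2d2.html")
mugs_crawlable = rules.can_fetch("*", "https://example.com/products/mugs/yoda.html")
```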

If I’m talking about a small quantity—a few dozen, a few hundred pages—well, I’d probably just use the meta robots noindex, and then I’d pull that noindex off of those pages as they are made ready for Google’s consumption. And then again, I would probably use the XML sitemap and start submitting those once they’re ready.

2. Dealing with duplicate or thin content

What about, “Should I noindex, nofollow, or potentially disallow crawling on largely duplicate URLs or thin content?” I’ve got an example. Let’s say I’m an ecommerce shop, I’m selling this nice Star Wars t-shirt which I think is kind of hilarious, so I’ve got starwarsshirt.html, and it links out to a larger version of an image, and that’s an individual HTML page. It links out to different colors, which change the URL of the page, so I have a gray, blue, and black version. Well, these four pages are really all part of this same one, so I wouldn’t recommend disallowing crawling on these, and I wouldn’t recommend noindexing them. What I would do there is a rel canonical.

Remember, rel canonical is one of those things that can be precluded by disallowing. So, if I were to disallow these from being crawled, Google couldn’t see the rel canonical back, so if someone linked to the blue version instead of the default version, now I potentially don’t get link credit for that. So what I really want to do is use the rel canonical, allow the indexing, and allow it to be crawled. If you really feel like it, you could also put a meta “noindex, follow” on these pages, but I don’t really think that’s necessary, and again that might interfere with the rel canonical.
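One way to audit this is to confirm that every variant page carries a rel canonical pointing at the default version. The sketch below uses only the Python standard library; the URLs, the query-string color parameter, and the sample HTML are assumptions for illustration, and a real audit would fetch the live pages instead.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attrs.get("href")


def canonical_of(html):
    """Return the canonical URL declared in the HTML, or None."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


# Hypothetical color-variant pages; URLs and markup are placeholders.
DEFAULT_URL = "https://example.com/starwarsshirt.html"
variants = {
    "https://example.com/starwarsshirt.html?color=blue":
        '<head><link rel="canonical" href="https://example.com/starwarsshirt.html"></head>',
    "https://example.com/starwarsshirt.html?color=gray":
        "<head><title>Gray shirt</title></head>",  # canonical tag missing
}

# Variants that would not consolidate link credit to the default page.
missing = [url for url, html in variants.items() if canonical_of(html) != DEFAULT_URL]
```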

3. Passing link equity without appearing in search results

Number three: “If I want to pass link equity (or at least crawling) through a set of pages without those pages actually appearing in search results—so maybe I have navigational stuff, ways that humans are going to navigate through my pages, but I don’t need those appearing in search results—what should I use then?”

What I would say here is, you can use the meta robots to say “don’t index the page, but do follow the links that are on that page.” That’s a pretty nice, handy use case for that.

Do NOT, however, disallow those in robots.txt. Many, many folks make this mistake. If you disallow crawling on those, Google can’t see the noindex, so they don’t know that they can follow the links. Granted, as we talked about before, sometimes Google doesn’t obey the robots.txt, but you can’t rely on that behavior. Trust that the disallow in robots.txt will prevent them from crawling. So I would say, the meta robots “noindex, follow” is the way to do this.

4. Search results-type pages

Finally, fourth, “What should I do with search results-type pages?” Google has said many times that they don’t like your search results from your own internal engine appearing in their search results, and so this can be a tricky use case.

Sometimes a search result page—a page that lists many types of results that might come from a database of types of content that you’ve got on your site—could actually be a very good result for a searcher who is looking for a wide variety of content, or who wants to see what you have on offer. Yelp does this: When you say, “I’m looking for restaurants in Seattle, WA,” they’ll give you what is essentially a list of search results, and Google does want those to appear because that page provides a great result. But you should be doing what Yelp does there, and make the most common or popular individual sets of those search results into category-style pages. A page that provides real, unique value, that’s not just a list of search results, that is more of a landing page than a search results page.

That being said, if you’ve got a long tail of these, or if you’d say, “hey, our internal search engine, that’s really for internal visitors only; it’s not useful to have those pages show up in search results, and we don’t think we need to make the effort to make those into category landing pages,” then you can use the disallow in robots.txt to prevent those.

Just be cautious here, because I have sometimes seen an over-swinging of the pendulum toward blocking all types of search results, and sometimes that can actually hurt your SEO and your traffic. Sometimes those pages can be really useful to people. So check your analytics, and make sure those aren’t valuable pages that should be served up and turned into landing pages. If you’re sure, then go ahead and disallow all your search results-style pages. You’ll see a lot of sites doing this in their robots.txt file.

That being said, I hope you have some great questions about crawling and indexing, controlling robots, blocking robots, allowing robots, and I’ll try and tackle those in the comments below.

We’ll look forward to seeing you again next week for another edition of Whiteboard Friday. Take care!


An Open-Source Tool for Checking rel-alternate-hreflang Annotations

Posted by Tom-Anthony

In the Distilled R&D department we have been ramping up the amount of automated monitoring and analysis we do, with an internal system monitoring our clients’ sites both directly and via various data sources to ensure they remain healthy and that we are alerted to any problems that may arise.

Recently we started work to add in functionality for including the rel-alternate-hreflang annotations in this system. In this blog post I’m going to share an open-source Python library we’ve just started work on for the purpose, which makes it easy to read the hreflang entries from a page and identify errors with them.

If you’re not a Python aficionado then don’t despair, as I have also built a ready-to-go tool for you to use, which will quickly do some checks on the hreflang entries for any URL you specify. 🙂

Google’s Search Console (formerly Webmaster Tools) does have some basic rel-alternate-hreflang checking built in, but it is limited in how you can use it and you are restricted to using it for verified sites.

rel-alternate-hreflang checklist

Before we introduce the code, I wanted to quickly review a list of five easy and common mistakes that we will want to check for when looking at rel-alternate-hreflang annotations:

  • return tag errors – Every alternate language/locale URL of a page should, itself, include a link back to the first page. This makes sense but I’ve seen people make mistakes with it fairly often.
  • indirect / broken links – Links to alternate language/region versions of the page should not go via redirects, and should not link to missing or broken pages.
  • multiple entries – There should never be multiple entries for a single language/region combo.
  • multiple defaults – You should never have more than one x-default entry.
  • conflicting modes – rel-alternate-hreflang entries can be implemented via inline HTML, XML sitemaps, or HTTP headers. For any one set of pages only one implementation mode should be used.
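
Three of these checks (return tags, multiple entries, and multiple defaults) can be sketched in plain Python. This is only a minimal illustration and not polly's implementation: the `HreflangParser` class, the `hreflang_entries`, `check_entries`, and `missing_return_tags` helpers, and the example.com URLs are all hypothetical, and the pages are supplied as in-memory HTML strings rather than fetched over the network.

```python
from collections import Counter
from html.parser import HTMLParser


class HreflangParser(HTMLParser):
    """Collects (hreflang, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.entries = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if "alternate" in rel and attributes.get("hreflang") and attributes.get("href"):
            self.entries.append((attributes["hreflang"].lower(), attributes["href"]))


def hreflang_entries(html):
    """Return the list of (hreflang, href) pairs found in an HTML string."""
    parser = HreflangParser()
    parser.feed(html)
    return parser.entries


def check_entries(entries):
    """Report hreflang values that appear more than once on a page."""
    errors = []
    counts = Counter(value for value, _ in entries)
    for value, count in sorted(counts.items()):
        if count > 1:
            label = "multiple defaults" if value == "x-default" else "multiple entries"
            errors.append("%s: %s appears %d times" % (label, value, count))
    return errors


def missing_return_tags(page_url, pages):
    """List alternate URLs whose HTML does not link back to page_url.

    pages maps URL -> HTML and is a hypothetical stand-in for fetching.
    """
    missing = []
    for _, alternate_url in hreflang_entries(pages[page_url]):
        if alternate_url == page_url:
            continue
        back_links = [href for _, href in hreflang_entries(pages.get(alternate_url, ""))]
        if page_url not in back_links:
            missing.append(alternate_url)
    return missing


# Hypothetical sample data: the "fr" value is duplicated, and the French
# pages do not link back to the English page.
PAGES = {
    "http://example.com/en/": (
        '<link rel="alternate" hreflang="en" href="http://example.com/en/">'
        '<link rel="alternate" hreflang="de" href="http://example.com/de/">'
        '<link rel="alternate" hreflang="fr" href="http://example.com/fr/">'
        '<link rel="alternate" hreflang="fr" href="http://example.com/fr-2/">'
    ),
    "http://example.com/de/": (
        '<link rel="alternate" hreflang="en" href="http://example.com/en/">'
        '<link rel="alternate" hreflang="de" href="http://example.com/de/">'
    ),
    "http://example.com/fr/": (
        '<link rel="alternate" hreflang="fr" href="http://example.com/fr/">'
    ),
}

if __name__ == "__main__":
    start = "http://example.com/en/"
    print(check_entries(hreflang_entries(PAGES[start])))
    print(missing_return_tags(start, PAGES))
```

Keeping the pages in a dictionary makes the logic easy to test. A real checker would also need to fetch each alternate URL and handle redirects and errors.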

So now imagine that we want to automate these checks quickly and simply…

Introducing: polly – the hreflang checker library

polly is the name for the library we have developed to help us solve this problem, and we are releasing it as open source so the SEO community can use it freely to build upon. We only started work on it last week, but we plan to continue developing it, and will also accept contributions to the code from the community, so we expect its feature set to grow rapidly.

If you are not comfortable tinkering with Python, then feel free to skip down to the next section of the post, where there is a tool that is built with polly which you can use right away.

Still here? Ok, great. You can install polly easily via pip:

pip install polly

You can then create a PollyPage() object which will do all our work and store the data simply by instantiating the class with the desired URL:

# assumes the package exposes PollyPage at the top level
from polly import PollyPage
my_page = PollyPage("http://www.facebook.com/")

You can quickly see the hreflang entries on the page by running:

print(my_page.alternate_urls_map)

You can list all the hreflang values encountered on a page, and which countries and languages they cover:

print(my_page.hreflang_values)
print(my_page.languages)
print(my_page.regions)
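
To make the relationship between an hreflang value and its language and region concrete, here is a small standalone sketch. It is not how polly computes these properties. `split_hreflang` is a hypothetical helper, and it makes the simplifying assumption that a trailing two-letter subtag is the region.

```python
def split_hreflang(value):
    """Split an hreflang value into (language, region); region may be None."""
    value = value.strip().lower()
    if value == "x-default":
        # The special default entry carries no language or region.
        return ("x-default", None)
    parts = value.split("-")
    language = parts[0]
    # Assumption: a trailing two-letter subtag is the region. Longer subtags
    # (for example script codes such as "hant") are ignored in this sketch.
    region = parts[-1] if len(parts) > 1 and len(parts[-1]) == 2 else None
    return (language, region)


if __name__ == "__main__":
    for sample in ["en-GB", "es", "zh-Hant-TW", "x-default"]:
        print(sample, split_hreflang(sample))
```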

You can also check various aspects of a page, such as whether the pages it includes in its rel-alternate-hreflang entries point back, or whether there are entries that do not seem retrievable (due to 404, 500, or similar errors):

print(my_page.is_default)
print(my_page.no_return_tag_pages())
print(my_page.non_retrievable_pages())
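
As a rough illustration of the "indirect / broken links" check, the sketch below groups alternate URLs by HTTP status code. It does not reflect polly's internals: `classify_alternates` and the sample statuses are hypothetical, and the status codes are assumed to have been collected already, without following redirects.

```python
def classify_alternates(status_by_url):
    """Group alternate URLs into ok, redirect, and broken by status code."""
    report = {"ok": [], "redirect": [], "broken": []}
    for url, status in sorted(status_by_url.items()):
        if 200 <= status < 300:
            report["ok"].append(url)
        elif 300 <= status < 400:
            # hreflang links should point straight at the final URL.
            report["redirect"].append(url)
        else:
            # 4xx and 5xx responses (and anything unexpected) count as broken.
            report["broken"].append(url)
    return report


if __name__ == "__main__":
    # Hypothetical status codes for three alternate URLs.
    sample = {
        "http://example.com/en/": 200,
        "http://example.com/de/": 301,
        "http://example.com/fr/": 404,
    }
    print(classify_alternates(sample))
```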

Get more instructions and grab the code at the polly GitHub page. Hit me up in the comments with any questions.

Free tool: hreflang.ninja

I have put together a very simple tool that uses polly to run some of the checks we highlighted above as being common mistakes with rel-alternate-hreflang, which you can visit right now and start using:

http://hreflang.ninja

Simply enter a URL and hit enter, and you should see something like:

Example output from the ninja!

The tool shows you the rel-alternate-hreflang entries found on the page, the language and region of those entries, the alternate URLs, and any errors identified with the entry. It is perfect for doing quick’n’dirty checks of a URL to identify any errors.

As we add additional functionality to polly we will be updating hreflang.ninja as well, so please tweet me with feature ideas or suggestions.

To-do list!

This is the first release of polly and currently we only handle annotations that are in the HTML of the page, not those in the XML sitemap or HTTP headers. However, we are going to be updating polly (and hreflang.ninja) over the coming weeks, so watch this space! 🙂

Resources

Here are a few links you may find helpful for hreflang:

Got suggestions?

With the increasing number of SEO directives and annotations available, and the ever-changing guidelines around how to deploy them, it is important to automate whatever areas we can. Hopefully polly is helpful to the community in this regard, and we want to hear what ideas you have for making these tools more useful – here in the comments or via Twitter.


A Vision for Brand Engagement Online, or “The Goal”

Posted by EricEnge

Today’s post focuses on a vision for your online presence. This vision outlines what it takes to be the best, both from an overall reputation and visibility standpoint, as well as an SEO point of view. The reason these are tied together is simple: Your overall online reputation and visibility is a huge factor in your SEO. Period. Let’s start by talking about why.

Core ranking signals

For purposes of this post, let’s define three cornerstone ranking signals that most everyone agrees on:

Links

Links remain a huge factor in overall ranking. Both Cyrus Shepard and Marcus Tober re-confirmed this on the Periodic Table of SEO Ranking Factors session at the SMX Advanced conference in Seattle this past June.

On-page content

On-page content remains a huge factor too, but with some subtleties now thrown in. I wrote about some of this in earlier posts I did on Moz about Term Frequency and Inverse Document Frequency. Suffice it to say that on-page content is about a lot more than the words on the page; it also includes the supporting pages that you link to.

User engagement with your site

This is not one of the traditional SEO signals from the early days of SEO, but most advanced SEO pros that I know consider it a real factor these days. One of the most popular concepts people talk about is called pogo-sticking, which is illustrated here:

You can learn more about the pogosticking concept by visiting this Whiteboard Friday video by a rookie SEO with a last name of Fishkin.

New, lesser-known signals

OK, so these are the more obvious signals, but now let’s look more broadly at the overall web ecosystem and talk about other types of ranking signals. Be warned that some of these signals may be indirect, but that just doesn’t matter. In fact, my first example below is an indirect factor which I will use to demonstrate why whether a signal is direct or indirect is not an issue at all.

Let me illustrate with an example. Say you spend $1 billion building a huge brand around a product that is massively useful to people. Included in this is a sizable $100 million campaign to support a highly popular charitable foundation, and your employees regularly donate time to help out in schools across your country. In short, the great majority of people love your brand.

Do you think this will impact the way people link to your site? Of course it will. Do you think it will impact how likely people are to be satisfied with the quality of the pages of your site? Consider this A/B test scenario of 2 pages from different “brands” (for the one on the left, imagine the image of Coca Cola or Pepsi Cola, whichever one you prefer):

Do you think that the huge brand will get the benefit of the doubt on their page that the no-name brand does not, even though the pages are identical? Of course they will. Now let’s look at some simpler scenarios that don’t involve a $1 billion investment.

1. Cover major options related to a product or service on “money pages”

Imagine that a user arrives on your auto parts site after searching on the phrase “oil filter” at Google or Bing. Chances are pretty good that they want an oil filter, but here are some other items they may also want:

  • A guide to picking the right filter for their car
  • Oil
  • An oil filter wrench
  • A drainage pan to drain the old oil into

These are just the basics, right? But you would be surprised by how many sites don’t include links or information on directly related products on their money pages. Providing this type of smart site and page design can have a major impact on user engagement with the money pages of your site.

2. Include other related links on money pages

In the prior item we covered the user’s most directly related needs, but they may have secondary needs as well. Someone who is changing a car’s oil is either a mechanic or a do-it-yourself-er. What else might they need? How about other parts, such as windshield wipers or air filters?

These are other fairly easy maintenance steps for someone who is working on their car to complete. Presence of these supporting products could be one way to improve user engagement with your pages.

3. Offer industry-leading non-commercial content on-site

Publishing world-class content on your site is a great way to produce links to your site. Of course, if you do this on a blog on your site, it may not provide links directly to your money pages, but it will nonetheless lift overall site authority.

In addition, if someone has consumed one or more pieces of great content on your site, the chances of their engaging in a more positive manner with your site overall go way up. Why? Because you’ve earned their trust and admiration.

4. Be everywhere your audiences are with more high-quality, relevant, non-commercial content

Are there major media sites that cover your market space? Do they consider you to be an expert? Will they quote you in articles they write? Will they accept guest posts from you or let you be a guest columnist? Will they collaborate on larger content projects with you?

All of these activities put you in front of their audiences, and if those audiences overlap with yours, this provides a great way to build your overall reputation and visibility. This content that you publish, or collaborate on, that shows up on 3rd-party sites will get you mentions and links. In addition, once again, it will provide you with a boost to your branding. People are now more likely to consume your other content more readily, including on your money pages.

5. Leverage social media

The concept here shares much in common with the prior point. Social media provides opportunities to get in front of relevant audiences. Every person who is an avid follower of yours on a social media site is likely to behave very differently when interacting with your site than someone who does not know you well at all.

Note that links from social media sites are nofollowed, but active social media behavior can lead to people implementing “real world” links to your site that are followed, from their blogs and media web sites.

6. Be active in the offline world as well

Think your offline activity doesn’t matter online? Think again. Relationships are still most easily built face-to-face. People you meet and spend time with can well become your most loyal fans online. This is particularly important when it comes to building relationships with influential people.

One great way to do that is to go to public events related to your industry, such as conferences. Better still, obtain speaking engagements at those conferences. This can even impact people who weren’t there to hear you speak, as they become aware that you have been asked to do that. This concept can also work for a small local business. Get out in your community and engage with people at local events.

The payoff here is similar to the payoff for other items: more engaged, highly loyal fans who engage with you across the web, sending more and more positive signals, both to other people and to search engines, that you are the real deal.

7. Provide great customer service/support

Whatever your business may be, you need to take care of your customers as best you can. No one can make everyone happy, that’s unrealistic, but striving for much better than average is a really sound idea. Having satisfied customers saying nice things about you online is a big impact item in the grand scheme of things.

8. Actively build relationships with influencers too

While this post is not about the value of influencer relationships, I include this in the list for illustration purposes, for two reasons:

  1. Some opportunities are worth extra effort. Know of someone who could have a major impact on your business? Know that they will be at a public event in the near future? Book your plane tickets and get your butt out there. No guarantee that you will get the result you are looking for, or that it will happen quickly, but your chances go WAY up if you get some face time with them.
  2. Influencers are worth special attention and focus, but your relationship-building approach to the web and SEO is not only about influencers. It’s about the entire ecosystem.

It’s an integrated ecosystem

The web provides a level of integrated, real-time connectivity of a kind that the world has never seen before. This is only going to increase. Do something bad to a customer in Hong Kong? Consumers in Boston will know within 5 minutes. That’s where it’s all headed.

Google and Bing (and any future search engine that may emerge) want to measure these types of signals because they tell them how to improve the quality of the experience on their platforms. There are many ways they can perform these measurements.

One simple concept is covered by Rand in this recent Whiteboard Friday video. The discussion is about a recent patent granted to Google that shows how the company can use search queries to detect who is an authority on a topic.

The example he provides is about people who search on “email finding tool”. If Google also finds that a number of people search on “voila norbert email tool”, Google may use that as an authority signal.

Think about that for a moment. How are you going to get people to search on your brand more while putting it together with a non-branded query like that? (OK, please leave Mechanical Turk and other services like that out of the discussion.)

Now you can start to see the bigger picture. Measurements like pogosticking and this recent search behavior related patent are just the tip of the iceberg. Undoubtedly, there are many other ways that search engines can measure what people like and engage with the most.

This is all part of SEO now: UX, product breadth, problem solving, engaging in social media, getting face to face, creating great content that you publish in front of other people’s audiences, and more.

For the small local business, you can still win at this game, as your focus just needs to be on doing it better than your competitors. The big brands will never be hyper-local like you are, so don’t think you can’t play the game, because you can.

Whoever you are, get ready, because this new integrated ecosystem is already upon us, and you need to be a part of it.
