The Magento Xcelerate program: A positive sum game

As an open source ecommerce platform, Magento is flexible and accessible for developers to work with and as a result, an active community of developers emerged on online forums and at offline meetups all over the world. Many of these were happily plugging away independently of Magento until the split from eBay in early 2015.

Free from the reins of eBay, Magento has decisively been reaching out to, promoting and rewarding the individuals, agencies and technology providers that make up its ecosystem. Last February they announced the Magento Masters Program, empowering the top platform advocates, frequent forum contributors and the innovative solution implementers. Then at April’s Magento Imagine conference (the largest yet) the theme emerged as “We are Magento”, in celebration of the community.

The new Xcelerate Technology Partner Program focuses not on individuals but on business partnerships formed with the technology companies that offer tools for Magento merchants to implement.

Sharing ideas, opportunities and successes:

This is the Xcelerate Program tagline, which acts as a mission statement to get the technology partners involved moving: continuously consider Magento in their own technology roadmaps, and jointly communicate successes and learnings from working on implementations with merchants.

“In turn, the program offers members the tools to get moving, through events, resources and contacts. Our goal is to enable you to be an integral part of the Magento ecosystem.” – Jon Carmody, Head of Technology Partners

The program in practice:

The new program is accompanied by the new Marketplace from which the extensions can be purchased and downloaded. The program splits the extensions into 3 partnership levels:

Registered Partners – these are technology extensions that the new Magento Marketplace team test for code quality. Extensions must now pass this initial level to be eligible for the Marketplace. With each merchant having on average 15 extensions for their site, this is a win for merchants when it comes to extension trustworthiness.

Select Partners – extensions can enter this second tier if the technology falls into one of the strategic categories identified by Magento and if they pass an in-depth technical review. These will be marked as being ‘Select’ in the Marketplace.

Premier Partners – this level is by invitation only, chosen as providing crucial technology to Magento merchants (such as payments, marketing, tax software). The Magento team’s Extension Quality Program looks at coding structure, performance, scalability, security and compatibility but influence in the Community is also a consideration. dotmailer is proud to be the first Premier Technology Partner in the marketing space for Magento.

All in all, the latest move from Magento in illuminating its ecosystem should be positive for all: merchants can now choose from a vetted list of extensions and know when to expect tight integration; technology partners can build extensions with clearer merchant needs and extension gaps in mind, and with guidance from Magento; and solution implementers can recommend the best extension for a merchant knowing it will be maintained.

Reblogged 3 years ago from blog.dotmailer.com

Stop Ghost Spam in Google Analytics with One Filter

Posted by CarloSeo

The spam in Google Analytics (GA) is becoming a serious issue. Due to a deluge of referral spam from social buttons, adult sites, and many, many other sources, people are starting to become overwhelmed by all the filters they are setting up to manage the useless data they are receiving.

The good news is, there is no need to panic. In this post, I’m going to focus on the most common mistakes people make when fighting spam in GA, and explain an efficient way to prevent it.

But first, let’s make sure we understand how spam works. A couple of months ago, Jared Gardner wrote an excellent article explaining what referral spam is, including its intended purpose. He also pointed out some great examples of referral spam.

Types of spam

The spam in Google Analytics can be categorized into two types: ghosts and crawlers.

Ghosts

The vast majority of spam is this type. They are called ghosts because they never access your site. It is important to keep this in mind, as it’s key to creating a more efficient solution for managing spam.

As unusual as it sounds, this type of spam doesn’t have any interaction with your site at all. You may wonder how that is possible since one of the main purposes of GA is to track visits to our sites.

They do it by using the Measurement Protocol, which allows people to send data directly to Google Analytics’ servers. Using this method, and probably randomly generated tracking codes (UA-XXXXX-1) as well, the spammers leave a “visit” with fake data, without even knowing who they are hitting.
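
To see why ghosts never touch your server, consider what one of these hits looks like. Below is a minimal sketch of a Measurement Protocol pageview sent straight to GA’s collection endpoint (the parameter names come from Google’s public Measurement Protocol documentation; the tracking ID, hostname and referrer here are made up):

// A fake "pageview" delivered directly to GA's servers; the target site is never visited.
// v = protocol version, tid = tracking ID, cid = client ID, t = hit type,
// dh = hostname, dp = page path, dr = referrer.
const params = new URLSearchParams({
  v: '1',
  tid: 'UA-12345-1',       // guessed or randomly generated tracking ID
  cid: '555',              // arbitrary client ID
  t: 'pageview',
  dh: 'fake-hostname.com', // fake or unset hostname -- the weakness the filter below exploits
  dp: '/spam-page',
  dr: 'spam-domain.com'    // the "referral" that shows up in your reports
});
fetch('https://www.google-analytics.com/collect', { method: 'POST', body: params });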

Crawlers

This type of spam, the opposite of ghost spam, does access your site. As the name implies, these spam bots crawl your pages, ignoring rules like those found in robots.txt that are supposed to stop them from reading your site. When they exit your site, they leave a record in your reports that looks similar to a legitimate visit.

Crawlers are harder to identify because they know their targets and use real data. But it is also true that new ones seldom appear. So if you detect a referral in your analytics that looks suspicious, researching it on Google or checking it against this list might help you answer the question of whether or not it is spammy.

Most common mistakes made when dealing with spam in GA

I’ve been following this issue closely for the last few months. According to the comments people have made on my articles and conversations I’ve found in discussion forums, there are primarily three mistakes people make when dealing with spam in Google Analytics.

Mistake #1. Blocking ghost spam from the .htaccess file

One of the biggest mistakes people make is trying to block ghost spam from the .htaccess file.

For those who are not familiar with this file, one of its main functions is to allow/block access to your site. Now we know that ghosts never reach your site, so adding them here won’t have any effect and will only add useless lines to your .htaccess file.

Ghost spam usually shows up for a few days and then disappears. As a result, sometimes people think that they successfully blocked it from here when really it’s just a coincidence of timing.

Then when the spammers later return, they get worried because the solution is not working anymore, and they think the spammer somehow bypassed the barriers they set up.

The truth is, the .htaccess file can only effectively block crawlers such as buttons-for-website.com and a few others since these access your site. Most of the spam can’t be blocked using this method, so there is no other option than using filters to exclude them.

Mistake #2. Using the referral exclusion list to stop spam

Another error is trying to use the referral exclusion list to stop the spam. The name may confuse you, but this list is not intended to exclude referrals in the way we want to for the spam. It has other purposes.

For example, when a customer buys something, sometimes they get redirected to a third-party page for payment. After making the payment, they’re redirected back to your website, and GA records that as a new referral. It is appropriate to use the referral exclusion list to prevent this from happening.

If you try to use the referral exclusion list to manage spam, however, the referral part will be stripped since there is no preexisting record. As a result, a direct visit will be recorded, and you will have a bigger problem than the one you started with, since you will still have spam, and direct visits are harder to track.

Mistake #3. Worrying that bounce rate changes will affect rankings

When people see that the bounce rate changes drastically because of the spam, they start worrying about the impact that it will have on their rankings in the SERPs.


This is another mistake commonly made. With or without spam, Google doesn’t take into consideration Google Analytics metrics as a ranking factor. Here is an explanation about this from Matt Cutts, the former head of Google’s web spam team.

And if you think about it, Cutts’ explanation makes sense; because although many people have GA, not everyone uses it.

Assuming your site has been hacked

Another common concern when people see strange landing pages coming from spam on their reports is that they have been hacked.


The page that the spam shows on the reports doesn’t exist, and if you try to open it, you will get a 404 page. Your site hasn’t been compromised.

But you have to make sure the page doesn’t exist. Because there are cases (not spam) where some sites have a security breach and get injected with pages full of bad keywords to defame the website.

What should you worry about?

Now that we’ve discarded security issues and their effects on rankings, the only thing left to worry about is your data. The fake trail that the spam leaves behind pollutes your reports.

It might have greater or lesser impact depending on your site traffic, but everyone is susceptible to the spam.

Small and midsize sites are the most easily impacted – not only because a big part of their traffic can be spam, but also because usually these sites are self-managed and sometimes don’t have the support of an analyst or a webmaster.

Big sites with a lot of traffic can also be impacted by spam, and although the impact can be insignificant, invalid traffic means inaccurate reports no matter the size of the website. As an analyst, you should be able to explain what’s going on even in the most granular reports.

You only need one filter to deal with ghost spam

Usually it is recommended to add the referral to an exclusion filter after it is spotted. Although this is useful for a quick action against the spam, it has three big disadvantages.

  • Making filters every week for every new spam referrer detected is tedious and time-consuming, especially if you manage many sites. Plus, by the time you apply the filter and it starts working, you already have some affected data.
  • Some spammers use direct visits along with the referrals, and these direct hits won’t be stopped by a referral filter. So even if you are excluding the referral, you will still be receiving invalid traffic, which explains why some people have seen an unusual spike in direct traffic.

Luckily, there is a good way to prevent all these problems. Most of the spam (the ghost variety) works by hitting random GA tracking IDs, meaning the offender doesn’t really know who the target is; for that reason, either the hostname is not set or a fake one is used. (See the report below.)

[Image: hostname report showing the fake or unset hostnames used by ghost spam]

You can see that they use some weird names or don’t even bother to set one. Although there are some known names in the list, these can be easily added by the spammer.

On the other hand, valid traffic will always use a real hostname. In most cases this will be your domain, but it can also come from paid services, translation services, or any other place where you’ve inserted your GA tracking code.


Based on this, we can make a filter that will include only hits that use real hostnames. This will automatically exclude all hits from ghost spam, whether it shows up as a referral, keyword, or pageview; or even as a direct visit.

To create this filter, you will need to find the report of hostnames. Here’s how:

  1. Go to the Reporting tab in GA
  2. Click on Audience in the lefthand panel
  3. Expand Technology and select Network
  4. At the top of the report, click on Hostname


You will see a list of all hostnames, including the ones that the spam uses. Make a list of all the valid hostnames you find, as follows:

  • yourmaindomain.com
  • blog.yourmaindomain.com
  • es.yourmaindomain.com
  • payingservice.com
  • translatetool.com
  • anotheruseddomain.com

For small to medium sites, this list of hostnames will likely consist of the main domain and a couple of subdomains. After you are sure you got all of them, create a regular expression similar to this one:

yourmaindomain\.com|anotheruseddomain\.com|payingservice\.com|translatetool\.com

You don’t need to put all of your subdomains in the regular expression. The main domain will match all of them. If you don’t have a view set up without filters, create one now.
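
Before saving, you can sanity-check the expression with a quick sketch like this one, which uses the example domains from the list above. Note that the subdomains match even though they are not in the pattern, because GA filter patterns are unanchored, partial-match regular expressions:

const validHostnames = /yourmaindomain\.com|anotheruseddomain\.com|payingservice\.com|translatetool\.com/;

// Hostnames as they might appear in the report: real ones are kept, ghost ones are not.
['yourmaindomain.com', 'blog.yourmaindomain.com', 'es.yourmaindomain.com',
 'free-social-buttons.com', '(not set)'].forEach(function (hostname) {
  console.log(hostname + ' -> ' + (validHostnames.test(hostname) ? 'kept' : 'filtered out'));
});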

Then create a Custom Filter.

Make sure you select INCLUDE, then select “Hostname” in the Filter Field, and copy your expression into the Filter Pattern box.


You might want to verify the filter before saving to check that everything is okay. Once you’re ready, save it, and apply the filter to all the views you want (except the view without filters).

This single filter will get rid of future occurrences of ghost spam that use invalid hostnames, and it doesn’t require much maintenance. But it’s important that every time you add your tracking code to a new service, you also add that service’s hostname to the end of the filter expression.

Now you should only need to take care of the crawler spam. Since crawlers access your site, you can block them by adding these lines to the .htaccess file:

## STOP REFERRER SPAM
# Requires mod_rewrite; if RewriteEngine is already enabled elsewhere in this file, the next line is redundant but harmless.
RewriteEngine On
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]
# Respond 403 Forbidden to any request whose referrer matches above ([NC] = case-insensitive)
RewriteRule .* - [F]

It is important to note that this file is very sensitive, and misplacing a single character in it can bring down your entire site. Therefore, make sure you create a backup copy of your .htaccess file prior to editing it.

If you don’t feel comfortable messing around with your .htaccess file, you can alternatively build an expression covering all the crawlers and add it to an exclude filter by Campaign Source, as in the example below.
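
For instance, the two crawlers blocked in the .htaccess example above could be excluded with a single Campaign Source pattern like this, extended as you spot new offenders:

semalt\.com|buttons-for-website\.com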

Implement these combined solutions, and you will worry much less about spam contaminating your analytics data. This will have the added benefit of freeing up more time for you to spend actually analyzing your valid data.

After stopping spam, you can also get clean reports from the historical data by using the same expressions in an Advanced Segment to exclude all the spam.

Bonus resources to help you manage spam

If you still need more information to help you understand and deal with the spam on your GA reports, you can read my main article on the subject here: http://www.ohow.co/what-is-referrer-spam-how-stop-it-guide/.


In closing, I am eager to hear your ideas on this serious issue. Please share them in the comments below.

(Editor’s Note: All images featured in this post were created by the author.)


Reblogged 4 years ago from tracking.feedpress.it

Why Effective, Modern SEO Requires Technical, Creative, and Strategic Thinking – Whiteboard Friday

Posted by randfish

There’s no doubt that quite a bit has changed about SEO, and that the field is far more integrated with other aspects of online marketing than it once was. In today’s Whiteboard Friday, Rand pushes back against the idea that effective modern SEO doesn’t require any technical expertise, outlining a fantastic list of technical elements that today’s SEOs need to know about in order to be truly effective.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

Video transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week I’m going to do something unusual. I don’t usually point out these inconsistencies or sort of take issue with other folks’ content on the web, because I generally find that that’s not all that valuable and useful. But I’m going to make an exception here.

There is an article by Jayson DeMers, who I think might actually be here in Seattle — maybe he and I can hang out at some point — called “Why Modern SEO Requires Almost No Technical Expertise.” It was an article that got a shocking amount of traction and attention. On Facebook, it has thousands of shares. On LinkedIn, it did really well. On Twitter, it got a bunch of attention.

Some folks in the SEO world have already pointed out some issues around this. But because of the increasing popularity of this article, and because I think there’s, like, this hopefulness from worlds outside of kind of the hardcore SEO world that are looking to this piece and going, “Look, this is great. We don’t have to be technical. We don’t have to worry about technical things in order to do SEO.”

Look, I completely get the appeal of that. I did want to point out some of the reasons why this is not so accurate. At the same time, I don’t want to rain on Jayson, because I think that it’s very possible he’s writing an article for Entrepreneur, maybe he has sort of a commitment to them. Maybe he had no idea that this article was going to spark so much attention and investment. He does make some good points. I think it’s just really the title and then some of the messages inside there that I take strong issue with, and so I wanted to bring those up.

First off, some of the good points he did bring up.

One, he wisely says, “You don’t need to know how to code or to write and read algorithms in order to do SEO.” I totally agree with that. If today you’re looking at SEO and you’re thinking, “Well, am I going to get more into this subject? Am I going to try investing in SEO? But I don’t even know HTML and CSS yet.”

Those are good skills to have, and they will help you in SEO, but you don’t need them. Jayson’s totally right. You don’t have to have them, and you can learn and pick up some of these things, and do searches, watch some Whiteboard Fridays, check out some guides, and pick up a lot of that stuff later on as you need it in your career. SEO doesn’t have that hard requirement.

And secondly, he makes an intelligent point that we’ve made many times here at Moz, which is that, broadly speaking, a better user experience is well correlated with better rankings.

You make a great website that delivers great user experience, that provides the answers to searchers’ questions and gives them extraordinarily good content, way better than what’s out there already in the search results, generally speaking you’re going to see happy searchers, and that’s going to lead to higher rankings.

But not entirely. There are a lot of other elements that go in here. So I’ll bring up some frustrating points around the piece as well.

First off, there’s no acknowledgment — and I find this a little disturbing — that the ability to read and write code, or even HTML and CSS, which I think are the basic place to start, is helpful or can take your SEO efforts to the next level. I think both of those things are true.

So being able to look at a web page, view source on it, or pull up Firebug in Firefox or something and diagnose what’s going on and then go, “Oh, that’s why Google is not able to see this content. That’s why we’re not ranking for this keyword or term, or why even when I enter this exact sentence in quotes into Google, which is on our page, this is why it’s not bringing it up. It’s because it’s loading it after the page from a remote file that Google can’t access.” These are technical things, and being able to see how that code is built, how it’s structured, and what’s going on there, very, very helpful.

Some coding knowledge also can take your SEO efforts even further. I mean, so many times, SEOs are stymied by the conversations that we have with our programmers and our developers and the technical staff on our teams. When we can have those conversations intelligently, because at least we understand the principles of how an if-then statement works, or what software engineering best practices are being used, or they can upload something into a GitHub repository, and we can take a look at it there, that kind of stuff is really helpful.

Secondly, I don’t like that the article overly reduces all of this information that we have about what we’ve learned about Google. So he mentions two sources. One is things that Google tells us, and others are SEO experiments. I think both of those are true. Although I’d add that there’s sort of a sixth sense of knowledge that we gain over time from looking at many, many search results and kind of having this feel for why things rank, and what might be wrong with a site, and getting really good at that using tools and data as well. There are people who can look at Open Site Explorer and then go, “Aha, I bet this is going to happen.” They can look, and 90% of the time they’re right.

So he boils this down to, one, write quality content, and two, reduce your bounce rate. Neither of those things are wrong. You should write quality content, although I’d argue there are lots of other forms of quality content that aren’t necessarily written — video, images and graphics, podcasts, lots of other stuff.

And secondly, that just doing those two things is not always enough. So you can see, like many, many folks look and go, “I have quality content. It has a low bounce rate. How come I don’t rank better?” Well, your competitors, they’re also going to have quality content with a low bounce rate. That’s not a very high bar.

Also, frustratingly, this really gets in my craw. I don’t think “write quality content” means anything. You tell me. When you hear that, to me that is a totally non-actionable, non-useful phrase that’s a piece of advice that is so generic as to be discardable. So I really wish that there was more substance behind that.

The article also makes, in my opinion, the totally inaccurate claim that modern SEO really is reduced to “the happier your users are when they visit your site, the higher you’re going to rank.”

Wow. Okay. Again, I think broadly these things are correlated. User happiness and rank is broadly correlated, but it’s not a one to one. This is not like a, “Oh, well, that’s a 1.0 correlation.”

I would guess that the correlation is probably closer to like the page authority range. I bet it’s like 0.35 or something correlation. If you were to actually measure this broadly across the web and say like, “Hey, were you happier with result one, two, three, four, or five,” the ordering would not be perfect at all. It probably wouldn’t even be close.

There’s a ton of reasons why sometimes someone who ranks on Page 2 or Page 3 or doesn’t rank at all for a query is doing a better piece of content than the person who does rank well or ranks on Page 1, Position 1.

Then the article suggests five and sort of a half steps to successful modern SEO, which I think is a really incomplete list. So Jayson gives us:

  • Good on-site experience
  • Writing good content
  • Getting others to acknowledge you as an authority
  • Rising in social popularity
  • Earning local relevance
  • Dealing with modern CMS systems (which, he notes, are mostly SEO-friendly)

The thing is there’s nothing actually wrong with any of these. They’re all, generally speaking, correct, either directly or indirectly related to SEO. The one about local relevance, I have some issue with, because he doesn’t note that there’s a separate algorithm for sort of how local SEO is done and how Google ranks local sites in maps and in their local search results. Also not noted is that rising in social popularity won’t necessarily directly help your SEO, although it can have indirect and positive benefits.

I feel like this list is super incomplete. Okay, I brainstormed a list just off the top of my head in the 10 minutes before we filmed this video. The list was so long that, as you can see, I filled up the whole whiteboard and then didn’t have any more room. I’m not going to bother to erase it and try to be absolutely complete.

But there’s a huge, huge number of things that are important, critically important for technical SEO. If you don’t know how to do these things, you are sunk in many cases. You can’t be an effective SEO analyst, or consultant, or in-house team member, because you simply can’t diagnose the potential problems, rectify those potential problems, identify strategies that your competitors are using, be able to diagnose a traffic gain or loss. You have to have these skills in order to do that.

I’ll run through these quickly, but really the idea is just that this list is so huge and so long that I think it’s very, very, very wrong to say technical SEO is behind us. I almost feel like the opposite is true.

We have to be able to understand things like:

  • Content rendering and indexability
  • Crawl structure, internal links, JavaScript, Ajax. If something’s post-loading after the page and Google’s not able to index it, or there are links that are accessible via JavaScript or Ajax, maybe Google can’t necessarily see those or isn’t crawling them as effectively, or is crawling them, but isn’t assigning them as much link weight as they might be assigning other stuff, and you’ve made it tough to link to them externally, and so they can’t crawl it.
  • Disabling crawling and/or indexing of thin or incomplete or non-search-targeted content. We have a bunch of search results pages. Should we use rel=prev/next? Should we robots.txt those out? Should we disallow from crawling with meta robots? Should we rel=canonical them to other pages? Should we exclude them via the protocols inside Google Webmaster Tools, which is now Google Search Console?
  • Managing redirects, domain migrations, content updates. A new piece of content comes out, replacing an old piece of content, what do we do with that old piece of content? What’s the best practice? It varies by different things. We have a whole Whiteboard Friday about the different things that you could do with that. What about a big redirect or a domain migration? You buy another company and you’re redirecting their site to your site. You have to understand things about subdomain structures versus subfolders, which, again, we’ve done another Whiteboard Friday about that.
  • Proper error codes, downtime procedures, and not found pages. If your 404 pages turn out to all be 200 pages, well, now you’ve made a big error there, and Google could be crawling tons of 404 pages that they think are real pages, because you’ve made it a status code 200, or you’ve used a 404 code when you should have used a 410, which is a permanently removed, to be able to get it completely out of the indexes, as opposed to having Google revisit it and keep it in the index.

Downtime procedures. So there’s specifically a… I can’t even remember. It’s a 5xx code that you can use. Maybe it was a 503 or something that you can use that’s like, “Revisit later. We’re having some downtime right now.” Google urges you to use that specific code rather than using a 404, which tells them, “This page is now an error.”
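
For reference, the pattern being described is a 503 Service Unavailable response with a Retry-After header. A minimal Node.js sketch of a maintenance-mode responder, purely for illustration:

// Answer every request with 503 + Retry-After so crawlers treat the outage
// as temporary instead of recording errors or indexing a broken page.
const http = require('http');
http.createServer(function (req, res) {
  res.writeHead(503, {
    'Content-Type': 'text/plain',
    'Retry-After': '3600' // suggest revisiting in an hour
  });
  res.end('Down for maintenance, please retry later.');
}).listen(8080);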

Disney had that problem a while ago, if you guys remember, where they 404ed all their pages during an hour of downtime, and then their homepage, when you searched for Disney World, was, like, “Not found.” Oh, jeez, Disney World, not so good.

  • International and multi-language targeting issues. I won’t go into that. But you have to know the protocols there. Duplicate content, syndication, scrapers. How do we handle all that? Somebody else wants to take our content, put it on their site, what should we do? Someone’s scraping our content. What can we do? We have duplicate content on our own site. What should we do?
  • Diagnosing traffic drops via analytics and metrics. Being able to look at a rankings report, being able to look at analytics connecting those up and trying to see: Why did we go up or down? Did we have less pages being indexed, more pages being indexed, more pages getting traffic less, more keywords less?
  • Understanding advanced search parameters. Today, just today, I was checking out the related parameter in Google, which is fascinating for most sites. Well, for Moz, weirdly, related:oursite.com shows nothing. But for virtually every other site, well, most other sites on the web, it does show some really interesting data, and you can see how Google is connecting up, essentially, intentions and topics from different sites and pages, which can be fascinating, could expose opportunities for links, could expose understanding of how they view your site versus your competition or who they think your competition is.

Then there are tons of parameters, like inurl: and inanchor:, and da, da, da, da. Inanchor: doesn’t work anymore, never mind about that one.

I have to go faster, because we’re just going to run out of these. Like, come on. Interpreting and leveraging data in Google Search Console. If you don’t know how to use that, Google could be telling you, you have all sorts of errors, and you don’t know what they are.

  • Leveraging topic modeling and extraction. Using all these cool tools that are coming out for better keyword research and better on-page targeting. I talked about a couple of those at MozCon, like MonkeyLearn. There’s the new Moz Context API, which will be coming out soon, around that. There’s the Alchemy API, which a lot of folks really like and use.
  • Identifying and extracting opportunities based on site crawls. You run a Screaming Frog crawl on your site and you’re going, “Oh, here’s all these problems and issues.” If you don’t have these technical skills, you can’t diagnose that. You can’t figure out what’s wrong. You can’t figure out what needs fixing, what needs addressing.
  • Using rich snippet format to stand out in the SERPs. This is just getting a better click-through rate, which can seriously help your site and obviously your traffic.
  • Applying Google-supported protocols like rel=canonical, meta description, rel=prev/next, hreflang, robots.txt, meta robots, x robots, NOODP, XML sitemaps, rel=nofollow. The list goes on and on and on. If you’re not technical, you don’t know what those are, you think you just need to write good content and lower your bounce rate, it’s not going to work.
  • Using APIs from services like AdWords or Mozscape, or Ahrefs or Majestic, or SEMrush or Searchmetrics, or the Alchemy API. Those APIs can have powerful things that they can do for your site. There are some powerful problems they could help you solve if you know how to use them. It’s actually not that hard to write something, even inside a Google Doc or Excel, to pull from an API and get some data in there (see the sketch after this list). There’s a bunch of good tutorials out there. Richard Baxter has one, Annie Cushing has one, I think Distilled has some. So really cool stuff there.
  • Diagnosing page load speed issues, which goes right to what Jayson was talking about. You need that fast-loading page. Well, if you don’t have any technical skills, you can’t figure out why your page might not be loading quickly.
  • Diagnosing mobile friendliness issues
  • Advising app developers on the new protocols around App deep linking, so that you can get the content from your mobile apps into the web search results on mobile devices. Awesome. Super powerful. Potentially crazy powerful, as mobile search is becoming bigger than desktop.
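
To make the earlier API point concrete: pulling JSON from an HTTP API and tabulating it really does take only a few lines. In this sketch the endpoint, key and response fields are hypothetical stand-ins, not a real service:

// Hypothetical endpoint and response shape; substitute an API you actually have access to.
const API_URL = 'https://api.example.com/v1/link-metrics?domain=moz.com';

fetch(API_URL, { headers: { Authorization: 'Bearer YOUR_API_KEY' } })
  .then(function (response) { return response.json(); })
  .then(function (rows) {
    // e.g. rows = [{ url: '...', linkingDomains: 123 }, ...]
    rows.forEach(function (row) {
      console.log(row.url + '\t' + row.linkingDomains);
    });
  });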

Okay, I’m going to take a deep breath and relax. I don’t know Jayson’s intention, and in fact, if he were in this room, he’d be like, “No, I totally agree with all those things. I wrote the article in a rush. I had no idea it was going to be big. I was just trying to make the broader points around you don’t have to be a coder in order to do SEO.” That’s completely fine.

So I’m not going to try and rain criticism down on him. But I think if you’re reading that article, or you’re seeing it in your feed, or your clients are, or your boss is, or other folks are in your world, maybe you can point them to this Whiteboard Friday and let them know, no, that’s not quite right. There’s a ton of technical SEO that is required in 2015 and will be for years to come, I think, that SEOs have to have in order to be effective at their jobs.

All right, everyone. Look forward to some great comments, and we’ll see you again next time for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com


Reblogged 4 years ago from tracking.feedpress.it

Case Study: How I Turned Autocomplete Ideas into Traffic & Ranking Results with Only 5 Hours of Effort

Posted by jamiejpress

Many of us have known for a while that Google Autocomplete can be a useful tool for identifying keyword opportunities. But did you know it is also an extremely powerful tool for content ideation?

And by pushing the envelope a little further, you can turn an Autocomplete topic from a good content idea into a link-building, traffic-generating powerhouse for your website.

Here’s how I did it for one of my clients. They are in the diesel power generator industry in the Australian market, but you can use this same process for businesses in literally any industry and market you can think of.

Step 1: Find the spark of an idea using Google Autocomplete

I start by seeking out long-tail keyword ideas from Autocomplete. By typing in some of my client’s core keywords, I come across one that sparks my interest in particular—diesel generator fuel consumption.

What’s more, the Google AdWords Keyword Planner says it is a high competition term. So advertisers are prepared to spend good money on this phrase—all the better to try to rank well organically for the term. We want to get the traffic without incurring the click costs.
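
As an aside, if you would rather pull suggestions in bulk than type seed terms by hand, Google exposes an unofficial suggest endpoint that returns the same completions as the search box. A sketch (this endpoint is widely used but undocumented, so it may change or disappear without notice):

// Fetch autocomplete suggestions for a seed keyword.
const seed = 'diesel generator fuel consumption';
fetch('https://suggestqueries.google.com/complete/search?client=firefox&q=' +
      encodeURIComponent(seed))
  .then(function (response) { return response.json(); })
  .then(function (data) {
    // Response shape: [query, [suggestion1, suggestion2, ...]]
    data[1].forEach(function (suggestion) { console.log(suggestion); });
  });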


Step 2: Check the competition and find an edge

Next, we find out what pages rank well for the phrase, and then identify how we can do better, with user experience top of mind.

In the case of “diesel generator fuel consumption” in Google.com.au, the top-ranking page is this one: a US-focused piece of content using gallons instead of litres.


This observation, paired with the fact that the #2 Autocomplete suggestion was “diesel generator fuel consumption in litres”, gives me the right slant for the content that will give us the edge over the top competing page: Why not create a table using metric measurements instead of imperial measurements for our Australian audience?

So that’s what I do.

I work with the client to gather the information and create the post on their website. I insert the target phrase in the page title, meta description, URL, and once in the body content. We also create a downloadable PDF with similar content.


Note: While figuring out how to make product/service pages better than those of competitors is the age-old struggle when it comes to working on core SEO keywords, with longer-tail keywords like the ones you work with using this tactic, users generally want detailed information, answers to questions, or implementable tips. So it makes it a little easier to figure out how you can do it better by putting yourself in the user’s shoes.

Step 3: Find the right way to market the content

If people are searching for the term in Google, then there must also be people on forums asking about it.

A quick search through Quora, Reddit and other forums brings up some relevant threads. I engage with the users in these forums and add non-spammy, helpful, nofollowed links to our new content when answering their questions.

Caveat: Forum marketing has had a bad reputation for some time, and rightly so, as SEOs have abused the tactic. Before you go linking to your content in forums, I strongly recommend you check out this resource on the right way to engage in forum marketing.

Okay, what about the results?

Since I posted the page in December 2014, referral traffic from the forums has been picking up speed; organic traffic to the page keeps building, too.



Yeah, yeah, but what about keyword rankings?

While we’ve yet to knock the top-ranking post off its perch (give us time!), we are sitting at #2 and #3 in the search results as I write this. So it looks like creating that downloadable PDF paid off.


All in all, this tactic took minimal time to plan and execute—content ideation, research and creation (including the PDF version) took three hours, while link building research and implementation took an additional two hours. That’s only five hours, yet the payoff for the client is already evident, and will continue to grow in the coming months.

Why not take a crack at using this technique yourself? I would love to hear your ideas about how you could use it to benefit your business or clients.


Reblogged 4 years ago from tracking.feedpress.it

Becoming Better SEO Scientists – Whiteboard Friday

Posted by MarkTraphagen

Editor’s note: Today we’re featuring back-to-back episodes of Whiteboard Friday from our friends at Stone Temple Consulting. Make sure to also check out the second episode, “UX, Content Quality, and SEO” from Eric Enge.

Like many other areas of marketing, SEO incorporates elements of science. It becomes problematic for everyone, though, when theories that haven’t been the subject of real scientific rigor are passed off as proven facts. In today’s Whiteboard Friday, Stone Temple Consulting’s Mark Traphagen is here to teach us a thing or two about the scientific method and how it can be applied to our day-to-day work.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

Video transcription

Howdy, Mozzers. Mark Traphagen from Stone Temple Consulting here today to share with you how to become a better SEO scientist. We know that SEO is a science in a lot of ways, and everything I’m going to say today applies not only to SEO, but to testing things like your AdWords: how does that work, quality scores. There’s a lot of different applications you can make in marketing, but we’ll focus on the SEO world because that’s where we do a lot of testing. What I want to talk to you about today is how that really is a science and how we need to bring better science to it to get better results.

The reason is that in astrophysics and fields like that, there’s something they’re talking about these days called dark matter, and dark matter is something that we know is there. It’s pretty much accepted that it’s there. We can’t see it. We can’t measure it directly. We don’t even know what it is. We can’t even imagine what it is yet, and yet we know it’s there because we see its effect on things like gravity and mass. Its effects are everywhere. And that’s a lot like search engines, isn’t it? It’s like Google or Bing. We see the effects, but we don’t see inside the machine. We don’t know exactly what’s happening in there.

An artist’s depiction of how search engines work.

So what do we do? We do experiments. We do tests to try to figure that out, to see the effects, and from the effects outside we can make better guesses about what’s going on inside and do a better job of giving those search engines what they need to connect us with our customers and prospects. That’s the goal in the end.

Now, the problem is there’s a lot of testing going on out there, a lot of experiments that maybe aren’t being run very well. They’re not being run according to scientific principles that have been proven over centuries to get the best possible results.

Basic data science in 10 steps

So today I want to give you just very quickly 10 basic things that a real scientist goes through on their way to trying to give you better data. Let’s see what we can do with those in our SEO testing in the future.

So let’s start with number one. You’ve got to start with a hypothesis. Your hypothesis is the question that you want to solve. You always start with that, a good question in mind, and it’s got to be relatively narrow. You’ve got to narrow it down to something very specific. Something like how does time on page affect rankings, that’s pretty narrow. That’s very specific. That’s a good question. Might be able to test that. But something like how do social signals affect rankings, that’s too broad. You’ve got to narrow it down. Get it down to one simple question.

Then you choose a variable that you’re going to test. Out of all the things that you could do, that you could play with or you could tweak, you should choose one thing or at least a very few things that you’re going to tweak and say, “When we tweak this, when we change this, when we do this one thing, what happens? Does it change anything out there in the world that we are looking at?” That’s the variable.

The next step is to set a sample group. Where are you going to gather the data from? Where is it going to come from? That’s the world that you’re working in here. Out of all the possible data that’s out there, where are you going to gather your data and how much? That’s the small circle within the big circle. Now even though it’s smaller, you’re probably not going to get all the data in the world. You’re not going to scrape every search ranking that’s possible or visit every URL.

You’ve got to ask yourself, “Is it large enough that we’re at least going to get some validity?” If I wanted to find out what the typical person in Seattle is like and I just walked through one part of the Moz offices here, I’d get some kind of view. But is that a typical, average person from Seattle? Having been around here at Moz, probably not. Nor would that sample be large enough.

Also, it should be randomized as much as possible. Again, going back to that example, if I just stayed here within the walls of Moz and did research about Mozzers, I’d learn a lot about what Mozzers do, what Mozzers think, how they behave. But that may or may not be applicable to the larger world outside, so you randomize.

Fifth, we want a control. So we’ve got our sample group. If possible, it’s always good to have another sample group that you don’t do anything to. You do not manipulate the variable in that group. Now, why do you have that? You have it so that you can say, to some extent: if we saw a change when we manipulated our variable and we did not see the same change in the control group, then more likely it’s not just part of the natural fluctuations of the world or of the search engine.

If possible, even better, you want to make that what scientists call double blind, which means that even you, the experimenter, don’t know who the control group is out of all the SERPs you’re looking at or whatever it is. As careful as you might be and as honest as you might be, you can end up influencing the results if you know who is who within the test group. It’s not going to apply to every test that we do in SEO, but it’s a good thing to have in mind as you work.

Next, very quickly, duration. How long does it have to be? Is there sufficient time? If you’re just testing like if I share a URL to Google +, how quickly does it get indexed in the SERPs, you might only need a day on that because typically it takes less than a day in that case. But if you’re looking at seasonality effects, you might need to go over several years to get a good test on that.

Let’s move to the second group here. The sixth thing: keep a clean lab. What that means is to try as much as possible to keep out anything that might be dirtying your results, any kind of variables creeping in that you didn’t want in the test. That’s hard to do, especially with what we’re testing, but do the best you can to keep out the dirt.

Manipulate only one variable. Out of all the things that you could tweak or change, choose one thing, or a very small set of things. That will give more accuracy to your test. The more variables you change, the more side effects and interaction effects are going to happen that you may not be accounting for, and they will muddy your results.

Make sure you have statistical validity when you go to analyze those results. Now that’s beyond the scope of this little talk, but you can read up on that. Or even better, if you are able to, hire somebody or work with somebody who is a trained data scientist or has training in statistics so they can look at your evaluation and say the correlations or whatever you’re seeing, “Does it have a statistical significance?” Very important.
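
To make that concrete, here is a minimal sketch of one common test, a two-proportion z-test, comparing, say, the click-through rate of a test group against a control group. It is illustrative only; a trained analyst will also check the assumptions behind the test:

// Two-proportion z-test: is the difference between two rates statistically significant?
function twoProportionZ(successA, totalA, successB, totalB) {
  const pA = successA / totalA;
  const pB = successB / totalB;
  const pooled = (successA + successB) / (totalA + totalB);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / totalA + 1 / totalB));
  return (pA - pB) / standardError;
}

// |z| > 1.96 corresponds to p < 0.05, two-tailed.
console.log(twoProportionZ(120, 1000, 90, 1000)); // ~2.19, significant at the 5% level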

Transparency. As much as possible, share with the world your data set, your full results, your methodology. What did you do? How did you set up the study? That’s going to be important to our last step here, which is replication and falsification, one of the most important parts of any scientific process.

So what you want to invite is, hey we did this study. We did this test. Here’s what we found. Here’s how we did it. Here’s the data. If other people ask the same question again and run the same kind of test, do they get the same results? Somebody runs it again, do they get the same results? Even better, if you have some people out there who say, “I don’t think you’re right about that because I think you missed this, and I’m going to throw this in and see what happens,” aha they falsify. That might make you feel like you failed, but it’s success because in the end what are we after? We’re after the truth about what really works.

Think about your next test, your next experiment that you do. How can you apply these 10 principles to do better testing, get better results, and have better marketing? Thanks.

Video transcription by Speechpad.com


Reblogged 4 years ago from tracking.feedpress.it

Eliminate Duplicate Content in Faceted Navigation with Ajax/JSON/JQuery

Posted by EricEnge

One of the classic problems in SEO is that while complex navigation schemes may be useful to users, they create problems for search engines. Many publishers rely on tags such as rel=canonical, or the parameters settings in Webmaster Tools to try and solve these types of issues. However, each of the potential solutions has limitations. In today’s post, I am going to outline how you can use JavaScript solutions to more completely eliminate the problem altogether.

Note that I am not going to provide code examples in this post, but I am going to outline how it works on a conceptual level. If you are interested in learning more about Ajax/JSON/jQuery here are some resources you can check out:

  1. Ajax Tutorial
  2. Learning Ajax/jQuery

Defining the problem with faceted navigation

Having a page of products and then allowing users to sort those products the way they want (sorted from highest to lowest price), or to use a filter to pick a subset of the products (only those over $60) makes good sense for users. We typically refer to these types of navigation options as “faceted navigation.”

However, faceted navigation can cause problems for search engines because they don’t want to crawl and index all of your different sort orders or all your different filtered versions of your pages. They would end up with many different variants of your pages that are not significantly different from a search engine user experience perspective.

Solutions such as rel=canonical tags and parameters settings in Webmaster Tools have some limitations. For example, rel=canonical tags are considered “hints” by the search engines, and they may not choose to accept them, and even if they are accepted, they do not necessarily keep the search engines from continuing to crawl those pages.

A better solution might be to use JSON and jQuery to implement your faceted navigation so that a new page is not created when a user picks a filter or a sort order. Let’s take a look at how it works.

Using JSON and jQuery to filter on the client side

The main benefit of the implementation discussed below is that a new URL is not created when a user is on a page of yours and applies a filter or sort order. When you use JSON and jQuery, the entire process happens on the client device without involving your web server at all.

When a user initially requests one of the product pages on your web site, the interaction looks like this:

[Diagram: using JSON on faceted navigation]

This transfers the page to the browser that requested it. Now when a user picks a sort order (or filter) on that page, here is what happens:

[Diagram: jQuery and faceted navigation]

When the user picks one of those options, a jQuery request is made to the JSON data object. Translation: the entire interaction happens within the client’s browser and the sort or filter is applied there. Simply put, the smarts to handle that sort or filter resides entirely within the code on the client device that was transferred with the initial request for the page.
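
A stripped-down sketch of the idea, in plain JavaScript for brevity (jQuery would typically handle the event binding and DOM updates, and the element IDs here are assumed to exist in the page):

// Product data embedded in the page as a JSON array at initial load.
const products = [
  { name: 'Generator A', price: 45 },
  { name: 'Generator B', price: 80 },
  { name: 'Generator C', price: 62 }
];

// Redraw the list (assumes a <ul id="product-list"> in the page).
function render(list) {
  document.getElementById('product-list').innerHTML = list
    .map(function (p) { return '<li>' + p.name + ': $' + p.price + '</li>'; })
    .join('');
}

// Filtering and sorting happen entirely in the browser: no new request, no new URL.
document.getElementById('over-60').addEventListener('click', function () {
  render(products
    .filter(function (p) { return p.price > 60; })
    .sort(function (a, b) { return a.price - b.price; }));
});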

As a result, there is no new page created and no new URL for Google or Bing to crawl. Any concerns about crawl budget or inefficient use of PageRank are completely eliminated. This is great stuff! However, there remain limitations in this implementation.

Specifically, if your list of products spans multiple pages on your site, the sorting and filtering will only be applied to the data set already transferred to the user’s browser with the initial request. In short, you may only be sorting the first page of products, and not across the entire set of products. It’s possible to have the initial JSON data object contain the full set of pages, but this may not be a good idea if the page size ends up being large. In that event, we will need to do a bit more.

What Ajax does for you

Now we are going to dig in slightly deeper and outline how Ajax will allow us to handle sorting, filtering, AND pagination. Warning: There is some tech talk in this section, but I will try to follow each technical explanation with a layman’s explanation about what’s happening.

The conceptual Ajax implementation looks like this:

[Diagram: Ajax and faceted navigation]

In this structure, we are using an Ajax layer to manage the communications with the web server. Imagine that we have a set of 10 pages, the user has gotten the first page of those 10 on their device, and then requests a change to the sort order. The Ajax layer requests a fresh set of data from the web server for your site, similar to a normal HTML transaction, except that it runs asynchronously in a separate thread.

If you don’t know what that means, the benefit is that the rest of the page can load completely while the process to capture the data that the Ajax will display is running in parallel. This will be things like your main menu, your footer links to related products, and other page elements. This can improve the perceived performance of the page.

When a user selects a different sort order, the code registers an event handler for a given object (e.g. HTML Element or other DOM objects) and then executes an action. The browser will perform the action in a different thread to trigger the event in the main thread when appropriate. This happens without needing to execute a full page refresh, only the content controlled by the Ajax refreshes.

To translate this for the non-technical reader, it just means that we can update the sort order of the page, without needing to redraw the entire page, or change the URL, even in the case of a paginated sequence of pages. This is a benefit because it can be faster than reloading the entire page, and it should make it clear to search engines that you are not trying to get some new page into their index.

Effectively, it does this within the existing Document Object Model (DOM), which you can think of as the basic structure of the documents and a spec for the way the document is accessed and manipulated.
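
Conceptually, the handler for a sort control then looks something like the sketch below. The /api/products endpoint is a made-up example; the point is that the data is fetched asynchronously and only the list is redrawn, while the URL never changes:

// When the user picks a new sort order, fetch the full re-sorted product set
// and redraw only the list (render() as in the earlier sketch).
document.getElementById('sort-order').addEventListener('change', function (event) {
  fetch('/api/products?sort=' + encodeURIComponent(event.target.value))
    .then(function (response) { return response.json(); })
    .then(function (products) { render(products); });
});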

How will Google handle this type of implementation?

For those of you who read Adam Audette’s excellent recent post on the tests his team performed on how Google reads Javascript, you may be wondering if Google will still load all these page variants on the same URL anyway, and if they will not like it.

I had the same question, so I reached out to Google’s Gary Illyes to get an answer. Here is the dialog that transpired:

Eric Enge: I’d like to ask you about using JSON and jQuery to render different sort orders and filters within the same URL. I.e. the user selects a sort order or a filter, and the content is reordered and redrawn on the page on the client site. Hence no new URL would be created. It’s effectively a way of canonicalizing the content, since each variant is a strict subset.

Then there is a second level consideration with this approach, which involves doing the same thing with pagination. I.e. you have 10 pages of products, and users still have sorting and filtering options. In order to support sorting and filtering across the entire 10 page set, you use an Ajax solution, so all of that still renders on one URL.

So, if you are on page 1, and a user executes a sort, they get that all back in that one page. However, to do this right, going to page 2 would also render on the same URL. Effectively, you are taking the 10 page set and rendering it all within one URL. This allows sorting, filtering, and pagination without needing to use canonical, noindex, prev/next, or robots.txt.

If this was not problematic for Google, the only downside is that it makes the pagination not visible to Google. Does that make sense, or is it a bad idea?

Gary Illyes: If you have one URL only, and people have to click on stuff to see different sort orders or filters for the exact same content under that URL, then typically we would only see the default content.

If you don’t have pagination information, that’s not a problem, except we might not see the content on the other pages that are not contained in the HTML within the initial page load. The meaning of rel-prev/next is to funnel the signals from child pages (page 2, 3, 4, etc.) to the group of pages as a collection, or to the view-all page if you have one. If you simply choose to render those paginated versions on a single URL, that will have the same impact from a signals point of view, meaning that all signals will go to a single entity, rather than distributed to several URLs.

Summary

Keep in mind, the reason why Google implemented tags like rel=canonical, NoIndex, rel=prev/next, and others is to reduce their crawling burden and overall page bloat and to help focus signals to incoming pages in the best way possible. The use of Ajax/JSON/jQuery as outlined above does this simply and elegantly.

On most e-commerce sites, there are many different “facets” of how a user might want to sort and filter a list of products. With the Ajax-style implementation, this can be done without creating new pages. The end users get the control they are looking for, the search engines don’t have to deal with excess pages they don’t want to see, and signals in to the site (such as links) are focused on the main pages where they should be.

The one downside is that Google may not see all the content when it is paginated. A site that has lots of very similar products in a paginated list does not have to worry too much about Google seeing all the additional content, so this isn’t much of a concern if your incremental pages contain more of what’s on the first page. Sites that have content that is materially different on the additional pages, however, might not want to use this approach.

These solutions do require Javascript coding expertise but are not really that complex. If you have the ability to consider a path like this, you can free yourself from trying to understand the various tags, their limitations, and whether or not they truly accomplish what you are looking for.

Credit: Thanks to Clark Lefavour for providing a review of the above for technical correctness.


Reblogged 4 years ago from tracking.feedpress.it

Misuses of 4 Google Analytics Metrics Debunked

Posted by Tom.Capper

In this post, I’ll pull apart four of the most commonly used metrics in Google Analytics, explain how they are collected, and show why they are so easily misinterpreted.

Average Time on Page

Average time on page should be a really useful metric, particularly if you’re interested in engagement with content that’s all on a single page. Unfortunately, this is actually its worst use case. To understand why, you need to understand how time on page is calculated in Google Analytics:

Time on Page: Total across all pageviews of time from pageview to last engagement hit on that page (where an engagement hit is any of: next pageview, interactive event, e-commerce transaction, e-commerce item hit, or social plugin). (Source)

If there is no subsequent engagement hit, or if there is a gap between the last engagement hit on a site and leaving the site, the assumption is that no further time was spent on the site. Below are some scenarios with an intuitive time on page of 20 seconds, and their Google Analytics time on page:

Scenario 1: 0s: Pageview; 10s: Social plugin; 20s: Click through to next page
Intuitive time on page: 20s. GA time on page: 20s.

Scenario 2: 0s: Pageview; 10s: Social plugin; 20s: Leave site
Intuitive time on page: 20s. GA time on page: 10s.

Scenario 3: 0s: Pageview; 20s: Leave site
Intuitive time on page: 20s. GA time on page: 0s.

Google doesn’t want exits to influence the average time on page, because of scenarios like the third example above, where they have a time on page of 0 seconds (source). To avoid this, they use the following formula (remember that Time on Page is a total):

Average Time on Page: (Time on Page) / (Pageviews – Exits)

However, as the second example above shows, this assumption doesn’t always hold. The second example feeds into the numerator of the average time on page fraction, but not the denominator:

Average Time on Page (all three scenarios): (20s + 10s + 0s) / (3 – 2) = 30s
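To make the arithmetic concrete, here is the same calculation as a small TypeScript sketch:

// Applying the average time on page formula to the three scenarios above.
const gaTimesOnPage = [20, 10, 0]; // seconds: scenarios 1, 2, and 3
const pageviews = 3;
const exits = 2; // scenarios 2 and 3 end with the user leaving the site

const totalTimeOnPage = gaTimesOnPage.reduce((sum, t) => sum + t, 0); // 30s
const avgTimeOnPage = totalTimeOnPage / (pageviews - exits); // 30 / 1 = 30s
console.log(avgTimeOnPage); // 30, longer than any individual visit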

There are two issues here:

  1. Overestimation
    Excluding exits from the second half of the average time on page equation doesn’t have the desired effect when their time on page wasn’t 0 seconds—note that 30s is longer than any of the individual visits. This is why average time on page can often be longer than average visit duration. Nonetheless, 30 seconds doesn’t seem too far out in the above scenario (the intuitive average is 20s), but in the real world many pages have much higher exit rates than the 67% in this example, and/or much less engagement with events on page.
  2. Ignored visits
    For visitors who exit without an engagement hit, it makes no difference whether they stayed for 2 seconds, 10 minutes, or anything in between: their visit doesn’t influence average time on page in the slightest. On many sites, a 10-minute view of a single page without interaction (e.g. a blog post) would be considered a success, but it wouldn’t register in this metric.

Solution: Unfortunately, there isn’t an easy solution to this issue. If you want to use average time on page, you just need to keep in mind how it’s calculated. You could also consider setting up more engagement events on page (like a scroll event without the “nonInteraction” parameter)—this solves issue #2 above, but potentially worsens issue #1.
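For instance, with the classic analytics.js library, an event sent without the nonInteraction flag counts as an engagement hit. A minimal sketch (the category and action names here are arbitrary, and the 75% threshold is just one choice):

declare const ga: (...args: unknown[]) => void; // analytics.js global

// Fire a single engagement event once the reader scrolls 75% of the way
// down. Omitting nonInteraction makes this an interactive hit, so it
// extends measured time on page (helping issue #2) while adding more
// engagement hits (potentially worsening issue #1).
let sent = false;
window.addEventListener('scroll', () => {
  const depth = (window.scrollY + window.innerHeight) / document.body.scrollHeight;
  if (!sent && depth >= 0.75) {
    sent = true;
    ga('send', 'event', 'Engagement', 'Scroll depth', '75%');
  }
});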

Site Speed

If you’ve used the Site Speed reports in Google Analytics in the past, you’ve probably noticed that the numbers can sometimes be pretty difficult to believe. This is because the way that Site Speed is tracked is extremely vulnerable to outliers—it starts with a 1% sample of your users and then takes a simple average for each metric. This means that a few extreme values (for example, the occasional user with a malware-infested computer or a questionable wifi connection) can create a very large swing in your data.

The use of an average as a metric is not in itself bad, but in an area so prone to outliers and working with such a small sample, it can lead to questionable results.

Fortunately, you can increase the sampling rate right up to 100% (or the cap of 10,000 hits per day). Depending on the size of your site, this may still only be useful for top-level data. For example, if your site gets 1,000,000 hits per day and you’re interested in the performance of a new page that’s receiving 100 hits per day, Google Analytics will throttle your sampling back to the 10,000 hits per day cap—1%. As such, you’ll only be looking at a sample of 1 hit per day for that page.

Solution: Turn up the sampling rate. If you receive more than 10,000 hits per day, keep the sampling rate in mind when digging into less visited pages. You could also consider external tools and testing, such as Pingdom or WebPagetest.
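With the classic analytics.js snippet, the sampling rate is set when the tracker is created; a sketch, with a placeholder property ID:

declare const ga: (...args: unknown[]) => void; // analytics.js global

// Collect site speed timings from 100% of users instead of the 1% default.
// Google Analytics still caps processed timing hits at 10,000 per day.
ga('create', 'UA-XXXXX-Y', 'auto', { siteSpeedSampleRate: 100 });
ga('send', 'pageview');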

Conversion Rate (by channel)

Obviously, conversion rate is not in itself a bad metric, but it can be rather misleading in certain reports if you don’t realise that, by default, conversions are attributed using a last non-direct click attribution model.

From Google Analytics Help:

“…if a person clicks over to your site from google.com, then returns as “direct” traffic to convert, Google Analytics will report 1 conversion for “google.com / organic” in All Traffic.”

This means that when you’re looking at conversion numbers in your acquisition reports, it’s quite possible that every single number differs from what you’d expect under last click: every channel other than direct includes some conversions that actually occurred during direct sessions, while direct’s own total is missing those same conversions.
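A toy sketch of the model (not GA’s actual code) makes the behaviour easier to see:

// Toy model of last non-direct click attribution; not GA's actual code.
// The conversion is credited to the most recent non-direct touchpoint,
// falling back to direct only when no other channel exists.
function lastNonDirectClick(touchpoints: string[]): string {
  for (let i = touchpoints.length - 1; i >= 0; i--) {
    if (touchpoints[i] !== 'direct') {
      return touchpoints[i];
    }
  }
  return 'direct';
}

console.log(lastNonDirectClick(['google / organic', 'direct'])); // 'google / organic'
console.log(lastNonDirectClick(['direct', 'direct']));           // 'direct'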

Solution: This is just something to be aware of. If you do want to know your last-click numbers, there’s always the Multi-Channel Funnels and Attribution reports to help you out.

Exit Rate

Unlike some of the other metrics I’ve discussed here, the calculation behind exit rate is very intuitive: “for all pageviews to the page, Exit Rate is the percentage that were the last in the session.” The problem with exit rate is that it’s so often used as a negative metric: “Which pages had the highest exit rate? They’re the problem with our site!” Sometimes this might be true; if those pages sit in the middle of a checkout funnel, for example.

Often, however, a user will exit a site when they’ve found what they want. This doesn’t just mean that a high exit rate is OK on informational pages like blog posts or about pages; it can also be true of product pages and other pages with a highly conversion-focused intent. Even on ecommerce sites, not every visitor has the intention of converting. They might be researching towards a later online purchase, or even planning to visit your physical store. This is particularly true if your site ranks well for long-tail queries or is referenced elsewhere. In this case, an exit could be a sign that they found the information they wanted and are ready to purchase once they have the money, the need, or the right device at hand, or the next time they’re passing by your shop.

Solution: When judging a page by its exit rate, think about the various possible user intents. It could be useful to take a segment of visitors who exited on a certain page (in the Advanced tab of the new segment menu), and investigate their journey in User Flow reports, or their landing page and acquisition data.

Discussion

If you know of any other similarly misunderstood metrics, have any questions, or have something to add to my analysis, tweet me at @THCapper or leave a comment below.


Reblogged 4 years ago from tracking.feedpress.it