Eliminate Duplicate Content in Faceted Navigation with Ajax/JSON/JQuery

Posted by EricEnge

One of the classic problems in SEO is that while complex navigation schemes may be useful to users, they create problems for search engines. Many publishers rely on tags such as rel=canonical, or the parameter settings in Webmaster Tools, to try to solve these types of issues. However, each of these potential solutions has limitations. In today’s post, I am going to outline how you can use JavaScript to eliminate the problem altogether.

Note that this post stays mostly at the conceptual level, with only a few small illustrative sketches along the way. If you are interested in learning more about Ajax/JSON/jQuery, here are some resources you can check out:

  1. Ajax Tutorial
  2. Learning Ajax/jQuery

Defining the problem with faceted navigation

Having a page of products and then allowing users to sort those products the way they want (e.g., from highest to lowest price), or to use a filter to pick a subset of the products (e.g., only those over $60), makes good sense for users. We typically refer to these types of navigation options as “faceted navigation.”

However, faceted navigation can cause problems for search engines because they don’t want to crawl and index all of your different sort orders or all your different filtered versions of your pages. They would end up with many different variants of your pages that are not significantly different from a search engine user experience perspective.

Solutions such as rel=canonical tags and parameter settings in Webmaster Tools have some limitations. For example, rel=canonical tags are treated as “hints” by the search engines, which may choose not to honor them; even when they are honored, they do not necessarily keep the search engines from continuing to crawl those pages.

A better solution might be to use JSON and jQuery to implement your faceted navigation so that a new page is not created when a user picks a filter or a sort order. Let’s take a look at how it works.

Using JSON and jQuery to filter on the client side

The main benefit of the implementation discussed below is that a new URL is not created when a user is on a page of yours and applies a filter or sort order. When you use JSON and jQuery, the entire process happens on the client device without involving your web server at all.

When a user initially requests one of the product pages on your web site, the interaction looks like this:

[Diagram: using JSON in faceted navigation]

This transfers the page to the user’s browser. Now, when the user picks a sort order (or filter) on that page, here is what happens:

[Diagram: jQuery and faceted navigation]

When the user picks one of those options, jQuery reads the JSON data object already held in the browser. Translation: the entire interaction happens within the client’s browser, and the sort or filter is applied there. Simply put, the smarts to handle that sort or filter live entirely within the code on the client device that was transferred with the initial request for the page.
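Here is a minimal sketch of what that client-side handling might look like. It assumes the initial page load embedded the product data as a JSON array, and that elements with the IDs #sort-price and #product-list exist in the delivered HTML; the names and sample data are illustrative, not from the original implementation.

```javascript
// Minimal client-side sort: no new URL, no request back to the server.
// The array below stands in for the JSON data object delivered with the page.
var products = [
  { name: "Trail Shoes", price: 89.99 },
  { name: "Water Bottle", price: 14.50 },
  { name: "Running Socks", price: 9.99 }
];

// Redraw the product list from an array of product objects.
function renderProducts(list) {
  var html = list.map(function (p) {
    return "<li>" + p.name + " ($" + p.price.toFixed(2) + ")</li>";
  }).join("");
  $("#product-list").html(html);
}

// When the user asks for "price: high to low," sort the in-memory data and redraw.
$("#sort-price").on("click", function () {
  var sorted = products.slice().sort(function (a, b) { return b.price - a.price; });
  renderProducts(sorted);
});
```

Because nothing here touches the address bar or the server, there is no new URL for a crawler to find.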

As a result, there is no new page created and no new URL for Google or Bing to crawl. Any concerns about crawl budget or inefficient use of PageRank are completely eliminated. This is great stuff! However, there remain limitations in this implementation.

Specifically, if your list of products spans multiple pages on your site, the sorting and filtering will only be applied to the data set already transferred to the user’s browser with the initial request. In short, you may only be sorting the first page of products, and not across the entire set of products. It’s possible to have the initial JSON data object contain the full set of pages, but this may not be a good idea if the page size ends up being large. In that event, we will need to do a bit more.

What Ajax does for you

Now we are going to dig in slightly deeper and outline how Ajax will allow us to handle sorting, filtering, AND pagination. Warning: There is some tech talk in this section, but I will try to follow each technical explanation with a layman’s explanation about what’s happening.

The conceptual Ajax implementation looks like this:

[Diagram: Ajax and faceted navigation]

In this structure, we are using an Ajax layer to manage the communications with the web server. Imagine that we have a set of 10 pages; the user has received the first of those 10 pages on their device and then requests a change to the sort order. The Ajax layer requests a fresh set of data from your web server, similar to a normal HTML transaction, except that it runs asynchronously, in a separate thread.

If you don’t know what that means, the benefit is that the rest of the page (your main menu, your footer links to related products, and other page elements) can finish loading while the process that fetches the data the Ajax will display runs in parallel. This can improve the perceived performance of the page.

The code registers an event handler on a given object (e.g., an HTML element or another DOM object). When the user selects a different sort order, that handler fires and issues the asynchronous request; when the response arrives, a callback updates the page in the main thread. This happens without a full page refresh: only the content controlled by the Ajax refreshes.
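As a rough sketch, and assuming a hypothetical /products.json endpoint that accepts sort and page parameters (plus the renderProducts helper from the earlier sketch), the handler might look like this:

```javascript
// When the user changes the sort order, ask the server asynchronously for a
// re-sorted data set and redraw only the product list; the URL never changes.
// The /products.json endpoint and its parameters are assumptions for illustration.
$("#sort-order").on("change", function () {
  var sortBy = $(this).val();
  $.getJSON("/products.json", { sort: sortBy, page: 1 }, function (data) {
    renderProducts(data.products); // redraw just the list, not the whole page
  });
});
```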

To translate this for the non-technical reader, it just means that we can update the sort order of the page without redrawing the entire page or changing the URL, even in the case of a paginated sequence of pages. This is a benefit because it can be faster than reloading the entire page, and it should make it clear to search engines that you are not trying to get some new page into their index.

Effectively, it does this within the existing Document Object Model (DOM), which you can think of as the basic structure of the document and a spec for the way the document is accessed and manipulated.

How will Google handle this type of implementation?

For those of you who read Adam Audette’s excellent recent post on the tests his team performed on how Google reads JavaScript, you may be wondering whether Google will still load all these page variants on the same URL anyway, and whether it will take issue with that.

I had the same question, so I reached out to Google’s Gary Illyes to get an answer. Here is the dialog that transpired:

Eric Enge: I’d like to ask you about using JSON and jQuery to render different sort orders and filters within the same URL. I.e., the user selects a sort order or a filter, and the content is reordered and redrawn on the page on the client side. Hence no new URL would be created. It’s effectively a way of canonicalizing the content, since each variant is a strict subset.

Then there is a second level consideration with this approach, which involves doing the same thing with pagination. I.e. you have 10 pages of products, and users still have sorting and filtering options. In order to support sorting and filtering across the entire 10 page set, you use an Ajax solution, so all of that still renders on one URL.

So, if you are on page 1, and a user executes a sort, they get that all back in that one page. However, to do this right, going to page 2 would also render on the same URL. Effectively, you are taking the 10 page set and rendering it all within one URL. This allows sorting, filtering, and pagination without needing to use canonical, noindex, prev/next, or robots.txt.

If this was not problematic for Google, the only downside is that it makes the pagination not visible to Google. Does that make sense, or is it a bad idea?

Gary Illyes: If you have one URL only, and people have to click on stuff to see different sort orders or filters for the exact same content under that URL, then typically we would only see the default content.

If you don’t have pagination information, that’s not a problem, except we might not see the content on the other pages that are not contained in the HTML within the initial page load. The meaning of rel-prev/next is to funnel the signals from child pages (page 2, 3, 4, etc.) to the group of pages as a collection, or to the view-all page if you have one. If you simply choose to render those paginated versions on a single URL, that will have the same impact from a signals point of view, meaning that all signals will go to a single entity, rather than distributed to several URLs.

Summary

Keep in mind, the reason why Google implemented tags like rel=canonical, NoIndex, rel=prev/next, and others is to reduce their crawling burden and overall page bloat, and to help focus incoming signals on the right pages in the best way possible. The use of Ajax/JSON/jQuery as outlined above does this simply and elegantly.

On most e-commerce sites, there are many different “facets” of how a user might want to sort and filter a list of products. With the Ajax-style implementation, this can be done without creating new pages. The end users get the control they are looking for, the search engines don’t have to deal with excess pages they don’t want to see, and signals in to the site (such as links) are focused on the main pages where they should be.

The one downside is that Google may not see the content on your paginated pages. If those incremental pages simply contain more of what is already on the first page (as with a long list of very similar products), this isn’t much of a concern. Sites whose content is materially different on the additional pages, however, might not want to use this approach.

These solutions do require JavaScript coding expertise, but they are not really that complex. If you have the ability to consider a path like this, you can free yourself from trying to understand the various tags, their limitations, and whether or not they truly accomplish what you are looking for.

Credit: Thanks to Clark Lefavour for reviewing the above for technical correctness.


The Long Click and the Quality of Search Success

Posted by billslawski

“On the most basic level, Google could see how satisfied users were. To paraphrase Tolstoy, happy users were all the same. The best sign of their happiness was the “Long Click” — This occurred when someone went to a search result, ideally the top one, and did not return. That meant Google has successfully fulfilled the query.”

~ Steven Levy. In the Plex: How Google Thinks, Works, and Shapes our Lives

I often explore and read patents and papers from the search engines to try to get a sense of how they may approach different issues, and learn about the assumptions they make about search, searchers, and the Web. Lately, I’ve been keeping an eye open for papers and patents from the search engines where they talk about a metric known as the “long click.”

A recently granted Google patent places the “long click” metric at the center of a process Google may use to track which results, for a given query, searchers selected and then visited for a long time.

This concept isn’t new. In 2011, I wrote about a Yahoo patent in How a Search Engine May Measure the Quality of Its Search Results, where they discussed a metric that they refer to as a “target page success metric.” It included “dwell time” upon a result as a sign of search success (Yes, search engines have goals, too).


Another Google patent described assigning web pages “reachability scores” based upon the quality of the pages linked to from those initially visited pages. In the post Does Google Use Reachability Scores in Ranking Resources? I described how that patent might treat a long click as a sign that visitors to a page are engaged by the content its links point to, including links to videos. Google tells us in that patent that it might consider a “long click” to have been made on a video if someone watches at least half the video or 30 seconds of it. The patent suggests that a high reachability score may mean a page could be boosted in Google search results.


But the patent I’m writing about today is focused primarily upon looking at and tracking a search success metric like a long click or long dwell time. Here’s the abstract:

Modifying ranking data based on document changes

Invented by Henele I. Adams, and Hyung-Jin Kim

Assigned to Google

US Patent 9,002,867

Granted April 7, 2015

Filed: December 30, 2010

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media for determining a weighted overall quality of result statistic for a document.

One method includes receiving quality of result data for a query and a plurality of versions of a document, determining a weighted overall quality of result statistic for the document with respect to the query including weighting each version specific quality of result statistic and combining the weighted version-specific quality of result statistics, wherein each quality of result statistic is weighted by a weight determined from at least a difference between content of a reference version of the document and content of the version of the document corresponding to the version specific quality of result statistic, and storing the weighted overall quality of result statistic and data associating the query and the document with the weighted overall quality of result statistic.
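To make the abstract a little more concrete, here is a toy sketch of that weighted combination. Everything in it (the names, the use of word-overlap as the “difference” between document versions, and the normalization) is an assumption for illustration; the patent does not spell out these details.

```javascript
// Toy sketch of the weighted overall quality-of-result statistic described in
// the abstract. The similarity measure and all names are illustrative assumptions.

// One possible way to compare a version against the reference version:
// word-overlap (Jaccard) similarity, so heavily changed versions get low weight.
function similarity(referenceText, versionText) {
  var a = new Set(referenceText.toLowerCase().split(/\s+/));
  var b = new Set(versionText.toLowerCase().split(/\s+/));
  var intersection = Array.from(a).filter(function (w) { return b.has(w); }).length;
  var union = new Set(Array.from(a).concat(Array.from(b))).size;
  return union === 0 ? 0 : intersection / union;
}

// Combine version-specific quality stats (e.g., long-click rates) into a single
// overall statistic, weighting each version by how close it is to the reference.
function weightedOverallQuality(referenceText, versions) {
  var weightedSum = 0;
  var totalWeight = 0;
  versions.forEach(function (v) {
    var weight = similarity(referenceText, v.text);
    weightedSum += weight * v.qualityStat;
    totalWeight += weight;
  });
  return totalWeight === 0 ? 0 : weightedSum / totalWeight;
}
```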

This patent tells us that search results may be ranked in an order according to scores assigned to the search results by a scoring function or process that would be based upon things such as:

  • Where, and how often, query terms appear in the given document,
  • How common the query terms are in the documents indexed by the search engine, or
  • A query-independent measure of quality of the document itself.
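Those three factors are roughly the shape of a classic tf-idf-style relevance score blended with a quality prior. Purely as an illustration (this is not Google’s formula, and all names here are invented), a toy version might look like this:

```javascript
// Toy scoring function shaped like the factors listed above: term frequency in
// the document, rarity of the term across the index (IDF), and a
// query-independent quality score. Illustrative only.
function scoreDocument(queryTerms, docText, docCount, docFrequency, qualityScore) {
  var words = docText.toLowerCase().split(/\s+/);
  var score = 0;
  queryTerms.forEach(function (term) {
    var tf = words.filter(function (w) { return w === term; }).length; // how often the term appears
    var idf = Math.log(docCount / (1 + (docFrequency[term] || 0)));    // how rare the term is overall
    score += tf * idf;
  });
  return score * qualityScore; // blend in the query-independent quality measure
}
```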

Last September, I wrote about how Google might identify a category associated with a query term based upon clicks, in the post Using Query User Data To Classify Queries. In a query for “Lincoln,” the results that appear in response might be about the former US President, the city of Lincoln, Nebraska, and the model of automobile. When someone searches for [Lincoln], Google returning all three of those responses as a top result could be said to be reasonable. The patent I wrote about in that post told us that Google might collect information about “Lincoln” as a search entity, and track which category of results people clicked upon most when they performed that search, to determine what categories of pages to show other searchers. Again, that’s another measure of “search success” based upon past search history.

There likely is some value in working to increase the amount of dwell time someone spends upon the pages of your site, if you are already having some success in crafting page titles and snippets that persuade people to click on your pages when those pages appear in search results. These approaches can include such things as:

  1. Making visiting your page a positive experience in terms of things like site speed, readability, and scannability.
  2. Making visiting your page a positive experience in terms of things like the quality of the content published on your pages including spelling, grammar, writing style, interest, quality of images, and the links you share to other resources.
  3. Providing a positive experience by offering ideas worth sharing with others, and offering opportunities for commenting and interacting with others, and by being responsive to people who do leave comments.

Here are some resources I found that discuss this long click metric in terms of “dwell time”:

Your ability to create pages that can end up in a “long click” from someone who has come to your site in response to a query is also a “search success” metric on the search engine’s part, and you both succeed. Just be warned that, as the most recent patent from Google on long clicks shows us, Google will be watching to make sure that the content of your page doesn’t change too much, that people continue to click upon it in search results, and that they spend a fair amount of time upon it.

(Images for this post are from my Go Fish Digital Design Lead Devin Holmes @DevinGoFish. Thank you, Devin!)


Check Your Local Business Listings in the UK

Posted by David-Mihm

One of the most consistent refrains from the Moz community as we’ve released features over the last two years has been the desire to see Moz Local expand to countries outside the U.S. Today I’m pleased to announce that we’re embarking on our journey to global expansion with support for U.K. business listing searches in our Check Listing tool.

Some of you may remember limited U.K. functionality as part of GetListed.org, but as a very small company we couldn’t keep up with the maintenance required to present reliable results. It’s taken us longer than we would have liked to get here, but now with more resources, the Moz Local team has the bandwidth and important experience from the past year of Moz Local in the U.S. to fully support U.K. businesses.

How It Works

We’ve updated our search feature to accept both U.S. and U.K. postal codes, so just head on over to moz.com/local/search to check it out!

After entering the name of your business and a U.K. postcode, we go out and ping Google and other important local search sites in the U.K., and return what we found. Simply select the closest-matching business and we’ll proceed to run a full audit of your listings across these sites.

You can click through and discover incomplete listings, inconsistent NAP (name, address, phone) information, duplicate listings, and more.

This check listing feature is free to all Moz community members.

You’ve no doubt noted in the screenshot above that we project a listing score improvement. We do plan to release a fully-featured U.K. version of Moz Local later this spring (with the same distribution, reporting, and duplicate-closure features that are available in the U.S.), and you can enter your email address—either on that page or right here—to be notified when we do!


U.K.-Specific Partners

As I’ve mentioned in previous blog comments, there are a certain number of global data platforms (Google, Facebook, Yelp, Bing, Foursquare, and Factual, among others) where it’s valuable to be listed correctly and completely no matter which country you’re in.

But every country has its own unique set of domestically relevant players as well, and we’re pleased to have worked with two of them on this release: Central Index and Thomson Local. (Head on over to the Moz Local Learning Center for more information about country-specific data providers.)

We’re continuing discussions with a handful of other prospective data partners in the U.K. If you’re interested in working with us, please let us know!

What’s Next?

I’m sure requests for further expansion, especially to Canada and Australia, will be loud and clear in the comments below! Further expansion is on our roadmap, but it’s balanced against building a more complete feature set in the (more populous) U.S. and U.K. markets. We’ll continue to use our experience in those markets as we prioritize when and where to expand next.

A few lucky members of the Moz Local team are already on their way to BrightonSEO. So if you’re attending that awesome event later this week, please stop by our booth and let us know what you’d like to see us work on next.


SEO Dashboard Report Tour, Part 4 | SpyFu Recon

http://www.spyfu.com/recon – Continuing our SEO Progress Report journey by presenting the absolute core of SpyFu Recon Files. Impress your client.


How to Defeat Duplicate Content – Next Level

Posted by EllieWilkinson

Welcome to the third installment of Next Level! In the previous Next Level blog post, we shared a workflow showing you how to take on your competitors using Moz tools. We’re continuing the educational series with several new videos all about resolving duplicate content. Read on and level up!


Dealing with duplicate content can feel a bit like doing battle with your site’s evil doppelgänger—confusing and tricky to defeat! But identifying and resolving duplicates is a necessary part of helping search engines decide on relevant results. In this short video, learn about how duplicate content happens, why it’s important to fix, and a bit about how you can uncover it.

[Video: Next Level – Identifying Duplicates, Part 1]

[Quick clarification: Search engines don’t actively penalize duplicate content, per se; they just don’t always understand it as well, which can lead to a drop in rankings. More info here.]

Now that you have a better idea of how to identify those dastardly duplicates, let’s get rid of ’em once and for all. Watch this next video to review how to use Moz Analytics to find and fix duplicate content using three common solutions. (You’ll need a Moz Pro subscription to use Moz Analytics. If you aren’t yet a Moz Pro subscriber, you can always try out the tools with a 30-day free trial.)

Workflow summary

Here’s a review of the three common solutions to conquering duplicate content:

  1. 301 redirect. Check Page Authority to see if one page has a higher PA than the other using Open Site Explorer, then set up a 301 redirect from the duplicate page to the original page. This will ensure that they no longer compete with one another in the search results. Wondering what a 301 redirect is and how to do it? Read more about redirection here, or see the quick sketch after this list.
  2. Rel=canonical. A rel=canonical tag passes the same amount of ranking power as a 301 redirect, and there’s a bonus: it often takes less development time to implement! Add this tag to the HTML head of a web page to tell search engines that it should be treated as a copy of the “canon,” or original, page:
    <head> <link rel="canonical" href="http://moz.com/blog/" /> </head>

    If you’re curious, you can
    read more about canonicalization here.

  3. noindex, follow. Add the values “noindex, follow” to the meta robots tag to tell search engines not to include the duplicate pages in their indexes, but to crawl their links. This works really well with paginated content or if you have a system set up to tag or categorize content (as with a blog). Here’s what it should look like:
    <head> <meta name="robots" content="noindex, follow" /> </head>

    If you’re looking to block the Moz crawler, Rogerbot, you can use the robots.txt file if you prefer—he’s a good robot, and he’ll obey!
    More about meta robots (and robots.txt) here.
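For the 301 option in particular, here is a minimal sketch of what the redirect might look like if your site happens to run on a Node/Express server; the paths and port are placeholders, and on Apache or nginx you would use that server’s own redirect directive instead.

```javascript
// Minimal sketch of a 301 (permanent) redirect in an Express app.
// The paths below are placeholders for your duplicate and original URLs.
const express = require("express");
const app = express();

// Send the duplicate URL permanently to the original page.
app.get("/old-duplicate-page", function (req, res) {
  res.redirect(301, "/original-page");
});

app.listen(3000);
```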

Can’t get enough of duplicate content? Want to become a duplicate content connoisseur? This last video explains more about how Moz finds duplicates, if you’re curious. And you can read even more over at the Moz Developer Blog.

We’d love to hear about your techniques for defeating duplicates! Chime in below in the comments.


Hacking Keyword Targeting by Serving Interest-Based Searches – Whiteboard Friday

Posted by randfish

Depending on your industry, the more obvious and conversion-focused keywords you might target could be few and far between. With Google continuing to evolve, though, there’s a whole host of other areas you might look: interest-based keywords. In today’s Whiteboard Friday, Rand shows you how to find them.

For reference, here’s a still of this week’s whiteboard!

Video transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re chatting about keyword targeting and specifically some of the challenges that happen when your keyword targeting list is rather small or hyper competitive and you need to broaden out. One of the great ways you can do that is actually by hacking the interests of the people who are performing those searches, or might perform those searches in the future, or might never perform those searches, but are actually interested in the product or service that you have to offer.

Classic, traditional keyword research is all about focusing on the product or service’s purchase intent. Meaning, let’s say here’s Charles over here. Charles needs to better track his fitness. He knows that what he’d like to be able to do is get some tools to track his fitness. Maybe he’s looking at a Fitbit or something like that.

When we, doing marketing to Charles, have a fitness tracking product or a piece of software or a piece of hardware to offer him, we’re thinking about terms like fitness tracking software, track weight loss, workout measurement, and monitor workout progress, very direct, very obvious kinds of search terms that are clearly going to lead Charles from his intent right over to our website.

This is perfect keyword targeting and keyword research if you’re doing paid search, because with paid search you need a return on that investment right away. You don’t want to be bidding on keywords, generally speaking, that are not going to directly bring you sign-ups, conversions, and potential customers.

This is not so true, however, when it comes to SEO. A lot of times when folks look at their SEO campaigns, they go, “Man, the list of keywords that I could target that really say expressly I want a fitness tracking piece of software or a fitness tracking piece of hardware is not that long. Therefore, what else should I create? What other terms could I potentially go after?” That’s where you want to do a little bit more of what social display and retargeting does, which is to think about reaching people based on their interests, their attributes, and the actions that they’ve taken.

If you go to Facebook and you do some ad targeting there, it’s not based on, hey, Charles expressly did a search for fitness tracking software. But you can go and find all the people who’ve labeled fitness as an interest of theirs. You can then further refine by demographics and psychographics, job, location, income, and all these other attributes.

This is what you can do in, for example, Google’s Display Planner as well. You can say, “I want all the people who’ve read articles on MensHealth.com.” Or you can get even more specific with some kinds of advertising and say, “I only want to advertise in front of people who looked at articles specifically on cross training, because we happen to know that maybe that’s the best target group for us.”

This is a very cool process too. But in SEO we can actually merge these two things. We can put them together, and a lot of smart SEOs do this. They combine these two practices in their keyword research and targeting. They find people who like fitness, and then they talk to them. They ask them questions. This can be implicit or explicit. This can be through surveys. This can be through interviews. You kind of sit down, and you’re like, “Okay, that’s really awesome. Can you tell me more about what inspired your love for fitness? Tell me about the content that you looked at prior to this. Tell me about books that you read, people that influenced you, all those kinds of things.”

You’re trying to gather that information, those subjects of interest. Not just fitness, but other things that they touch on. Content that they may have found or liked before learning that they wanted to track their fitness progress. Websites that they frequently visit. People and brands or accounts that they follow on social media. Who are their influencers?

We learn all this, and now we have kind of this topic set for pre-interest keyword research. Pre-interest, meaning, before the party is actually interested in the product or service or solution that we provide, what are they interested in? We can do keyword research and targeting based on those things.

What’s awesome about this is it’s potentially much lower competition and earlier brand exposure, which means that all of our other efforts targeting them further down the funnel are likely to be more effective, because they’ve already been exposed to our brand. They know us. Hopefully, they like us already.

This is huge for content marketing. Very rich content opportunities. Usually, content marketing opportunities and content creation opportunities that aren’t just purely self-promotional either. You go and create content about this and you’re a fitness tracking company, well, that’s pretty typical. That’s to be expected. It’s going to be self-promotional whether it’s explicitly promotional or not.

But this type of content is very different. This type of content is all about promoting a movement or promoting information about a topic that you know potentially your subjects will have interest in, in the future, and because of that it’s much easier to promote and share without being perceived as prideful and self-promotional, which tamps down a lot of the sharing that you could get.

Instead of things like fitness tracking software, I’m going to get running trails, comparison of cross trainer sneakers, strength training exercises, healthy meals for muscle growth. Awesome.

This is really cool. This process is what you want to use in that keyword research and brainstorming. Start before you get bogged down into, hey, these are the only terms and phrases that we can target because these are the only things that express intent.

Sometimes this might cross over into PPC. Most of the time this is really useful for SEO and content creation.

All right, everyone, I look forward to seeing some tools, tactics, and tips from all of you in the comments. We’ll catch you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com
