Is Australia the land of opportunity for your retail brand?

Australia has a resident population of more than 24 million and, according to eMarketer, the country’s ecommerce sales are predicted to reach A$32.56 billion by 2017. The country’s remote location in the APAC region means that, unlike in European countries or the USA, there has traditionally been a lack of global brands sold locally.

Of course, we also know that many expatriates, particularly from inside the Commonwealth, have made Australia their home and are keen to buy products they know and love from their country of origin.

All of these factors present a huge and potentially lucrative opportunity for non-Australian brands wanting to open up their new and innovative products to a fresh market, or compete for market share.

But it’s not just non-Australian retailers who are at an advantage here: Australia was late to the ecommerce party because native, established brands were trading well without it. As a result, Australian retailers’ ecommerce technology stacks are much more recent and not burdened by legacy systems. This makes it much easier to extend, or get started with, best-of-breed technologies and cash in on a booming market. To put some of this into perspective, Magento’s ecommerce platform currently holds 42% of Australia’s market share, and the world’s first adopter of Magento 2.0 was an Australian brand.

The GST loophole

At the moment, local retailers are campaigning against a rule that exempts purchases under A$1,000 from foreign websites from the 10% goods and services tax (GST). And in 2013, Australian consumers made A$3.11 billion worth of purchases under A$1,000.[1]

While the current GST break appears to put non-Australian retailers at an advantage, Australian-based brands such as Harvey Norman are using it to their advantage by setting up ecommerce operations in Asia to enjoy the GST benefit.

Australian consumers have also countered the argument by saying that price isn’t always the motivator when it comes to making purchasing decisions.

It’s not a place where no man has gone before

Often, concerns around meeting local compliance and lack of overseas business knowledge prevent outsiders from taking the leap into cross-border trade. However, this ecommerce passport, created by Ecommerce Worldwide and NORA, is designed to support those considering selling in Australia. The guide provides a comprehensive look into everything from the country’s economy and trade status, to logistics and dealing with international payments.

Global expansion success stories are also invaluable sources of information. And it’s not just lower-end retailers that are succeeding: online luxury fashion retailer Net-a-Porter names Australia as one of its biggest markets.

How tech-savvy are the Aussies?

One of the concerns you might have as a new entrant into the market is how you’ll reach and sell to your new audience, particularly without having a physical presence. The good news is that more than 80% of the country is digitally enabled and 60% of mobile phone users own a smartphone – so online is deeply rooted in the majority of Australians’ lives.[2]

Marketing your brand

Heard the saying “Fire bullets then fire cannonballs”? In any case, you’ll want to test the waters and gauge people’s reactions to your product or service.

It all starts with the website because, without it, you’re not discoverable or searchable, and you’ve nowhere to drive people to when running campaigns. SEO and SEM should definitely be a priority, and an online store that can handle multiple regions and storefronts, like Magento, will make your life easier. A mobile-first mentality and well thought-out UX will also place you in a good position.

Once your new web store is set up, you should be making every effort to collect visitors’ email addresses, perhaps via a popover. Why? Firstly, email is one of the top three priority areas for Australian retailers, because it’s a cost-effective, scalable marketing channel that enables true personalization.

Secondly, email marketing automation empowers you to deliver the customer experience today’s consumer expects, as well as enabling you to communicate with them throughout the lifecycle. Check out our ‘Do customer experience masters really exist?’ whitepaper for some real-life success stories.

Like the Magento platform, dotmailer is set up to handle multiple languages, regions and accounts, and is designed to grow with you.

In summary, there’s great scope for ecommerce success in Australia, whether you’re a native bricks-and-mortar retailer, a start-up or a non-Australian merchant. The barriers to cross-border trade are falling and Australia is one of APAC’s most developed regions in terms of purchasing power and tech savviness.

We recently worked with ecommerce expert Chloe Thomas to produce a whitepaper on cross-border trade, which goes into much more detail on how to market and sell successfully in new territories. You can download a free copy here.

[1] Australian Passport 2015: Cross-Border Trading Report

[2] Australian Passport 2015: Cross-Border Trading Report

Reblogged 3 years ago from blog.dotmailer.com

Stop Ghost Spam in Google Analytics with One Filter

Posted by CarloSeo

The spam in Google Analytics (GA) is becoming a serious issue. Due to a deluge of referral spam from social buttons, adult sites, and many, many other sources, people are starting to become overwhelmed by all the filters they are setting up to manage the useless data they are receiving.

The good news is, there is no need to panic. In this post, I’m going to focus on the most common mistakes people make when fighting spam in GA, and explain an efficient way to prevent it.

But first, let’s make sure we understand how spam works. A couple of months ago, Jared Gardner wrote an excellent article explaining what referral spam is, including its intended purpose. He also pointed out some great examples of referral spam.

Types of spam

The spam in Google Analytics can be categorized into two types: ghosts and crawlers.

Ghosts

The vast majority of spam is this type. They are called ghosts because they never access your site. It is important to keep this in mind, as it’s key to creating a more efficient solution for managing spam.

As unusual as it sounds, this type of spam doesn’t have any interaction with your site at all. You may wonder how that is possible since one of the main purposes of GA is to track visits to our sites.

They do it by using the Measurement Protocol, which allows people to send data directly to Google Analytics’ servers. Using this method, and probably randomly generated tracking codes (UA-XXXXX-1) as well, the spammers leave a “visit” with fake data, without even knowing who they are hitting.
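To make the mechanics concrete, here is a hedged sketch of the kind of payload a ghost spammer assembles for the Measurement Protocol. Nothing is sent; the code only builds the query string that would be posted to Google’s collection endpoint. The tracking ID and referrer are invented for illustration.

```python
from urllib.parse import urlencode

def build_ghost_hit(tracking_id, fake_referrer):
    # The spammer never visits the target site; they just guess a
    # tracking ID and submit fake hit data directly to GA's servers.
    params = {
        "v": "1",                # Measurement Protocol version
        "tid": tracking_id,      # guessed property ID, e.g. UA-12345-1
        "cid": "555",            # arbitrary client ID
        "t": "pageview",         # hit type
        "dp": "/",               # fake page path
        "dr": fake_referrer,     # the spam referrer they want you to see
    }
    return urlencode(params)

payload = build_ghost_hit("UA-12345-1", "http://spam-example.com")
```

Because the hit is fabricated rather than generated by your tracking code on a real page, fields like the hostname are either missing or faked – which is exactly what the filter later in this post exploits.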

Crawlers

This type of spam, unlike ghost spam, does access your site. As the name implies, these spam bots crawl your pages, ignoring rules like those found in robots.txt that are supposed to stop them from reading your site. When they exit your site, they leave a record on your reports that appears similar to a legitimate visit.

Crawlers are harder to identify because they know their targets and use real data. But it is also true that new ones seldom appear. So if you detect a referral in your analytics that looks suspicious, researching it on Google or checking it against this list might help you answer the question of whether or not it is spammy.

Most common mistakes made when dealing with spam in GA

I’ve been following this issue closely for the last few months. According to the comments people have made on my articles and conversations I’ve found in discussion forums, there are primarily three mistakes people make when dealing with spam in Google Analytics.

Mistake #1. Blocking ghost spam from the .htaccess file

One of the biggest mistakes people make is trying to block ghost spam from the .htaccess file.

For those who are not familiar with this file, one of its main functions is to allow/block access to your site. Now we know that ghosts never reach your site, so adding them here won’t have any effect and will only add useless lines to your .htaccess file.

Ghost spam usually shows up for a few days and then disappears. As a result, sometimes people think that they successfully blocked it from here when really it’s just a coincidence of timing.

Then when the spammers later return, they get worried because the solution is not working anymore, and they think the spammer somehow bypassed the barriers they set up.

The truth is, the .htaccess file can only effectively block crawlers such as buttons-for-website.com and a few others since these access your site. Most of the spam can’t be blocked using this method, so there is no other option than using filters to exclude them.

Mistake #2. Using the referral exclusion list to stop spam

Another error is trying to use the referral exclusion list to stop the spam. The name may confuse you, but this list is not intended to exclude referrals in the way we want to for the spam. It has other purposes.

For example, when a customer buys something, sometimes they get redirected to a third-party page for payment. After making a payment, they’re redirected back to your website, and GA records that as a new referral. It is appropriate to use the referral exclusion list to prevent this from happening.

If you try to use the referral exclusion list to manage spam, however, the referral part will be stripped since there is no preexisting record. As a result, a direct visit will be recorded, and you will have a bigger problem than the one you started with: you will still have the spam, and direct visits are harder to track.

Mistake #3. Worrying that bounce rate changes will affect rankings

When people see that the bounce rate changes drastically because of the spam, they start worrying about the impact that it will have on their rankings in the SERPs.

bounce.png

This is another mistake commonly made. With or without spam, Google doesn’t take into consideration Google Analytics metrics as a ranking factor. Here is an explanation about this from Matt Cutts, the former head of Google’s web spam team.

And if you think about it, Cutts’ explanation makes sense; because although many people have GA, not everyone uses it.

Assuming your site has been hacked

Another common concern when people see strange landing pages coming from spam on their reports is that they have been hacked.

landing page

The page that the spam shows on the reports doesn’t exist, and if you try to open it, you will get a 404 page. Your site hasn’t been compromised.

But you have to make sure the page doesn’t exist. Because there are cases (not spam) where some sites have a security breach and get injected with pages full of bad keywords to defame the website.

What should you worry about?

Now that we’ve discarded security issues and their effects on rankings, the only thing left to worry about is your data. The fake trail that the spam leaves behind pollutes your reports.

It might have greater or lesser impact depending on your site traffic, but everyone is susceptible to the spam.

Small and midsize sites are the most easily impacted – not only because a big part of their traffic can be spam, but also because usually these sites are self-managed and sometimes don’t have the support of an analyst or a webmaster.

Big sites with a lot of traffic can also be impacted by spam, and although the impact can be insignificant, invalid traffic means inaccurate reports no matter the size of the website. As an analyst, you should be able to explain what’s going on even in the most granular reports.

You only need one filter to deal with ghost spam

Usually it is recommended to add the referral to an exclusion filter after it is spotted. Although this is useful for a quick action against the spam, it has three big disadvantages.

  • Making filters every week for every new spam detected is tedious and time-consuming, especially if you manage many sites. Plus, by the time you apply the filter, and it starts working, you already have some affected data.
  • Some spammers send direct visits along with the referrals. These direct hits won’t be stopped by a referral-exclusion filter, so even if you are excluding the referral you will still be receiving invalid traffic, which explains why some people have seen an unusual spike in direct traffic.

Luckily, there is a good way to prevent all these problems. Most ghost spam works by hitting random GA tracking IDs, meaning the offender doesn’t really know who the target is; for that reason, either the hostname is not set or a fake one is used. (See report below.)

Ghost-Spam.png

You can see that they use some weird names or don’t even bother to set one. Although there are some known names in the list, these can be easily added by the spammer.

On the other hand, valid traffic will always use a real hostname. In most cases this will be your domain, but it can also come from paid services, translation services, or any other place where you’ve inserted your GA tracking code.

Valid-Referral.png

Based on this, we can make a filter that will include only hits that use real hostnames. This will automatically exclude all hits from ghost spam, whether it shows up as a referral, keyword, or pageview; or even as a direct visit.

To create this filter, you will need to find the report of hostnames. Here’s how:

  1. Go to the Reporting tab in GA
  2. Click on Audience in the lefthand panel
  3. Expand Technology and select Network
  4. At the top of the report, click on Hostname

Valid-list

You will see a list of all hostnames, including the ones that the spam uses. Make a list of all the valid hostnames you find, as follows:

  • yourmaindomain.com
  • blog.yourmaindomain.com
  • es.yourmaindomain.com
  • payingservice.com
  • translatetool.com
  • anotheruseddomain.com

For small to medium sites, this list of hostnames will likely consist of the main domain and a couple of subdomains. After you are sure you got all of them, create a regular expression similar to this one:

yourmaindomain\.com|anotheruseddomain\.com|payingservice\.com|translatetool\.com

You don’t need to put all of your subdomains in the regular expression. The main domain will match all of them. If you don’t have a view set up without filters, create one now.
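Before pasting the expression into GA, it’s worth a quick sanity check. This sketch uses the placeholder hostnames from the list above and confirms that valid hostnames (including subdomains) match while ghost hostnames don’t:

```python
import re

# The same include-filter pattern used in the GA Custom Filter.
# Dots are escaped so "yourmaindomain.com" can't accidentally
# match something like "yourmaindomainXcom".
pattern = re.compile(
    r"yourmaindomain\.com|anotheruseddomain\.com|payingservice\.com|translatetool\.com"
)

valid = ["yourmaindomain.com", "blog.yourmaindomain.com", "translatetool.com"]
ghosts = ["(not set)", "forum.topic1234.darodar.com", "some-fake-host.xyz"]

assert all(pattern.search(h) for h in valid)       # subdomains match too
assert not any(pattern.search(h) for h in ghosts)  # ghost hostnames excluded
```

Note that `search` (a partial match) is the right check here, since GA filter patterns match anywhere in the hostname – which is also why subdomains don’t need to be listed separately.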

Then create a Custom Filter.

Make sure you select INCLUDE, then select “Hostname” on the filter field, and copy your expression into the Filter Pattern box.

filter

You might want to verify the filter before saving to check that everything is okay. Once you’re ready, save it and apply the filter to all the views you want (except the view without filters).

This single filter will get rid of future occurrences of ghost spam that use invalid hostnames, and it doesn’t require much maintenance. But it’s important that every time you add your tracking code to a new service, you add that service’s hostname to the end of the filter expression.

Now you should only need to take care of the crawler spam. Since crawlers access your site, you can block them by adding these lines to the .htaccess file:

## STOP REFERRER SPAM 
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR] 
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC] 
RewriteRule .* - [F]

It is important to note that this file is very sensitive, and misplacing a single character in it can bring down your entire site. Therefore, make sure you create a backup copy of your .htaccess file prior to editing it.

If you don’t feel comfortable messing around with your .htaccess file, you can alternatively make an expression with all the crawlers and add it to an exclude filter by Campaign Source.

Implement these combined solutions, and you will worry much less about spam contaminating your analytics data. This will have the added benefit of freeing up more time for you to spend actually analyzing your valid data.

After stopping spam, you can also get clean reports from the historical data by using the same expressions in an Advanced Segment to exclude all the spam.

Bonus resources to help you manage spam

If you still need more information to help you understand and deal with the spam on your GA reports, you can read my main article on the subject here: http://www.ohow.co/what-is-referrer-spam-how-stop-it-guide/.

Additional information on how to stop spam can be found at these URLs:

In closing, I am eager to hear your ideas on this serious issue. Please share them in the comments below.

(Editor’s Note: All images featured in this post were created by the author.)

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Reblogged 4 years ago from tracking.feedpress.it

Distance from Perfect

Posted by wrttnwrd

In spite of all the advice, the strategic discussions and the conference talks, we Internet marketers are still algorithmic thinkers. That’s obvious when you think of SEO.

Even when we talk about content, we’re algorithmic thinkers. Ask yourself: How many times has a client asked you, “How much content do we need?” How often do you still hear “How unique does this page need to be?”

That’s 100% algorithmic thinking: Produce a certain amount of content, move up a certain number of spaces.

But you and I know it’s complete bullshit.

I’m not suggesting you ignore the algorithm. You should definitely chase it. Understanding a little bit about what goes on in Google’s pointy little head helps. But it’s not enough.

A tale of SEO woe that makes you go “whoa”

I have this friend.

He ranked #10 for “flibbergibbet.” He wanted to rank #1.

He compared his site to the #1 site and realized the #1 site had five hundred blog posts.

“That site has five hundred blog posts,” he said, “I must have more.”

So he hired a few writers and cranked out five thousand blog posts that melted Microsoft Word’s grammar check. He didn’t move up in the rankings. I’m shocked.

“That guy’s spamming,” he decided, “I’ll just report him to Google and hope for the best.”

What happened? Why didn’t adding five thousand blog posts work?

It’s pretty obvious: My, uh, friend added nothing but crap content to a site that was already outranked. Bulk is no longer a ranking tactic. Google’s very aware of that tactic. Lots of smart engineers have put time into updates like Panda to compensate.

He started like this:

And ended up like this:
more posts, no rankings

Alright, yeah, I was Mr. Flood The Site With Content, way back in 2003. Don’t judge me, whippersnappers.

Reality’s never that obvious. You’re scratching and clawing to move up two spots, you’ve got an overtasked IT team pushing back on changes, and you’ve got a boss who needs to know the implications of every recommendation.

Why fix duplication if rel=canonical can address it? Fixing duplication will take more time and cost more money. It’s easier to paste in one line of code. You and I know it’s better to fix the duplication. But it’s a hard sell.

Why deal with 302 versus 404 response codes and home page redirection? The basic user experience remains the same. Again, we just know that a server should return one home page without any redirects and that it should send a ‘not found’ 404 response if a page is missing. If it’s going to take 3 developer hours to reconfigure the server, though, how do we justify it? There’s no flashing sign reading “Your site has a problem!”

Why change this thing and not that thing?

At the same time, our boss/client sees that the site above theirs has five hundred blog posts and thousands of links from sites selling correspondence MBAs. So they want five thousand blog posts and cheap links as quickly as possible.

Cue crazy music.

SEO lacks clarity

SEO is, in some ways, for the insane. It’s an absurd collection of technical tweaks, content thinking, link building and other little tactics that may or may not work. A novice gets exposed to one piece of crappy information after another, with an occasional bit of useful stuff mixed in. They create sites that repel search engines and piss off users. They get more awful advice. The cycle repeats. Every time it does, best practices get more muddled.

SEO lacks clarity. We can’t easily weigh the value of one change or tactic over another. But we can look at our changes and tactics in context. When we examine the potential of several changes or tactics before we flip the switch, we get a closer balance between algorithm-thinking and actual strategy.

Distance from perfect brings clarity to tactics and strategy

At some point you have to turn that knowledge into practice. You have to take action based on recommendations, your knowledge of SEO, and business considerations.

That’s hard when we can’t even agree on subdomains vs. subfolders.

I know subfolders work better. Sorry, couldn’t resist. Let the flaming comments commence.

To get clarity, take a deep breath and ask yourself:

“All other things being equal, will this change, tactic, or strategy move my site closer to perfect than my competitors?”

Breaking it down:

“Change, tactic, or strategy”

A change takes an existing component or policy and makes it something else. Replatforming is a massive change. Adding a new page is a smaller one. Adding ALT attributes to your images is another example. Changing the way your shopping cart works is yet another.

A tactic is a specific, executable practice. In SEO, that might be fixing broken links, optimizing ALT attributes, optimizing title tags or producing a specific piece of content.

A strategy is a broader decision that’ll cause change or drive tactics. A long-term content policy is the easiest example. Shifting away from asynchronous content and moving to server-generated content is another example.

“Perfect”

No one knows exactly what Google considers “perfect,” and “perfect” can’t really exist, but you can bet a perfect web page/site would have all of the following:

  1. Completely visible content that’s perfectly relevant to the audience and query
  2. A flawless user experience
  3. Instant load time
  4. Zero duplicate content
  5. Every page easily indexed and classified
  6. No mistakes, broken links, redirects or anything else generally yucky
  7. Zero reported problems or suggestions in each search engine’s webmaster tools, sorry, “Search Consoles”
  8. Complete authority through immaculate, organically-generated links

These 8 categories (and any of the other bazillion that probably exist) give you a way to break down “perfect” and help you focus on what’s really going to move you forward. These different areas may involve different facets of your organization.

Your IT team can work on load time and creating an error-free front- and back-end. Link building requires the time and effort of content and outreach teams.

Tactics for relevant, visible content and current best practices in UX are going to be more involved, requiring research and real study of your audience.

What you need and what resources you have are going to impact which tactics are most realistic for you.

But there’s a basic rule: If a website would make Googlebot swoon and present zero obstacles to users, it’s close to perfect.

“All other things being equal”

Assume every competing website is optimized exactly as well as yours.

Now ask: Will this [tactic, change or strategy] move you closer to perfect?

That’s the “all other things being equal” rule. And it’s an incredibly powerful rubric for evaluating potential changes before you act. Pretend you’re in a tie with your competitors. Will this one thing be the tiebreaker? Will it put you ahead? Or will it cause you to fall behind?

“Closer to perfect than my competitors”

Perfect is great, but unattainable. What you really need is to be just a little perfect-er.

Chasing perfect can be dangerous. Perfect is the enemy of the good (I love that quote. Hated Voltaire. But I love that quote). If you wait for the opportunity/resources to reach perfection, you’ll never do anything. And the only way to reduce distance from perfect is to execute.

Instead of aiming for pure perfection, aim for more perfect than your competitors. Beat them feature-by-feature, tactic-by-tactic. Implement strategy that supports long-term superiority.

Don’t slack off. But set priorities and measure your effort. If fixing server response codes will take one hour and fixing duplication will take ten, fix the response codes first. Both move you closer to perfect. Fixing response codes may not move the needle as much, but it’s a lot easier to do. Then move on to fixing duplicates.

Do the 60% that gets you a 90% improvement. Then move on to the next thing and do it again. When you’re done, get to work on that last 40%. Repeat as necessary.

Take advantage of quick wins. That gives you more time to focus on your bigger solutions.

Sites that are “fine” are pretty far from perfect

Google has lots of tweaks, tools and workarounds to help us mitigate sub-optimal sites:

  • Rel=canonical lets us guide Google past duplicate content rather than fix it
  • HTML snapshots let us reveal content that’s delivered using asynchronous content and JavaScript frameworks
  • We can use rel=next and prev to guide search bots through outrageously long pagination tunnels
  • And we can use rel=nofollow to hide spammy links and banners

Easy, right? All of these solutions may reduce distance from perfect (the search engines don’t guarantee it). But they don’t reduce it as much as fixing the problems.

Just fine does not equal fixed

The next time you set up rel=canonical, ask yourself:

“All other things being equal, will using rel=canonical to make up for duplication move my site closer to perfect than my competitors?”

Answer: Not if they’re using rel=canonical, too. You’re both using imperfect solutions that force search engines to crawl every page of your site, duplicates included. If you want to pass them on your way to perfect, you need to fix the duplicate content.

When you use Angular.js to deliver regular content pages, ask yourself:

“All other things being equal, will using HTML snapshots instead of actual, visible content move my site closer to perfect than my competitors?”

Answer: No. Just no. Not in your wildest, code-addled dreams. If I’m Google, which site will I prefer? The one that renders for me the same way it renders for users? Or the one that has to deliver two separate versions of every page?

When you spill banner ads all over your site, ask yourself…

You get the idea. Nofollow is better than follow, but banner pollution is still pretty dang far from perfect.

Mitigating SEO issues with search engine-specific tools is “fine.” But it’s far, far from perfect. If search engines are forced to choose, they’ll favor the site that just works.

Not just SEO

By the way, distance from perfect absolutely applies to other channels.

I’m focusing on SEO, but think of other Internet marketing disciplines. I hear stuff like “How fast should my site be?” (Faster than it is right now.) Or “I’ve heard you shouldn’t have any content below the fold.” (Maybe in 2001.) Or “I need background video on my home page!” (Why? Do you have a reason?) Or, my favorite: “What’s a good bounce rate?” (Zero is pretty awesome.)

And Internet marketing venues are working to measure distance from perfect. Pay-per-click marketing has the quality score: A codified financial reward applied for seeking distance from perfect in as many elements as possible of your advertising program.

Social media venues are aggressively building their own forms of graphing, scoring and ranking systems designed to separate the good from the bad.

Really, all marketing includes some measure of distance from perfect. But no channel is more influenced by it than SEO. Instead of arguing one rule at a time, ask yourself and your boss or client: Will this move us closer to perfect?

Hell, you might even please a customer or two.

One last note for all of the SEOs in the crowd. Before you start pointing out edge cases, consider this: We spend our days combing Google for embarrassing rankings issues. Every now and then, we find one, point, and start yelling “SEE! SEE!!!! THE GOOGLES MADE MISTAKES!!!!” Google’s got lots of issues. Screwing up the rankings isn’t one of them.


Reblogged 4 years ago from tracking.feedpress.it

5 Spreadsheet Tips for Manual Link Audits

Posted by MarieHaynes

Link auditing is the part of my job that I love the most. I have audited a LOT of links over the last few years. While there are some programs out there that can be quite helpful to the avid link auditor, I still prefer to create a spreadsheet of my links in Excel and then to audit those links one-by-one from within Google Spreadsheets. Over the years I have learned a few tricks and formulas that have helped me in this process. In this article, I will share several of these with you.

Please know that while I am quite comfortable being labelled a link auditing expert, I am not an Excel wizard. I am betting that some of the things that I am doing could be improved upon if you’re an advanced user. As such, if you have any suggestions or tips of your own I’d love to hear them in the comments section!

1. Extract the domain or subdomain from a URL

OK. You’ve downloaded links from as many sources as possible and now you want to manually visit and evaluate one link from every domain. But, holy moly, some of these domains can have THOUSANDS of links pointing to the site. So, let’s break these down so that you are just seeing one link from each domain. The first step is to extract the domain or subdomain from each url.

I am going to show you examples from a Google spreadsheet as I find that these display nicer for demonstration purposes. However, if you’ve got a fairly large site, you’ll find that the spreadsheets are easier to create in Excel. If you’re confused about any of these steps, check out the animated gif at the end of each step to see the process in action.

Here is how you extract a domain or subdomain from a url:

  • Create a new column to the left of your url column.
  • Use this formula:

    =LEFT(B1,FIND("/",B1,9)-1)

    What this will do is remove everything after the trailing slash following the domain name. http://www.example.com/article.html will now become http://www.example.com and http://www.subdomain.example.com/article.html will now become http://www.subdomain.example.com.

  • Copy our new column A and paste it right back where it was using the “paste as values” function. If you don’t do this, you won’t be able to use the Find and Replace feature.
  • Use Find and Replace to replace each of the following with a blank (i.e. nothing):
    http://
    https://
    www.

And BOOM! We are left with a column that contains just domain names and subdomain names. This animated gif shows each of the steps we just outlined:
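If you’d rather script the same extraction, this hedged Python sketch mirrors the spreadsheet steps: keep everything up to the first slash after the protocol, then strip the scheme and the “www.” prefix. The example URLs are the same ones used above.

```python
def extract_domain(url):
    # Mirror of =LEFT(B1,FIND("/",B1,9)-1): keep everything before the
    # first "/" that follows "http://" or "https://"...
    slash = url.find("/", 9)
    host = url[:slash] if slash != -1 else url
    # ...then do the Find-and-Replace step: drop the scheme and "www."
    for prefix in ("http://", "https://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    if host.startswith("www."):
        host = host[len("www."):]
    return host

print(extract_domain("http://www.example.com/article.html"))            # example.com
print(extract_domain("http://www.subdomain.example.com/article.html"))  # subdomain.example.com
```

Starting the slash search at index 9 skips the two slashes in the protocol, which is exactly why the spreadsheet formula passes 9 as FIND’s third argument.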

2. Just show one link from each domain

The next step is to filter this list so that we are just seeing one link from each domain. If you are manually reviewing links, there’s usually no point in reviewing every single link from every domain. I will throw in a word of caution here though. Sometimes a domain can have both a good link and a bad link pointing to you. Or in some cases, you may find that links from one page are followed and from another page on the same site they are nofollowed. You can miss some of these by just looking at one link from each domain. Personally, I have some checks built in to my process where I use Scrapebox and some internal tools that I have created to make sure that I’m not missing the odd link by just looking at one link from each domain. For most link audits, however, you are not going to miss very much by assessing one link from each domain.

Here’s how we do it:

  • Highlight our domains column and sort the column in alphabetical order.
  • Create a column to the left of our domains, so that the domains are in column B.
  • Use this formula:

    =IF(B1=B2,"duplicate","unique")

  • Copy that formula down the column.
  • Use the filter function so that you are just seeing the duplicates.
  • Delete those rows. Note: If you have tens of thousands of rows to delete, the spreadsheet may crash. A workaround here is to use “Clear Rows” instead of “Delete Rows” and then sort your domains column from A-Z once you are finished.

We’ve now got a list of one link from every domain linking to us.

Here’s the gif that shows each of these steps:

You may wonder why I didn’t use Excel’s dedupe function to simply deduplicate these entries. I have found that it doesn’t take much deduplication to crash Excel, which is why I do this step manually.
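If your spreadsheet keeps crashing on this step, the same dedupe can be sketched in Python (a hypothetical helper; the rows here are invented, and yours would come from your exported link file):

```python
def one_link_per_domain(rows):
    """Keep the first link seen for each domain, mirroring the
    sort + IF(B1=B2,...) + filter + delete steps described above."""
    seen = set()
    kept = []
    for domain, url in sorted(rows):
        if domain not in seen:
            seen.add(domain)
            kept.append((domain, url))
    return kept

rows = [
    ("example.com", "http://example.com/page-a"),
    ("example.com", "http://example.com/page-b"),
    ("other.com", "http://other.com/page-x"),
]
print(one_link_per_domain(rows))  # one row per domain
```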

3. Finding patterns FTW!

Sometimes when you are auditing links, you’ll find that unnatural links have patterns. I LOVE when I see these, because sometimes I can quickly go through hundreds of links without having to check each one manually. Here is an example. Let’s say that your website has a bunch of spammy directory links. As you’re auditing you notice patterns such as one of these:

  • All of these directory links come from a url that contains …/computers/internet/item40682/
  • A whole bunch of spammy links that all come from a particular free subdomain like blogspot, wordpress, weebly, etc.
  • A lot of links that all contain a particular keyword for anchor text (this is assuming you’ve included anchor text in your spreadsheet when making it.)

You can quickly find all of these links and mark them as “disavow” or “keep” by doing the following:

  • Create a new column. In my example, I am going to create a new column in Column C and look for patterns in urls that are in Column B.
  • Use this formula:

    =FIND("/item40682",B1)
    (You would replace “item40682” with the phrase that you are looking for.)

  • Copy this formula down the column.
  • Filter your new column so that you are seeing any rows that have a number in this column. If the phrase doesn't exist in that url, you'll see an error value instead, and we can ignore those.
  • Now you can mark these all as "disavow."
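The pattern check translates to a simple substring test outside the spreadsheet, too. This is a sketch with made-up urls; swap in whatever pattern you've actually spotted:

```python
def flag_links(urls, pattern):
    # Analogue of =FIND("/item40682",B1): mark a url "disavow" when the
    # pattern appears anywhere in it, otherwise "keep".
    return {url: ("disavow" if pattern in url else "keep") for url in urls}

urls = [
    "http://spammydirectory.com/computers/internet/item40682/",
    "http://legitimatesite.com/article",
]
flags = flag_links(urls, "/item40682")
```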

4. Check your disavow file

This next tip is one that you can use to check your disavow file across your list of domains that you want to audit. The goal here is to see which links you have disavowed so that you don’t waste time reassessing them. This particular tip only works for checking links that you have disavowed on the domain level.

The first thing you’ll want to do is download your current disavow file from Google. For some strange reason, Google gives you the disavow file in CSV format. I have never understood this because they want you to upload the file in .txt. Still, I guess this is what works best for Google. All of your entries will be in column A of the CSV:

What we are going to do now is add these to a new sheet on our current spreadsheet and use a VLOOKUP function to mark which of our domains we have disavowed.

Here are the steps:

  • Create a new sheet on your current spreadsheet workbook.
  • Copy and paste column A from your disavow spreadsheet onto this new sheet. Or, alternatively, use the import function to import the entire CSV onto this sheet.
  • In B1, write “previously disavowed” and copy this down the entire column.
  • Remove the “domain:” from each of the entries by doing a Find and Replace to replace domain: with a blank.
  • Now go back to your link audit spreadsheet. If your domains are in column A and if you had, say, 1500 domains in your disavow file, your formula would look like this:

    =VLOOKUP(A1,Sheet2!$A$1:$B$1500,2,FALSE)

When you copy this formula down the spreadsheet, it will check each of your domains, and if it finds the domain in Sheet 2, it will write “previously disavowed” on our link audit spreadsheet.

Here is a gif that shows the process:
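Outside of a spreadsheet, the VLOOKUP step boils down to a set-membership check. Here's a sketch (the domains and disavow file contents are invented for illustration):

```python
def mark_previously_disavowed(audit_domains, disavow_entries):
    # Strip the "domain:" prefix the same way the Find and Replace step does,
    # then check each audited domain against the resulting set.
    disavowed = {e.replace("domain:", "", 1).strip() for e in disavow_entries}
    return {d: ("previously disavowed" if d in disavowed else "")
            for d in audit_domains}

audit = ["spamsite.com", "goodsite.com"]
disavow_file = ["domain:spamsite.com"]
marks = mark_previously_disavowed(audit, disavow_file)
```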

5. Make monthly or quarterly disavow work easier

That same formula described above is a great one to use if you are doing regular, repeated link audits. In this case, the second sheet of your spreadsheet would contain domains that you have previously audited, and column B of that sheet would say "previously audited" rather than "previously disavowed".

Your tips?

These are just a few of the formulas that you can use to help make link auditing work easier. But there are lots of other things you can do with Excel or Google Sheets to help speed up the process as well. If you have some tips to add, leave a comment below. Also, if you need clarification on any of these tips, I’m happy to answer questions in the comments section.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Reblogged 4 years ago from tracking.feedpress.it

Should I Use Relative or Absolute URLs? – Whiteboard Friday

Posted by RuthBurrReedy

It was once commonplace for developers to code relative URLs into a site. There are a number of reasons why that might not be the best idea for SEO, and in today’s Whiteboard Friday, Ruth Burr Reedy is here to tell you all about why.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

Let’s discuss some non-philosophical absolutes and relatives

Howdy, Moz fans. My name is Ruth Burr Reedy. You may recognize me from such projects as when I used to be the Head of SEO at Moz. I’m now the Senior SEO Manager at BigWing Interactive in Oklahoma City. Today we’re going to talk about relative versus absolute URLs and why they are important.

At any given time, your website can have several different configurations that might be causing duplicate content issues. You could have just a standard http://www.example.com. That’s a pretty standard format for a website.

But the main sources that we see of domain level duplicate content are when the non-www.example.com does not redirect to the www or vice-versa, and when the HTTPS versions of your URLs are not forced to resolve to HTTP versions or, again, vice-versa. What this can mean is if all of these scenarios are true, if all four of these URLs resolve without being forced to resolve to a canonical version, you can, in essence, have four versions of your website out on the Internet. This may or may not be a problem.

It’s not ideal for a couple of reasons. Number one, duplicate content is a problem because some people think that duplicate content is going to give you a penalty. Duplicate content is not going to get your website penalized in the same way that you might see a spammy link penalty from Penguin. There’s no actual penalty involved. You won’t be punished for having duplicate content.

The problem with duplicate content is that you’re basically relying on Google to figure out what the real version of your website is. Google is seeing the URL from all four versions of your website. They’re going to try to figure out which URL is the real URL and just rank that one. The problem with that is you’re basically leaving that decision up to Google when it’s something that you could take control of for yourself.

There are a couple of other reasons that we’ll go into a little bit later for why duplicate content can be a problem. But in short, duplicate content is no good.

However, just having these URLs not resolve to each other may or may not be a huge problem. When it really becomes a serious issue is when that problem is combined with injudicious use of relative URLs in internal links. So let’s talk a little bit about the difference between a relative URL and an absolute URL when it comes to internal linking.

With an absolute URL, you are putting the entire web address of the page that you are linking to in the link. You’re putting your full domain, everything in the link, including /page. That’s an absolute URL.

However, when coding a website, it’s a fairly common web development practice to instead code internal links with what’s called a relative URL. A relative URL is just /page. Basically what that does is it relies on your browser to understand, “Okay, this link is pointing to a page that’s on the same domain that we’re already on. I’m just going to assume that that is the case and go there.”
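You can see this browser behavior with Python's standard urljoin: the same relative link resolves to a different absolute URL depending on which version of the domain the page was loaded from (example.com is a placeholder, of course):

```python
from urllib.parse import urljoin

# The same relative link, "/page", resolves against whichever version of
# the domain the page was served from.
for base in (
    "http://example.com/",
    "http://www.example.com/",
    "https://example.com/",
    "https://www.example.com/",
):
    print(urljoin(base, "/page"))
```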

There are a couple of really good reasons to code relative URLs

1) It is much easier and faster to code.

When you are a web developer and you're building a site with thousands of pages, coding relative versus absolute URLs is a way to be more efficient. You'll see it happen a lot.

2) Staging environments

Another reason why you might see relative versus absolute URLs is some content management systems — and SharePoint is a great example of this — have a staging environment that’s on its own domain. Instead of being example.com, it will be examplestaging.com. The entire website will basically be replicated on that staging domain. Having relative versus absolute URLs means that the same website can exist on staging and on production, or the live accessible version of your website, without having to go back in and recode all of those URLs. Again, it’s more efficient for your web development team. Those are really perfectly valid reasons to do those things. So don’t yell at your web dev team if they’ve coded relative URLs, because from their perspective it is a better solution.

Relative URLs will also cause your page to load slightly faster. However, in my experience, the SEO benefits of having absolute versus relative URLs in your website far outweigh the teeny-tiny bit longer that it will take the page to load. It’s very negligible. If you have a really, really long page load time, there’s going to be a whole boatload of things that you can change that will make a bigger difference than coding your URLs as relative versus absolute.

Page load time, in my opinion, not a concern here. However, it is something that your web dev team may bring up with you when you try to address with them the fact that, from an SEO perspective, coding your website with relative versus absolute URLs, especially in the nav, is not a good solution.

There are even better reasons to use absolute URLs

1) Scrapers

If you have all of your internal links as relative URLs, it would be very, very, very easy for a scraper to simply scrape your whole website and put it up on a new domain, and the whole website would just work. That sucks for you, and it’s great for that scraper. But unless you are out there doing public services for scrapers, for some reason, that’s probably not something that you want happening with your beautiful, hardworking, handcrafted website. That’s one reason. There is a scraper risk.

2) Preventing duplicate content issues

But the other reason why it’s very important to have absolute versus relative URLs is that it really mitigates the duplicate content risk that can be presented when you don’t have all of these versions of your website resolving to one version. Google could potentially enter your site on any one of these four pages, which they’re the same page to you. They’re four different pages to Google. They’re the same domain to you. They are four different domains to Google.

But they could enter your site, and if all of your URLs are relative, they can then crawl and index your entire domain using whatever format these are. Whereas if you have absolute links coded, even if Google enters your site on a version you didn’t intend, every internal link they crawl points explicitly at one version, so Google is not going to assume that those pages also live at the other versions. That really cuts down on different versions of each page of your website. If you have relative URLs throughout and you haven’t fixed this problem, you basically have four different websites.

Again, it’s not always a huge issue. Duplicate content, it’s not ideal. However, Google has gotten pretty good at figuring out what the real version of your website is.

You do want to think about internal linking, when you’re thinking about this. If you have basically four different versions of any URL that anybody could just copy and paste when they want to link to you or when they want to share something that you’ve built, you’re diluting your internal links by four, which is not great. You basically would have to build four times as many links in order to get the same authority. So that’s one reason.

3) Crawl Budget

The other reason why it’s pretty important not to do this is because of crawl budget. I’m going to point it out like this instead.

When we talk about crawl budget, basically what that is, is every time Google crawls your website, there is a finite depth that they will crawl. There’s a finite number of URLs that they will crawl and then they decide, “Okay, I’m done.” That’s based on a few different things. Your site authority is one of them. Your actual PageRank, not toolbar PageRank, but how good Google actually thinks your website is, is a big part of that. But also how complex your site is, how often it’s updated, things like that are also going to contribute to how often and how deep Google is going to crawl your site.

It’s important to remember when we think about crawl budget that, for Google, crawl budget costs actual dollars. One of Google’s biggest expenditures as a company is the money and the bandwidth it takes to crawl and index the Web. All of that energy that’s going into crawling and indexing the Web, that lives on servers. That bandwidth comes from servers, and that means that using bandwidth costs Google actual, real dollars.

So Google is incentivized to crawl as efficiently as possible, because when they crawl inefficiently, it costs them money. If your site is not efficient to crawl, Google is going to save itself some money by crawling it less frequently and crawling fewer pages per crawl. That can mean that if you have a site that’s updated frequently, your site may not be updating in the index as frequently as you’re updating it. It may also mean that Google, while it’s crawling and indexing, may be crawling and indexing a version of your website that isn’t the version that you really want it to crawl and index.

So having four different versions of your website, all of which are completely crawlable to the last page, because you’ve got relative URLs and you haven’t fixed this duplicate content problem, means that Google has to spend four times as much money in order to really crawl and understand your website. Over time they’re going to do that less and less frequently, especially if you don’t have a really high authority website. If you’re a small website, if you’re just starting out, if you’ve only got a medium number of inbound links, over time you’re going to see your crawl rate and frequency impacted, and that’s bad. We don’t want that. We want Google to come back all the time, see all our pages. They’re beautiful. Put them up in the index. Rank them well. That’s what we want. So that’s what we should do.

There are a couple of ways to fix your relative versus absolute URLs problem

1) Fix what is happening on the server side of your website

You have to make sure that you are forcing all of these different versions of your domain to resolve to one version of your domain. For me, I’m pretty agnostic as to which version you pick. You should probably already have a pretty good idea of which version of your website is the real version, whether that’s www, non-www, HTTPS, or HTTP. From my view, what’s most important is that all four of these versions resolve to one version.

From an SEO standpoint, there is evidence to suggest, and Google has certainly said, that HTTPS is a little bit better than HTTP. From a URL length perspective, I like to not have the www. in there because it doesn’t really do anything. It just makes your URLs four characters longer. If you don’t know which one to pick, I would pick this one: HTTPS, no W’s. But whichever one you pick, what’s really most important is that all of them resolve to one version. You can do that on the server side, and that’s usually pretty easy for your dev team to fix once you tell them that it needs to happen.
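The server-side rule being described can be sketched as a small normalizer: whichever of the four versions a request comes in on, it maps to one canonical form (here HTTPS without www., matching the preference above). In practice you would implement this as a 301 redirect in your server config, not application code:

```python
from urllib.parse import urlsplit, urlunsplit

def canonical(url):
    # Force HTTPS and strip a leading "www." so all four versions of a
    # domain collapse to a single canonical URL.
    scheme, netloc, path, query, fragment = urlsplit(url)
    if netloc.startswith("www."):
        netloc = netloc[4:]
    return urlunsplit(("https", netloc, path, query, fragment))
```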

2) Fix your internal links

Great. So you fixed it on your server side. Now you need to fix your internal links, and you need to recode them from relative to absolute. This is something that your dev team is not going to want to do because it is time consuming and, from a web dev perspective, not that important. However, you should use resources like this Whiteboard Friday to explain to them that, from an SEO perspective, both from the scraper risk and from a duplicate content standpoint, having those absolute URLs is a high priority and something that should get done.

You’ll need to fix those, especially in your navigational elements. But once you’ve got your nav fixed, also pull out your database or run a Screaming Frog crawl or however you want to discover internal links that aren’t part of your nav, and make sure you’re updating those to be absolute as well.

Then you’ll do some education with everybody who touches your website saying, “Hey, when you link internally, make sure you’re using the absolute URL and make sure it’s in our preferred format,” because that’s really going to give you the most bang for your buck per internal link. So do some education. Fix your internal links.

Sometimes your dev team is going to say, “No, we can’t do that. We’re not going to recode the whole nav. It’s not a good use of our time,” and sometimes they are right. The dev team has more important things to do. That’s okay.

3) Canonicalize it!

If you can’t get your internal links fixed or if they’re not going to get fixed anytime in the near future, a stopgap or a Band-Aid that you can kind of put on this problem is to canonicalize all of your pages. As you’re changing your server to force all of these different versions of your domain to resolve to one, at the same time you should be implementing the canonical tag on all of the pages of your website to self-canonicalize. On every page, you have a canonical tag saying, “This page right here that they were already on is the canonical version of this page.” Or if there’s another page that’s the canonical version, then obviously you point to that instead.

But having each page self-canonicalize will mitigate both the risk of duplicate content internally and some of the risk posed by scrapers, because when they scrape, if they are scraping your website and slapping it up somewhere else, those canonical tags will often stay in place, and that lets Google know this is not the real version of the website.

In conclusion, relative links, not as good. Absolute links, those are the way to go. Make sure that you’re fixing these very common domain level duplicate content problems. If your dev team tries to tell you that they don’t want to do this, just tell them I sent you. Thanks guys.

Video transcription by Speechpad.com


How to Combat 5 of the SEO World’s Most Infuriating Problems – Whiteboard Friday

Posted by randfish

These days, most of us have learned that spammy techniques aren’t the way to go, and we have a solid sense for the things we should be doing to rank higher, and ahead of our often spammier competitors. Sometimes, maddeningly, it just doesn’t work. In today’s Whiteboard Friday, Rand talks about five things that can infuriate SEOs with the best of intentions, why those problems exist, and what we can do about them.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

What SEO problems make you angry?

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re chatting about some of the most infuriating things in the SEO world, specifically five problems that I think plague a lot of folks and some of the ways that we can combat and address those.

I’m going to start with one of the things that really infuriates a lot of new folks to the field, especially folks who are building new and emerging sites and are doing SEO on them. You have all of these best practices lists. You might look at a web developer’s cheat sheet or sort of a guide to on-page and on-site SEO. You go, “Hey, I’m doing it. I’ve got my clean URLs, my good, unique content, my solid keyword targeting, schema markup, useful internal links, my XML sitemap, and my fast load speed. I’m mobile friendly, and I don’t have manipulative links.”

Great. “Where are my results? What benefit am I getting from doing all these things, because I don’t see one?” I took a site that was not particularly SEO friendly, maybe it’s a new site, one I just launched or an emerging site, one that’s sort of slowly growing but not yet a power player. I do all this right stuff, and I don’t get SEO results.

This makes a lot of people stop investing in SEO, stop believing in SEO, and stop wanting to do it. I can understand where you’re coming from. The challenge is not that you’ve done something wrong. It’s that this stuff, all of these things that you do right, especially things that you do right on your own site or from a best practices perspective, they don’t increase rankings. They don’t. That’s not what they’re designed to do.

1) Following best practices often does nothing for new and emerging sites

This stuff, all of these best practices are designed to protect you from potential problems. They’re designed to make sure that your site is properly optimized so that you can perform to the highest degree that you are able. But this is not actually rank boosting stuff unfortunately. That is very frustrating for many folks. So following a best practices list, the idea is not, “Hey, I’m going to grow my rankings by doing this.”

On the flip side, many folks do these things on larger, more well-established sites, sites that have a lot of ranking signals already in place. They’re bigger brands, they have lots of links to them, and they have lots of users and usage engagement signals. You fix this stuff. You fix stuff that’s already broken, and boom, rankings pop up. Things are going well, and more of your pages are indexed. You’re getting more search traffic, and it feels great. This is a challenge, on our part, of understanding what this stuff does, not a challenge on the search engine’s part of not ranking us properly for having done all of these right things.

2) My competition seems to be ranking on the back of spammy or manipulative links

What’s going on? I thought Google had introduced all these algorithms to kind of shut this stuff down. This seems very frustrating. How are they pulling this off? I look at their link profile, and I see a bunch of the directories, Web 2.0 sites — I love that the spam world decided that that’s Web 2.0 sites — article sites, private blog networks, and do follow blogs.

You look at this stuff and you go, “What is this junk? It’s terrible. Why isn’t Google penalizing them for this?” The answer, the right way to think about this and to come at this is: Are these really the reason that they rank? I think we need to ask ourselves that question.

One thing that we don’t know, that we can never know, is: Have these links been disavowed by our competitor here?

I’ve got my HulksIncredibleStore.com and their evil competitor Hulk-tastrophe.com. Hulk-tastrophe has got all of these terrible links, but maybe they disavowed those links and you would have no idea. Maybe they didn’t build those links. Perhaps those links came in from some other place. They are not responsible. Google is not treating them as responsible for it. They’re not actually what’s helping them.

If they are helping, and it’s possible they are, there are still instances where we’ve seen spam propping up sites. No doubt about it.

I think the next logical question is: Are you willing to lose your site or brand? We almost never see sites like this anymore, sites that are ranking on the back of these things with generally less legitimate links, continuing to rank for two or three or four years. You can see it for a few months, maybe even a year, but this stuff is getting hit hard and getting hit frequently. So unless you’re willing to lose your site, pursuing their links is probably not a strategy.

Then ask: what other signals that you might not be considering, potentially links, but also non-linking signals, could be helping them rank? I think a lot of us get blinded in the SEO world by link signals, and we forget to look at things like: Do they have a phenomenal user experience? Are they growing their brand? Are they doing offline kinds of things that are influencing online? Are they gaining engagement from other channels that’s then influencing their SEO? Do they have things coming in that I can’t see? If you don’t ask those questions, you can’t really learn from your competitors, and you just feel the frustration.

3) I have no visibility or understanding of why my rankings go up vs down

On my HulksIncredibleStore.com, I’ve got my infinite stretch shorts, which I don’t know why he never wears — he should really buy those — my soothing herbal tea, and my anger management books. I look at my rankings and they kind of jump up all the time, jump all over the place all the time. Actually, this is pretty normal. I think we’ve done some analyses here, and the average page one search results shift is 1.5 or 2 position changes daily. That’s sort of the MozCast dataset, if I’m recalling correctly. That means that, over the course of a week, it’s not uncommon or unnatural for you to be bouncing around four, five, or six positions up, down, and those kind of things.

I think we should understand what can be behind these things. That’s a very simple list. You made changes, Google made changes, your competitors made changes, or searcher behavior has changed in terms of volume, in terms of what they were engaging with, what they’re clicking on, what their intent behind searches are. Maybe there was just a new movie that came out and in one of the scenes Hulk talks about soothing herbal tea. So now people are searching for very different things than they were before. They want to see the scene. They’re looking for the YouTube video clip and those kind of things. Suddenly Hulk’s soothing herbal tea is no longer directing as well to your site.

So changes like these things can happen. We can’t understand all of them. I think what’s up to us to determine is the degree of analysis and action that’s actually going to provide a return on investment. Looking at these day over day or week over week and throwing up our hands and getting frustrated probably provides very little return on investment. Looking over the long term and saying, “Hey, over the last 6 months, we can observe 26 weeks of ranking change data, and we can see that in aggregate we are now ranking higher and for more keywords than we were previously, and so we’re going to continue pursuing this strategy. This is the set of keywords that we’ve fallen most on, and here are the factors that we’ve identified that are consistent across that group.” I think looking at rankings in aggregate can give us some real positive ROI. Looking at one or two, one week or the next week probably very little ROI.

4) I cannot influence or affect change in my organization because I cannot accurately quantify, predict, or control SEO

That’s true, especially with things like keyword not provided and certainly with the inaccuracy of data that’s provided to us through Google’s Keyword Planner inside of AdWords, for example, and the fact that no one can really control SEO, not fully anyway.

You get up in front of your team, your board, your manager, your client and you say, “Hey, if we don’t do these things, traffic will suffer,” and they go, “Well, you can’t be sure about that, and you can’t perfectly predict it. Last time you told us something, something else happened. So because the data is imperfect, we’d rather spend money on channels that we can perfectly predict, that we can very effectively quantify, and that we can very effectively control.” That is understandable. I think that businesses have a lot of risk aversion naturally, and so wanting to spend time and energy and effort in areas that you can control feels a lot safer.

Some ways to get around this are, first off, know your audience. If you know who you’re talking to in the room, you can often determine the things that will move the needle for them. For example, I find that many managers, many boards, many executives are much more influenced by competitive pressures than they are by, “We won’t do as well as we did before, or we’re losing out on this potential opportunity.” Saying that is less powerful than saying, “This competitor, who I know we care about and we track ourselves against, is capturing this traffic and here’s how they’re doing it.”

Show multiple scenarios. Many of the SEO presentations that I see and have seen and still see from consultants and from in-house folks come with kind of a single, “Hey, here’s what we predict will happen if we do this or what we predict will happen if we don’t do this.” You’ve got to show multiple scenarios, especially when you know you have error bars because you can’t accurately quantify and predict. You need to show ranges.

So instead of this, I want to see: What happens if we do it a little bit? What happens if we really overinvest? What happens if Google makes a much bigger change on this particular factor than we expect or our competitors do a much bigger investment than we expect? How might those change the numbers?

Then I really do like bringing case studies, especially if you’re a consultant, but even in-house there are so many case studies in SEO on the Web today, you can almost always find someone who’s analogous or nearly analogous and show some of their data, some of the results that they’ve seen. Places like SEMrush, a tool that offers competitive intelligence around rankings, can be great for that. You can show, hey, this media site in our sector made these changes. Look at the delta of keywords they were ranking for over the next six months. Correlation is not causation, but that can be a powerful influencer showing those kind of things.

Then last, but not least, any time you’re going to get up like this and present to a group around these topics, if you very possibly can, try to talk one-on-one with the participants before the meeting actually happens. I have found it almost universally the case that when you get into a group setting, if you haven’t had the discussions beforehand about like, “What are your concerns? What do you think is not valid about this data? Hey, I want to run this by you and get your thoughts before we go to the meeting.” If you don’t do that ahead of time, people can gang up and pile on. One person says, “Hey, I don’t think this is right,” and everybody in the room kind of looks around and goes, “Yeah, I also don’t think that’s right.” Then it just turns into warfare and conflict that you don’t want or need. If you address those things beforehand, then you can include the data, the presentations, and the “I don’t know the answer to this and I know this is important to so and so” in that presentation or in that discussion. It can be hugely helpful. Big difference between winning and losing with that.

5) Google is biasing to big brands. It feels hopeless to compete against them

A lot of people are feeling this hopelessness in SEO about competing against big brands. I get that pain. In fact, I’ve felt that very strongly for a long time in the SEO world, and I think the trend has only increased. This comes from all sorts of stuff. Brands now have the little dropdown next to their search result listing. There are these brand and entity connections. As Google is using answers and knowledge graph more and more, it’s feeling like those entities are having a bigger influence on where things rank and where they’re visible and where they’re pulling from.

User and usage behavior signals on the rise means that big brands, who have more of those signals, tend to perform better. Brands in the knowledge graph, brands growing links without any effort, they’re just growing links because they’re brands and people point to them naturally. Well, that is all really tough and can be very frustrating.

I think you have a few choices on the table. First off, you can choose to compete with brands where they can’t or won’t. So this is areas like: we’re going after these keywords that we know these big brands are not chasing. We’re going after social channels or people on social media that we know big brands aren’t. We’re going after user-generated content because they have all these corporate requirements and they won’t invest in that stuff. We’re going after content that they refuse to pursue for one reason or another. That can be very effective.

Second, you’d better be building, growing, and leveraging your competitive advantage. Whenever you build an organization, you’ve got to say, “Hey, here’s who is out there. This is why we are uniquely better or a uniquely better choice for this set of customers than these other ones.” If you can leverage that, you can generally find opportunities to compete and even to win against big brands. But those things have to become obvious, they have to become well-known, and you need to essentially build some of your brand around those advantages, or they’re not going to give you help in search. That includes media, that includes content, that includes any sort of press and PR you’re doing. That includes how you do your own messaging, all of these things.

Third, you can choose to serve a market or a customer that they don’t or won’t. That can be a powerful way to go about search, because usually search is bifurcated by the customer type. There will be slightly different forms of search queries that are entered by different kinds of customers, and you can pursue one of those that isn’t pursued by the competition.

Last, but not least, I think for everyone in SEO we all realize we’re going to have to become brands ourselves. That means building the signals that are typically associated with brands — authority, recognition from an industry, recognition from a customer set, awareness of our brand even before a search has happened. I talked about this in a previous Whiteboard Friday, but I think because of these things, SEO is becoming a channel that you benefit from as you grow your brand rather than the channel you use to initially build your brand.

All right, everyone. Hope these have been helpful in combating some of these infuriating, frustrating problems and that we’ll see some great comments from you guys. I hope to participate in those as well, and we’ll catch you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Reblogged 4 years ago from tracking.feedpress.it

Why the Links You’ve Built Aren’t Helping Your Page Rank Higher – Whiteboard Friday

Posted by randfish

Link building can be incredibly effective, but sometimes a lot of effort can go into earning links with absolutely no improvement in rankings. Why? In today’s Whiteboard Friday, Rand shows us four things we should look at in these cases, helping us hone our link building skills and make the process more effective.


Video transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re chatting about why link building sometimes fails.

So I’ve got an example here. I’m going to do a search for artificial sweeteners. Let’s say I’m working for these guys, ScienceMag.org. Well, this is actually in position 10. I put it in position 3 here, but I see that I’m in position 10. I think to myself, “Man, if I could get higher up on this page, that would be excellent. I’ve already produced the content. It’s on my domain. Google seems to have indexed it fine. It’s performing well enough to appear on page one, granted at the bottom of page one, for this competitive query. Now I want to move my rankings up.”

So a lot of SEOs, naturally and historically, for a long time have thought, “I need to build more links to that page. If I can get more links pointing to this page, I can move up the rankings.” Granted, there are some other ways to do that too, and we’ve discussed those in previous Whiteboard Fridays. But links are one of the big ones that people use.

I think one of the challenges that we encounter is sometimes we invest that effort. We go through the process of that outreach campaign, talking to bloggers and other news sites and looking at where our link sources are coming from and trying to get some more of those. It just doesn’t seem to do anything. The link building appears to fail. It’s like, man, I’ve got all these nice links and no new results. I didn’t move up at all. I am basically staying where I am, or maybe I’m even falling down. Why is that? Why does link building sometimes work so well and so clearly and obviously, and sometimes it seems to do nothing at all?

What are some possible reasons link acquisition efforts may not be effective?

Oftentimes if you get a fresh set of eyes on it, an outside SEO perspective, they can do this audit, and they’ll walk through a lot of this stuff and help you realize, “Oh yeah, that’s probably why.” These are things that you might need to change strategically or tactically as you approach this problem. But you can do this yourself as well by looking at why a link building campaign, why a link building effort, for a particular page, might not be working.

1) Not the right links

First one: it’s not the right links. By “not the right links,” I mean a wide range of things, even broader than what I’ve listed here. But a lot of times that could mean low domain diversity. Yeah, you’re getting new links, but they’re coming from all the same places that you always get links from. Google, potentially, maybe views that as not particularly worthy of moving you up the rankings, especially around competitive queries.

It might be trustworthiness of source. So maybe they’re saying, “Yeah, you got some links, but they’re not from particularly trustworthy places.” Tied into that, maybe they think, or are sure, that the links aren’t editorial. Maybe they think they’re paid, or promotional in some way, rather than being truly editorially given by an independent resource.

They might not come from a site or from a page that has the authority that’s necessary to move you up. Again, particularly for competitive queries, sometimes low-value links are just that. They’re not going to move the needle, especially not like they used to three, four, five or six years ago, where really just a large quantity of links, even from diverse domains, even if they were crappy links on crappy pages on relatively crappy or unknown websites would move the needle, not so much anymore. Google is seeing a lot more about these things.

Where else does the source link to? Is that source pointing to other stuff that is potentially looking manipulative to Google and so they discounted the outgoing links from that particular domain or those sites or those pages on those sites?

They might look at the relevance and say, “Hey, you know what? Yeah, you got linked to by some technology press articles. That doesn’t really have anything to do with artificial sweeteners, this topic, this realm, or this region.” So you’re not getting the same result. Now, we’ve shown that off-topic links can oftentimes move the rankings, but in particular areas (and health, in fact, may be one of them) Google might be more topically sensitive to where the links are coming from than elsewhere.

Location on page. So I’ve got a page here and maybe all of my links are coming from a bunch of different domains, but it’s always in the right sidebar and it’s always in this little feed section. So Google’s saying, “Hey, that’s not really an editorial endorsement. That’s just them showing all the links that come through your particular blog feed or a subscription that they’ve got to your content or whatever it is promotionally pushing out. So we’re not going to count it that way.” Same thing a lot of times with footer links. Doesn’t work quite as well. If you’re being honest with yourself, you really want those in content links. Generally speaking, those tend to perform the best.

Or uniqueness. So they might look and they might say, “Yeah, you’ve got a ton of links from people who are republishing your same article and then just linking back to it. That doesn’t feel to us like an editorial endorsement, and so we’re just going to treat those copies as if those links didn’t exist at all.” But the links themselves may not actually be the problem. I think this can be a really important topic if you’re doing link acquisition auditing, because sometimes people get too focused on, “Oh, it must be something about the links that we’re getting.” That’s not always the case actually.

2) Not the right content

Sometimes it’s not the right content. So that could mean things like it’s temporally focused versus evergreen. So for different kinds of queries, Google interprets the intent of the searchers to be different. So it could be that when they see a search like “artificial sweeteners,” they say, “Yeah, it’s great that you wrote this piece about this recent research that came out. But you know what, we’re actually thinking that searchers are going to want in the top few results something that’s evergreen, that contains all the broad information that a searcher might need around this particular topic.”

That speaks to the next one: it might not answer the searchers’ questions. You might think, “Well, I’m answering a great question here.” The problem is, yeah, you’re answering one. Searchers may have many questions that they’re asking around a topic, and Google is looking for something comprehensive, something that doesn’t mean a searcher clicks your result and then says, “Well, that was interesting, but I need more from a different result.” They’re looking for the one true result, the one true answer that tells them, “Hey, this person is very happy with these types of results.”

It could be poor user experience causing people to bounce back. That could be speed things, UI things, layout things, browser support things, multi-device support things. It might not use language formatting or text that people or engines can interpret as on the topic. Perhaps this is way over people’s heads, far too scientifically focused, most searchers can’t understand the language, or the other way around. It’s a highly scientific search query and a very advanced search query and your language is way dumbed down. Google isn’t interpreting that as on-topic. All the Hummingbird and topic modeling kind of things that they have say this isn’t for them.

Or it might not match expectations of searchers. This is distinct and different from searchers’ questions. So searchers’ questions is, “I want to know how artificial sweeteners might affect me.” Expectations might be, “I expect to learn this kind of information. I expect to find out these things.” For example, if you go down a rabbit hole of artificial sweeteners will make your skin shiny, they’re like, “Well, that doesn’t meet with my expectation. I don’t think that’s right.” Even if you have some data around that, that’s not what they were expecting to find. They might bounce back. Engines might not interpret you as on-topic, etc. So lots of content kinds of things.

3) Not the right domain

Then there are also domain issues. You might not have the right domain. Your domain might not be associated with the topic or content that Google and searchers are expecting. So they see Mayo Clinic, they see MedicineNet, and they go, “ScienceMag? Do they do health information? I don’t think they do. I’m not sure if that’s an appropriate one.” It might be perceived, even if you aren’t, as spammy or manipulative by Google, more probably than by searchers. Or searchers just won’t click your brand for that content. This is a very frustrating one, because we have seen a ton of times when search behavior is biased by the brand itself, by what’s in this green text here, the domain name or the brand name that Google might show there. That’s very frustrating, but it means that you need to build brand affinity between that topic, that keyword, and what’s in searchers’ heads.

4) Accessibility or technical issues

Then finally, there could be some accessibility or technical issues. Usually when that’s the case, you will notice pretty easily because the page will have an error. It won’t show the content properly. The cache will be an issue. That’s a rare one, but you might want to check for it as well.

But hopefully, using this kind of an audit system, you can figure out why a link building campaign, a link building effort isn’t working to move the needle on your rankings.

With that, we will see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com


​The 3 Most Common SEO Problems on Listings Sites

Posted by Dom-Woodman

Listings sites have a very specific set of search problems that you don’t run into everywhere else. By day I’m one of Distilled’s analysts, but by night I run a job listings site, teflSearch. So, for my first Moz Blog post I thought I’d cover the three search problems with listings sites that I spent far too long agonising about.

Quick clarification time: What is a listings site (i.e. will this post be useful for you)?

The classic listings site is Craigslist, but plenty of other sites act like listing sites:

  • Job sites like Monster
  • E-commerce sites like Amazon
  • Matching sites like Spareroom

1. Generating quality landing pages

The landing pages on listings sites are incredibly important. These pages are usually the primary drivers of converting traffic, and they’re usually generated automatically (or are occasionally custom category pages).

For example, if I search “Jobs in Manchester”, you can see nearly every result is an automatically generated landing page or category page.

There are three common ways to generate these pages (occasionally a combination of more than one is used):

  • Faceted pages: These are generated by facets—groups of preset filters that let you filter the current search results. They usually sit on the left-hand side of the page.
  • Category pages: These pages are listings which have already had a filter applied and can’t be changed. They’re usually custom pages.
  • Free-text search pages: These pages are generated by a free-text search box.

Those definitions are still a bit general; let’s clear them up with some examples:

Amazon uses a combination of categories and facets. If you click on browse by department you can see all the category pages. Then on each category page you can see a faceted search. Amazon is so large that it needs both.

Indeed generates its landing pages through free text search; for example, if we search for “IT jobs in manchester”, it will generate a matching landing page.

teflSearch generates landing pages using just facets. The jobs in China landing page is simply a facet of the main search page.

Each method has its own search problems when used for generating landing pages, so let’s tackle them one by one.


Aside

Facets and free text search will typically generate pages with parameters, e.g. a search for “dogs” would produce:

www.mysite.com?search=dogs

But to make the URLs user-friendly, sites will often alter them to display as folders:

www.mysite.com/results/dogs/

These are still just ordinary free text search and facet pages; the URLs are simply user-friendly. (They’re a lot easier to work with in robots.txt too!)
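As a tiny sketch of that rewrite (the /results/ scheme here is invented), the parameterised URL can be mapped to the folder-style path:

```python
from urllib.parse import urlparse, parse_qs

def friendly_search_url(url):
    """Rewrite a ?search=term URL into a folder-style path.

    The /results/<term>/ layout is illustrative; the point is only that
    both URL shapes represent exactly the same free-text search.
    """
    query = parse_qs(urlparse(url).query)
    term = query["search"][0]
    return f"/results/{term}/"
```

So a request for www.mysite.com?search=dogs resolves to the same search as /results/dogs/.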


Free search (& category) problems

If you’ve decided the base of your search will be a free text search, then you’ll have two major goals:

  • Goal 1: Helping search engines find your landing pages
  • Goal 2: Giving them link equity.

Solution

Search engines won’t use search boxes, so the solution to both problems is to provide links to the valuable landing pages so search engines can find them.

There are plenty of ways to do this, but two of the most common are:

  • Category links alongside a search

Photobucket uses a free text search to generate pages, but if we look at an example search for photos of dogs, we can see the categories which define the landing pages along the right-hand side. (This is also an example of user-friendly search URLs!)

  • Putting the main landing pages in a top-level menu

    Indeed also uses free text to generate landing pages, and they have a browse jobs section which contains the URL structure to allow search engines to find all the valuable landing pages.

Breadcrumbs are also often used in addition to the two methods above; in both of those examples, you’ll find breadcrumbs that reinforce the hierarchy.

Category (& facet) problems

Categories, because they tend to be custom pages, don’t actually have many search disadvantages. Instead it’s the other attributes that make them more or less desirable. You can create them for the purposes you want and so you typically won’t have too many problems.

However, if you also use a faceted search in each category (like Amazon) to generate additional landing pages, then you’ll run into all the problems described in the next section.

At first, facets seem great: an easy way to generate multiple strong, relevant landing pages without doing much at all. The problems appear because people don’t put limits on facets.

Let’s take the job page on teflSearch. We can see it has 18 facets, each with many options. Some of these options will generate useful landing pages:

The China facet in countries will generate “Jobs in China”; that’s a useful landing page.

On the other hand, the “Conditional Bonus” facet will generate “Jobs with a conditional bonus,” and that’s not so great.

We can also see that the options within a single facet aren’t always useful. As of writing, I have a single job available in Serbia. That’s not a useful search result, and the poor user engagement combined with the tiny amount of content will be a strong signal to Google that it’s thin content. Depending on the scale of your site it’s very easy to generate a mass of poor-quality landing pages.

Facets generate other problems too. The primary one is that they can create a huge amount of duplicate content and pages for search engines to get lost in. This is caused by two things: the first is the sheer number of possibilities they generate, and the second is that selecting facets in different orders creates identical pages with different URLs.

We end up with four goals for our facet-generated landing pages:

  • Goal 1: Make sure our searchable landing pages are actually worth landing on, and that we’re not handing a mass of low-value pages to the search engines.
  • Goal 2: Make sure we don’t generate multiple copies of our automatically generated landing pages.
  • Goal 3: Make sure search engines don’t get caught in the metaphorical plastic six-pack rings of our facets.
  • Goal 4: Make sure our landing pages have strong internal linking.

The first goal needs to be set internally; you’re always going to be the best judge of the number of results that need to be present on a page for it to be useful to a user. I’d argue you can rarely go below three, but it depends both on your business and on how much content fluctuates on your site, as the useful landing pages might also change over time.

We can solve the next three problems as a group. There are several possible solutions depending on what skills and resources you have access to; here are two of them:

Category/facet solution 1: Blocking the majority of facets and providing external links
  • Easiest method
  • Good if your valuable category pages rarely change and you don’t have too many of them.
  • Can be problematic if your valuable facet pages change a lot

Nofollow all your facet links, and noindex and block (via robots.txt) any category pages which aren’t valuable or which sit deeper than x facet/folder levels into your search.

You set x by looking at where your useful facet pages with search volume exist. So, for example, if you have three facets for televisions: manufacturer, size, and resolution, and even combinations of all three have multiple results and search volume, then you could index everything up to three levels.

On the other hand, if people are searching for three levels (e.g. “Samsung 42″ Full HD TV”) but you only have one or two results for three-level facets, then you’d be better off indexing two levels and letting the product pages themselves pick up long-tail traffic for the third level.

If you have valuable facet pages that exist deeper than one facet or folder into your search, then this creates some duplicate content problems, dealt with in the aside “Indexing more than one level of facets” below.
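As a minimal sketch of the robots.txt half of this setup, assuming a site that indexes facet pages up to two folder levels under a hypothetical /tvs/ section (the paths are made up):

```
# Hypothetical robots.txt: facet pages live at /tvs/<facet>/<facet>/...
User-agent: *
# Block anything three or more facet levels deep
Disallow: /tvs/*/*/*/
# Keep raw parameterised search results out entirely
Disallow: /*?search=
```

Note that robots.txt only controls crawling; the nofollow on facet links and any noindex still have to be applied in the page markup itself.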

The immediate problem with this setup, however, is that in one stroke we’ve removed most of the internal links to our category pages; by no-following all the facet links, search engines won’t be able to find your valuable category pages.

In order to re-create the linking, you can add a top-level drop-down menu to your site containing the most valuable category pages, add category links elsewhere on the page, or create a separate part of the site with links to the valuable category pages.

You can see the top-level drop-down menu on teflSearch (it’s the search jobs menu); the other two examples are demonstrated by Photobucket and Indeed, respectively, in the previous section.

The big advantage of this method is how quick it is to implement: it doesn’t require any fiddly internal logic, and adding an extra menu option is usually minimal effort.

Category/facet solution 2: Creating internal logic to work with the facets

  • Requires new internal logic
  • Works for large numbers of category pages with value that can change rapidly

There are four parts to the second solution:

  1. Select valuable facet categories and allow those links to be followed. No-follow the rest.
  2. No-index all pages that return a number of items below the threshold for a useful landing page.
  3. No-follow all facets on pages with a search depth greater than x.
  4. Block all facet pages deeper than x levels in robots.txt.

As with the last solution, x is set by looking at where your useful facet pages with search volume exist (full explanation in the first solution). If you’re indexing more than one level, you’ll need to check out the aside below to see how to deal with the duplicate content this generates.
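A minimal sketch of that internal logic (the facet names, result threshold, and depth limit are all illustrative; rule 4, the robots.txt blocking, lives outside application code):

```python
# Illustrative values: which facets earn followed links, how many results
# make a page worth indexing, and how deep indexable facets may go.
VALUABLE_FACETS = {"country", "manufacturer", "size"}
MIN_RESULTS = 3
MAX_DEPTH = 2

def facet_directives(facets_applied, result_count):
    """Decide directives for a facet-generated landing page.

    facets_applied: names of the facets already applied to reach this page.
    result_count: number of listings the page returns.
    Returns whether to index the page (rule 2) and whether its facet
    links should be followed (rules 1 and 3).
    """
    depth = len(facets_applied)
    # Rule 2: thin pages are not useful landing pages.
    index = result_count >= MIN_RESULTS
    # Rules 1 and 3: only follow facet links on pages within the useful
    # depth whose facets are all judged valuable.
    follow = depth < MAX_DEPTH and all(f in VALUABLE_FACETS for f in facets_applied)
    return {"index": index, "follow_facet_links": follow}
```

With these values, a well-stocked “Jobs in China” page gets indexed with followed facet links, while a one-result “conditional bonus” page gets neither.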


Aside: Indexing more than one level of facets

If you want more than one level of facets to be indexable, then this will create certain problems.

Suppose you have a facet for size:

  • Televisions: Size: 46″, 44″, 42″

And want to add a brand facet:

  • Televisions: Brand: Samsung, Panasonic, Sony

This will create duplicate content because the search engines will be able to follow your facets in both orders, generating:

  • Television – 46″ – Samsung
  • Television – Samsung – 46″

You’ll have to either rel canonical your duplicate pages with an additional rule, or set up your facets so they create a single unique URL.
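One simple way to get that single unique URL is to sort the selected facets into a fixed order before building the path; a sketch (the URL scheme is invented):

```python
def canonical_facet_url(base, facets):
    """Build one canonical URL regardless of the order facets were selected.

    facets: dict of facet name -> chosen value,
            e.g. {"brand": "samsung", "size": "46"}
    """
    # Sorting by facet name means brand-then-size and size-then-brand
    # selections always collapse to the same path.
    parts = [f"{name}-{value}" for name, value in sorted(facets.items())]
    return base.rstrip("/") + "/" + "/".join(parts) + "/"
```

Whichever order the user clicked the facets in, both selections above resolve to the same page, so there is nothing left to canonicalise.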

You also need to be aware that each followable facet you add will multiply with every other followable facet, and it’s very easy to generate a mass of pages for search engines to get stuck in. Depending on your setup, you might need to block more paths in robots.txt or set up more logic to prevent them being followed.

Letting search engines index more than one level of facets adds a lot of possible problems; make sure you’re keeping track of them.


2. User-generated content cannibalization

This is a common problem for listings sites (assuming they allow user-generated content). If you’re reading this as an e-commerce site that only lists its own products, you can skip this one.

As we covered in the first section, category pages on listings sites are usually the landing pages aiming for the valuable search terms, but as your users start generating pages, they can often create titles and content that cannibalise your landing pages.

Suppose you’re a job site with a category page for PHP Jobs in Greater Manchester. If a recruiter then creates a job advert for PHP Jobs in Greater Manchester for the 4 positions they currently have, you’ve got a duplicate content problem.

This is less of a problem when your site is large and your categories mature; it will be obvious to any search engine which are your high-value category pages. But at the start, when you’re lacking authority and individual listings might contain more relevant content than your own search pages, this can be a problem.

Solution 1: Create structured titles

Set the <title> differently from the on-page title. Depending on the variables available to you, you can set the title tag programmatically, without changing the page title, using other information given by the user.

For example, on our imaginary job site, suppose the recruiter also provided the following information in other fields:

  • The no. of positions: 4
  • The primary area: PHP Developer
  • The name of the recruiting company: ABC Recruitment
  • Location: Manchester

We could set the <title> pattern to be: *No of positions* *The primary area* with *recruiter name* in *Location* which would give us:

4 PHP Developers with ABC Recruitment in Manchester
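That pattern is straightforward to implement; a sketch (with a naive add-an-“s” plural, which a real site would want to improve):

```python
def listing_title(positions, area, recruiter, location):
    """Build a structured <title> from recruiter-supplied fields."""
    plural = "s" if positions != 1 else ""
    return f"{positions} {area}{plural} with {recruiter} in {location}"

# listing_title(4, "PHP Developer", "ABC Recruitment", "Manchester")
# -> "4 PHP Developers with ABC Recruitment in Manchester"
```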

Setting a <title> tag allows you to target long-tail traffic by constructing detailed descriptive titles. In our above example, imagine the recruiter had specified “Castlefield, Manchester” as the location.

All of a sudden, you’ve got a perfect opportunity to pick up long-tail traffic for people searching in Castlefield in Manchester.

On the downside, you lose the ability to pick up long-tail traffic where your users have chosen keywords you wouldn’t have used.

For example, suppose Manchester has a jobs program called “Green Highway.” A job advert title containing “Green Highway” might pick up valuable long-tail traffic. Being able to discover this, however, and find a way to fit it into a dynamic title is very hard.

Solution 2: Use regex to noindex the offending pages

Perform a regex (or string contains) search on your listings’ titles and no-index the ones which cannibalise your main category pages.

If it’s not possible to construct titles with variables, or your users provide a lot of additional long-tail traffic with their own titles, then this is a great option. On the downside, you miss out on possible structured long-tail traffic that you might’ve been able to aim for.
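A minimal version of that check (the category titles are illustrative):

```python
import re

# Titles of our high-value category pages (illustrative).
CATEGORY_TITLES = [
    "PHP Jobs in Greater Manchester",
    "Python Jobs in Greater Manchester",
]

def should_noindex(listing_title):
    """Flag any listing whose title contains a category page title."""
    return any(
        re.search(re.escape(title), listing_title, re.IGNORECASE)
        for title in CATEGORY_TITLES
    )
```

Anything the check flags gets a noindex meta tag; everything else, including titles with user-chosen keywords like “Green Highway”, stays indexable.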

Solution 3: De-index all your listings

It may seem rash, but if you’re a large site with a huge number of very similar or low-content listings, you might want to consider this. There is no common standard: some sites like Indeed choose to no-index all their job adverts, whereas others like Craigslist index all their individual listings because they’ll drive long-tail traffic.

Don’t de-index them all lightly!

3. Constantly expiring content

Our third and final problem is that user-generated content doesn’t last forever. Particularly on listings sites, it’s constantly expiring and changing.

For most use cases I’d recommend 301’ing expired content to a relevant category page, with a message triggered by the redirect notifying the user of why they’ve been redirected. It typically comes out as the best combination of search and UX.
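Framework aside, the logic is tiny; a sketch (the field names and the ?expired=1 query flag are made up):

```python
def expired_listing_response(listing):
    """Return (status, location) for a request against a listing.

    listing: dict with an "expired" flag and the "category_url" of the
    relevant category page. The ?expired=1 flag lets the category page
    template show the user a message explaining the redirect.
    """
    if listing["expired"]:
        return 301, listing["category_url"] + "?expired=1"
    return 200, None  # live listings render normally
```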

For more information or advice on how to deal with the edge cases, there’s a previous Moz blog post on how to deal with expired content which I think does an excellent job of covering this area.

Summary

In summary, if you’re working with listings sites, all three of the following need to be kept in mind:

  • How are the landing pages generated? If they’re generated using free text or facets, have the potential problems been solved?
  • Is user generated content cannibalising the main landing pages?
  • How has constantly expiring content been dealt with?

Good luck listing, and if you’ve come across any other tricky problems or solutions while working on listings sites, let’s chat about them in the comments below!
