dotmailer becomes EU-U.S. Privacy Shield certified

On 12 August we were accepted for the U.S. Department of Commerce’s voluntary privacy certification program. The news is a great milestone for dotmailer, because it recognizes the years of work we’ve put into protecting our customers’ data and privacy. For instance, just look at our comprehensive trust center and involvement in both the International Association of Privacy Professionals (IAPP) and Email Sender & Provider Coalition (ESPC).

To become certified our Chief Privacy Officer, James Koons, made the application to the U.S. Department of Commerce, who audited dotmailer’s privacy statement. (Interesting fact: James actually completed the application process while on vacation climbing Mt. Rainer in Washington state!)

By self-certifying and agreeing to the Privacy Shield Principles, it means that our commitment is enforceable under the Federal Trade Commission (FTC).

What does it mean for you (our customers)?

As we continue to expand globally, this certification is one more important privacy precedent. The aim of the EU-U.S. Privacy Shield, which was recently finalized, provides businesses with stronger protection for the exchange of transatlantic data. If you haven’t seen it already, you might be interested in reading about the recent email privacy war between Microsoft and the U.S. government.

As a certified company, it means we must provide you with adequate privacy protection – a requirement for the transfer of personal data outside of the European Union under the EU Data Protection Directive. Each year, we must self-certify to the U.S. Department of Commerce’s International Trade Administration (ITA), to ensure we adhere to the Privacy Shield Principles.

What does our Chief Privacy Officer think?

James Koons, who has 20 years’ experience in the information systems and security industry, explained why he’s pleased about the news: “I am delighted that dotmailer has been recognized as a good steward of data through the Privacy Shield Certification.

“As a company that has a culture of privacy and security as its core, I believe the certification simply highlights the great work we have already been doing.”

What happened to the Safe Harbour agreement?

The EU-U.S. Privacy Shield replaces the former Safe Harbour agreement for transatlantic data transfers.

Want to know more about what the Privacy Shield means?

You can check out the official Privacy Shield website here, which gives a more detailed overview of the program and requirements for participating organizations.

Reblogged 2 years ago from blog.dotmailer.com

The Meta Referrer Tag: An Advancement for SEO and the Internet

Posted by Cyrus-Shepard

The movement to make the Internet more secure through HTTPS brings several useful advancements for webmasters. In addition to security improvements, HTTPS promises future technological advances and potential SEO benefits for marketers.

HTTPS in search results is rising. Recent MozCast data from Dr. Pete shows nearly 20% of first page Google results are now HTTPS.

Sadly, HTTPS also has its downsides.

Marketers run into their first challenge when they switch regular HTTP sites over to HTTPS. Technically challenging, the switch typically involves routing your site through a series of 301 redirects. Historically, these types of redirects are associated with a loss of link equity (thought to be around 15%) which can lead to a loss in rankings. This can offset any SEO advantage that Google claims switching.

Ross Hudgens perfectly summed it up in this tweet:

Many SEOs have anecdotally shared stories of HTTPS sites performing well in Google search results (and our soon-to-be-published Ranking Factors data seems to support this.) However, the short term effect of a large migration can be hard to take. When Moz recently switched to HTTPS to provide better security to our logged-in users, we saw an 8-9% dip in our organic search traffic.

Problem number two is the subject of this post. It involves the loss of referral data. Typically, when one site sends traffic to another, information is sent that identifies the originating site as the source of traffic. This invaluable data allows people to see where their traffic is coming from, and helps spread the flow of information across the web.

SEOs have long used referrer data for a number of beneficial purposes. Oftentimes, people will link back or check out the site sending traffic when they see the referrer in their analytics data. Spammers know this works, as evidenced by the recent increase in referrer spam:

This process stops when traffic flows from an HTTPS site to a non-secure HTTP site. In this case, no referrer data is sent. Webmasters can’t know where their traffic is coming from.

Here’s how referral data to my personal site looked when Moz switched to HTTPS. I lost all visibility into where my traffic came from.

Its (not provided) all over again!

Enter the meta referrer tag

While we can’t solve the ranking challenges imposed by switching a site to HTTPS, we can solve the loss of referral data, and it’s actually super-simple.

Almost completely unknown to most marketers, the relatively new meta referrer tag (it’s actually been around for a few years) was designed to help out in these situations.

Better yet, the tag allows you to control how your referrer information is passed.

The meta referrer tag works with most browsers to pass referrer information in a manner defined by the user. Traffic remains encrypted and all the benefits of using HTTPS remain in place, but now you can pass referrer data to all websites, even those that use HTTP.

How to use the meta referrer tag

What follows are extremely simplified instructions for using the meta referrer tag. For more in-depth understanding, we highly recommend referring to the W3C working draft of the spec.

The meta referrer tag is placed in the <head> section of your HTML, and references one of five states, which control how browsers send referrer information from your site. The five states are:

  1. None: Never pass referral data
    <meta name="referrer" content="none">
    
  2. None When Downgrade: Sends referrer information to secure HTTPS sites, but not insecure HTTP sites
    <meta name="referrer" content="none-when-downgrade">
    
  3. Origin Only: Sends the scheme, host, and port (basically, the subdomain) stripped of the full URL as a referrer, i.e. https://moz.com/example.html would simply send https://moz.com
    <meta name="referrer" content="origin">
    

  4. Origin When Cross-Origin: Sends the full URL as the referrer when the target has the same scheme, host, and port (i.e. subdomain) regardless if it’s HTTP or HTTPS, while sending origin-only referral information to external sites. (note: There is a typo in the official spec. Future versions should be “origin-when-cross-origin”)
    <meta name="referrer" content="origin-when-crossorigin">
    
  5. Unsafe URL: Always passes the URL string as a referrer. Note if you have any sensitive information contained in your URL, this isn’t the safest option. By default, URL fragments, username, and password are automatically stripped out.
    <meta name="referrer" content="unsafe-url">
    

The meta referrer tag in action

By clicking the link below, you can get a sense of how the meta referrer tag works.

Check Referrer

Boom!

We’ve set the meta referrer tag for Moz to “origin”, which means when we link out to another site, we pass our scheme, host, and port. The end result is you see http://moz.com as the referrer, stripped of the full URL path (/meta-referrer-tag).

My personal site typically receives several visits per day from Moz. Here’s what my analytics data looked like before and after we implemented the meta referrer tag.

For simplicity and security, most sites may want to implement the “origin” state, but there are drawbacks.

One negative side effect was that as soon as we implemented the meta referrer tag, our AdRoll analytics, which we use for retargeting, stopped working. It turns out that AdRoll uses our referrer information for analytics, but the meta referrer tag “origin” state meant that the only URL they ever saw reported was https://moz.com.

Conclusion

We love the meta referrer tag because it keeps information flowing on the Internet. It’s the way the web is supposed to work!

It helps marketers and webmasters see exactly where their traffic is coming from. It encourages engagement, communication, and even linking, which can lead to improvements in SEO.

Useful links:

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Reblogged 3 years ago from tracking.feedpress.it

Why We Can’t Do Keyword Research Like It’s 2010 – Whiteboard Friday

Posted by randfish

Keyword Research is a very different field than it was just five years ago, and if we don’t keep up with the times we might end up doing more harm than good. From the research itself to the selection and targeting process, in today’s Whiteboard Friday Rand explains what has changed and what we all need to do to conduct effective keyword research today.

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

What do we need to change to keep up with the changing world of keyword research?

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re going to chat a little bit about keyword research, why it’s changed from the last five, six years and what we need to do differently now that things have changed. So I want to talk about changing up not just the research but also the selection and targeting process.

There are three big areas that I’ll cover here. There’s lots more in-depth stuff, but I think we should start with these three.

1) The Adwords keyword tool hides data!

This is where almost all of us in the SEO world start and oftentimes end with our keyword research. We go to AdWords Keyword Tool, what used to be the external keyword tool and now is inside AdWords Ad Planner. We go inside that tool, and we look at the volume that’s reported and we sort of record that as, well, it’s not good, but it’s the best we’re going to do.

However, I think there are a few things to consider here. First off, that tool is hiding data. What I mean by that is not that they’re not telling the truth, but they’re not telling the whole truth. They’re not telling nothing but the truth, because those rounded off numbers that you always see, you know that those are inaccurate. Anytime you’ve bought keywords, you’ve seen that the impression count never matches the count that you see in the AdWords tool. It’s not usually massively off, but it’s often off by a good degree, and the only thing it’s great for is telling relative volume from one from another.

But because AdWords hides data essentially by saying like, “Hey, you’re going to type in . . .” Let’s say I’m going to type in “college tuition,” and Google knows that a lot of people search for how to reduce college tuition, but that doesn’t come up in the suggestions because it’s not a commercial term, or they don’t think that an advertiser who bids on that is going to do particularly well and so they don’t show it in there. I’m giving an example. They might indeed show that one.

But because that data is hidden, we need to go deeper. We need to go beyond and look at things like Google Suggest and related searches, which are down at the bottom. We need to start conducting customer interviews and staff interviews, which hopefully has always been part of your brainstorming process but really needs to be now. Then you can apply that to AdWords. You can apply that to suggest and related.

The beautiful thing is once you get these tools from places like visiting forums or communities, discussion boards and seeing what terms and phrases people are using, you can collect all this stuff up, plug it back into AdWords, and now they will tell you how much volume they’ve got. So you take that how to lower college tuition term, you plug it into AdWords, they will show you a number, a non-zero number. They were just hiding it in the suggestions because they thought, “Hey, you probably don’t want to bid on that. That won’t bring you a good ROI.” So you’ve got to be careful with that, especially when it comes to SEO kinds of keyword research.

2) Building separate pages for each term or phrase doesn’t make sense

It used to be the case that we built separate pages for every single term and phrase that was in there, because we wanted to have the maximum keyword targeting that we could. So it didn’t matter to us that college scholarship and university scholarships were essentially people looking for exactly the same thing, just using different terminology. We would make one page for one and one page for the other. That’s not the case anymore.

Today, we need to group by the same searcher intent. If two searchers are searching for two different terms or phrases but both of them have exactly the same intent, they want the same information, they’re looking for the same answers, their query is going to be resolved by the same content, we want one page to serve those, and that’s changed up a little bit of how we’ve done keyword research and how we do selection and targeting as well.

3) Build your keyword consideration and prioritization spreadsheet with the right metrics

Everybody’s got an Excel version of this, because I think there’s just no awesome tool out there that everyone loves yet that kind of solves this problem for us, and Excel is very, very flexible. So we go into Excel, we put in our keyword, the volume, and then a lot of times we almost stop there. We did keyword volume and then like value to the business and then we prioritize.

What are all these new columns you’re showing me, Rand? Well, here I think is how sophisticated, modern SEOs that I’m seeing in the more advanced agencies, the more advanced in-house practitioners, this is what I’m seeing them add to the keyword process.

Difficulty

A lot of folks have done this, but difficulty helps us say, “Hey, this has a lot of volume, but it’s going to be tremendously hard to rank.”

The difficulty score that Moz uses and attempts to calculate is a weighted average of the top 10 domain authorities. It also uses page authority, so it’s kind of a weighted stack out of the two. If you’re seeing very, very challenging pages, very challenging domains to get in there, it’s going to be super hard to rank against them. The difficulty is high. For all of these ones it’s going to be high because college and university terms are just incredibly lucrative.

That difficulty can help bias you against chasing after terms and phrases for which you are very unlikely to rank for at least early on. If you feel like, “Hey, I already have a powerful domain. I can rank for everything I want. I am the thousand pound gorilla in my space,” great. Go after the difficulty of your choice, but this helps prioritize.

Opportunity

This is actually very rarely used, but I think sophisticated marketers are using it extremely intelligently. Essentially what they’re saying is, “Hey, if you look at a set of search results, sometimes there are two or three ads at the top instead of just the ones on the sidebar, and that’s biasing some of the click-through rate curve.” Sometimes there’s an instant answer or a Knowledge Graph or a news box or images or video, or all these kinds of things that search results can be marked up with, that are not just the classic 10 web results. Unfortunately, if you’re building a spreadsheet like this and treating every single search result like it’s just 10 blue links, well you’re going to lose out. You’re missing the potential opportunity and the opportunity cost that comes with ads at the top or all of these kinds of features that will bias the click-through rate curve.

So what I’ve seen some really smart marketers do is essentially build some kind of a framework to say, “Hey, you know what? When we see that there’s a top ad and an instant answer, we’re saying the opportunity if I was ranking number 1 is not 10 out of 10. I don’t expect to get whatever the average traffic for the number 1 position is. I expect to get something considerably less than that. Maybe something around 60% of that, because of this instant answer and these top ads.” So I’m going to mark this opportunity as a 6 out of 10.

There are 2 top ads here, so I’m giving this a 7 out of 10. This has two top ads and then it has a news block below the first position. So again, I’m going to reduce that click-through rate. I think that’s going down to a 6 out of 10.

You can get more and less scientific and specific with this. Click-through rate curves are imperfect by nature because we truly can’t measure exactly how those things change. However, I think smart marketers can make some good assumptions from general click-through rate data, which there are several resources out there on that to build a model like this and then include it in their keyword research.

This does mean that you have to run a query for every keyword you’re thinking about, but you should be doing that anyway. You want to get a good look at who’s ranking in those search results and what kind of content they’re building . If you’re running a keyword difficulty tool, you are already getting something like that.

Business value

This is a classic one. Business value is essentially saying, “What’s it worth to us if visitors come through with this search term?” You can get that from bidding through AdWords. That’s the most sort of scientific, mathematically sound way to get it. Then, of course, you can also get it through your own intuition. It’s better to start with your intuition than nothing if you don’t already have AdWords data or you haven’t started bidding, and then you can refine your sort of estimate over time as you see search visitors visit the pages that are ranking, as you potentially buy those ads, and those kinds of things.

You can get more sophisticated around this. I think a 10 point scale is just fine. You could also use a one, two, or three there, that’s also fine.

Requirements or Options

Then I don’t exactly know what to call this column. I can’t remember the person who’ve showed me theirs that had it in there. I think they called it Optional Data or Additional SERPs Data, but I’m going to call it Requirements or Options. Requirements because this is essentially saying, “Hey, if I want to rank in these search results, am I seeing that the top two or three are all video? Oh, they’re all video. They’re all coming from YouTube. If I want to be in there, I’ve got to be video.”

Or something like, “Hey, I’m seeing that most of the top results have been produced or updated in the last six months. Google appears to be biasing to very fresh information here.” So, for example, if I were searching for “university scholarships Cambridge 2015,” well, guess what? Google probably wants to bias to show results that have been either from the official page on Cambridge’s website or articles from this year about getting into that university and the scholarships that are available or offered. I saw those in two of these search results, both the college and university scholarships had a significant number of the SERPs where a fresh bump appeared to be required. You can see that a lot because the date will be shown ahead of the description, and the date will be very fresh, sometime in the last six months or a year.

Prioritization

Then finally I can build my prioritization. So based on all the data I had here, I essentially said, “Hey, you know what? These are not 1 and 2. This is actually 1A and 1B, because these are the same concepts. I’m going to build a single page to target both of those keyword phrases.” I think that makes good sense. Someone who is looking for college scholarships, university scholarships, same intent.

I am giving it a slight prioritization, 1A versus 1B, and the reason I do this is because I always have one keyword phrase that I’m leaning on a little more heavily. Because Google isn’t perfect around this, the search results will be a little different. I want to bias to one versus the other. In this case, my title tag, since I more targeting university over college, I might say something like college and university scholarships so that university and scholarships are nicely together, near the front of the title, that kind of thing. Then 1B, 2, 3.

This is kind of the way that modern SEOs are building a more sophisticated process with better data, more inclusive data that helps them select the right kinds of keywords and prioritize to the right ones. I’m sure you guys have built some awesome stuff. The Moz community is filled with very advanced marketers, probably plenty of you who’ve done even more than this.

I look forward to hearing from you in the comments. I would love to chat more about this topic, and we’ll see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Reblogged 3 years ago from tracking.feedpress.it

Exposing The Generational Content Gap: Three Ways to Reach Multiple Generations

Posted by AndreaLehr

With more people of all ages online than ever before, marketers must create content that resonates with multiple generations. Successful marketers realize that each generation has unique expectations, values and experiences that influence consumer behaviors, and that offering your audience content that reflects their shared interests is a powerful way to connect with them and inspire them to take action.

We’re in the midst of a generational shift, with
Millennials expected to surpass Baby Boomers in 2015 as the largest living generation. In order to be competitive, marketers need to realize where key distinctions and similarities lie in terms of how these different generations consume content and share it with with others.

To better understand the habits of each generation,
BuzzStream and Fractl surveyed over 1,200 individuals and segmented their responses into three groups: Millennials (born between 1977–1995), Generation X (born between 1965–1976), and Baby Boomers (born between 1946–1964). [Eds note: The official breakdown for each group is as follows: Millennials (1981-1997), Generation X (1965-1980), and Boomers (1946-1964)]

Our survey asked them to identify their preferences for over 15 different content types while also noting their opinions on long-form versus short-form content and different genres (e.g., politics, technology, and entertainment).

We compared their responses and found similar habits and unique trends among all three generations.

Here’s our breakdown of the three key takeaways you can use to elevate your future campaigns:

1. Baby Boomers are consuming the most content

However, they have a tendency to enjoy it earlier in the day than Gen Xers and Millennials.

Although we found striking similarities between the younger generations, the oldest generation distinguished itself by consuming the most content. Over 25 percent of Baby Boomers consume 20 or more hours of content each week. Additional findings:

  • Baby Boomers also hold a strong lead in the 15–20 hours bracket at 17 percent, edging out Gen Xers and Millennials at 12 and 11 percent, respectively
  • A majority of Gen Xers and Millennials—just over 22 percent each—consume between 5 and 10 hours per week
  • Less than 10 percent of Gen Xers consume less than five hours of content a week—the lowest of all three groups

We also compared the times of day that each generation enjoys consuming content. The results show that most of our respondents—over 30 percent— consume content between 8 p.m. and midnight. However, there are similar trends that distinguish the oldest generation from the younger ones:

  • Baby Boomers consume a majority of their content in the morning. Nearly 40 percent of respondents are online between 5 a.m. and noon.
  • The least popular time for most respondents to engage with content online is late at night, between midnight and 5 a.m., earning less than 10 percent from each generation
  • Gen X is the only generation to dip below 10 percent in the three U.S. time zones: 5 a.m. to 9 a.m., 6 to 8 p.m., and midnight to 5 a.m.

When Do We Consume Content

When it comes to which device each generation uses to consume content, laptops are the most common, followed by desktops. The biggest distinction is in mobile usage: Over 50 percent of respondents who use their mobile as their primary device for content consumption are Millennials. Other results reveal:

  • Not only do Baby Boomers use laptops the most (43 percent), but they also use their tablets the most. (40 percent of all primary tablet users are Baby Boomers).
  • Over 25 percent of Millennials use a mobile device as their primary source for content
  • Gen Xers are the least active tablet users, with less than 8 percent of respondents using it as their primary device

Device To Consume Content2. Preferred content types and lengths span all three generations

One thing every generation agrees on is the type of content they enjoy seeing online. Our results reveal that the top four content types— blog articles, images, comments, and eBooks—are exactly the same for Baby Boomers, Gen Xers, and Millennials. Additional comparisons indicate:

  • The least preferred content types—flipbooks, SlideShares, webinars, and white papers—are the same across generations, too (although not in the exact same order)
  • Surprisingly, Gen Xers and Millennials list quizzes as one of their five least favorite content types

Most Consumed Content Type

All three generations also agree on ideal content length, around 300 words. Further analysis reveals:

  • Baby Boomers have the highest preference for articles under 200 words, at 18 percent
  • Gen Xers have a strong preference for articles over 500 words compared to other generations. Over 20 percent of respondents favor long-form articles, while only 15 percent of Baby Boomers and Millennials share the same sentiment.
  • Gen Xers also prefer short articles the least, with less than 10 percent preferring articles under 200 words

Content Length PreferencesHowever, in regards to verticals or genres, where they consume their content, each generation has their own unique preference:

  • Baby Boomers have a comfortable lead in world news and politics, at 18 percent and 12 percent, respectively
  • Millennials hold a strong lead in technology, at 18 percent, while Baby Boomers come in at 10 percent in the same category
  • Gen Xers fall between Millennials and Baby Boomers in most verticals, although they have slight leads in personal finance, parenting, and healthy living
  • Although entertainment is the top genre for each generation, Millennials and Baby Boomers prefer it slightly more than than Gen Xers do

Favorite Content Genres

3. Facebook is the preferred content sharing platform across all three generations

Facebook remains king in terms of content sharing, and is used by about 60 percent of respondents in each generation studied. Surprisingly, YouTube came in second, followed by Twitter, Google+, and LinkedIn, respectively. Additional findings:

  • Baby Boomers share on Facebook the most, edging out Millennials by only a fraction of a percent
  • Although Gen Xers use Facebook slightly less than other generations, they lead in both YouTube and Twitter, at 15 percent and 10 percent, respectively
  • Google+ is most popular with Baby Boomers, at 8 percent, nearly double that of both Gen Xers and Millennials

Preferred Social PlatformAlthough a majority of each generation is sharing content on Facebook, the type of content they are sharing, especially visuals, varies by each age group. The oldest generation prefers more traditional content, such as images and videos. Millennials prefer newer content types, such as memes and GIFs, while Gen X predictably falls in between the two generations in all categories except SlideShares. Other findings:

  • The most popular content type for Baby Boomers is video, at 27 percent
  • Parallax is the least popular type for every generation, earning 1 percent or less in each age group
  • Millennials share memes the most, while less than 10 percent of Baby Boomers share similar content

Most Shared Visual ContentMarketing to several generations can be challenging, given the different values and ideas that resonate with each group. With the number of online content consumers growing daily, it’s essential for marketers to understand the specific types of content that each of their audiences connect with, and align it with their content marketing strategy accordingly.

Although there is no one-size-fits-all campaign, successful marketers can create content that multiple generations will want to share. If you feel you need more information getting started, you can review this deck of additional insights, which includes the preferred video length and weekend consuming habits of each generation discussed in this post.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Reblogged 3 years ago from tracking.feedpress.it

Has Google Gone Too Far with the Bias Toward Its Own Content?

Posted by ajfried

Since the beginning of SEO time, practitioners have been trying to crack the Google algorithm. Every once in a while, the industry gets a glimpse into how the search giant works and we have opportunity to deconstruct it. We don’t get many of these opportunities, but when we do—assuming we spot them in time—we try to take advantage of them so we can “fix the Internet.”

On Feb. 16, 2015, news started to circulate that NBC would start removing images and references of Brian Williams from its website.

This was it!

A golden opportunity.

This was our chance to learn more about the Knowledge Graph.

Expectation vs. reality

Often it’s difficult to predict what Google is truly going to do. We expect something to happen, but in reality it’s nothing like we imagined.

Expectation

What we expected to see was that Google would change the source of the image. Typically, if you hover over the image in the Knowledge Graph, it reveals the location of the image.

Keanu-Reeves-Image-Location.gif

This would mean that if the image disappeared from its original source, then the image displayed in the Knowledge Graph would likely change or even disappear entirely.

Reality (February 2015)

The only problem was, there was no official source (this changed, as you will soon see) and identifying where the image was coming from proved extremely challenging. In fact, when you clicked on the image, it took you to an image search result that didn’t even include the image.

Could it be? Had Google started its own database of owned or licensed images and was giving it priority over any other sources?

In order to find the source, we tried taking the image from the Knowledge Graph and “search by image” in images.google.com to find others like it. For the NBC Nightly News image, Google failed to even locate a match to the image it was actually using anywhere on the Internet. For other television programs, it was successful. Here is an example of what happened for Morning Joe:

Morning_Joe_image_search.png

So we found the potential source. In fact, we found three potential sources. Seemed kind of strange, but this seemed to be the discovery we were looking for.

This looks like Google is using someone else’s content and not referencing it. These images have a source, but Google is choosing not to show it.

Then Google pulled the ol’ switcheroo.

New reality (March 2015)

Now things changed and Google decided to put a source to their images. Unfortunately, I mistakenly assumed that hovering over an image showed the same thing as the file path at the bottom, but I was wrong. The URL you see when you hover over an image in the Knowledge Graph is actually nothing more than the title. The source is different.

Morning_Joe_Source.png

Luckily, I still had two screenshots I took when I first saw this saved on my desktop. Success. One screen capture was from NBC Nightly News, and the other from the news show Morning Joe (see above) showing that the source was changed.

NBC-nightly-news-crop.png

(NBC Nightly News screenshot.)

The source is a Google-owned property: gstatic.com. You can clearly see the difference in the source change. What started as a hypothesis in now a fact. Google is certainly creating a database of images.

If this is the direction Google is moving, then it is creating all kinds of potential risks for brands and individuals. The implications are a loss of control for any brand that is looking to optimize its Knowledge Graph results. As well, it seems this poses a conflict of interest to Google, whose mission is to organize the world’s information, not license and prioritize it.

How do we think Google is supposed to work?

Google is an information-retrieval system tasked with sourcing information from across the web and supplying the most relevant results to users’ searches. In recent months, the search giant has taken a more direct approach by answering questions and assumed questions in the Answer Box, some of which come from un-credited sources. Google has clearly demonstrated that it is building a knowledge base of facts that it uses as the basis for its Answer Boxes. When it sources information from that knowledge base, it doesn’t necessarily reference or credit any source.

However, I would argue there is a difference between an un-credited Answer Box and an un-credited image. An un-credited Answer Box provides a fact that is indisputable, part of the public domain, unlikely to change (e.g., what year was Abraham Lincoln shot? How long is the George Washington Bridge?) Answer Boxes that offer more than just a basic fact (or an opinion, instructions, etc.) always credit their sources.

There are four possibilities when it comes to Google referencing content:

  • Option 1: It credits the content because someone else owns the rights to it
  • Option 2: It doesn’t credit the content because it’s part of the public domain, as seen in some Answer Box results
  • Option 3: It doesn’t reference it because it owns or has licensed the content. If you search for “Chicken Pox” or other diseases, Google appears to be using images from licensed medical illustrators. The same goes for song lyrics, which Eric Enge discusses here: Google providing credit for content. This adds to the speculation that Google is giving preference to its own content by displaying it over everything else.
  • Option 4: It doesn’t credit the content, but neither does it necessarily own the rights to the content. This is a very gray area, and is where Google seemed to be back in February. If this were the case, it would imply that Google is “stealing” content—which I find hard to believe, but felt was necessary to include in this post for the sake of completeness.

Is this an isolated incident?

At Five Blocks, whenever we see these anomalies in search results, we try to compare the term in question against others like it. This is a categorization concept we use to bucket individuals or companies into similar groups. When we do this, we uncover some incredible trends that help us determine what a search result “should” look like for a given group. For example, when looking at searches for a group of people or companies in an industry, this grouping gives us a sense of how much social media presence the group has on average or how much media coverage it typically gets.

Upon further investigation of terms similar to NBC Nightly News (other news shows), we noticed the un-credited image scenario appeared to be a trend in February, but now all of the images are being hosted on gstatic.com. When we broadened the categories further to TV shows and movies, the trend persisted. Rather than show an image in the Knowledge Graph and from the actual source, Google tends to show an image and reference the source from Google’s own database of stored images.

And just to ensure this wasn’t a case of tunnel vision, we researched other categories, including sports teams, actors and video games, in addition to spot-checking other genres.

Unlike terms for specific TV shows and movies, terms in each of these other groups all link to the actual source in the Knowledge Graph.

Immediate implications

It’s easy to ignore this and say “Well, it’s Google. They are always doing something.” However, there are some serious implications to these actions:

  1. The TV shows/movies aren’t receiving their due credit because, from within the Knowledge Graph, there is no actual reference to the show’s official site
  2. The more Google moves toward licensing and then retrieving their own information, the more biased they become, preferring their own content over the equivalent—or possibly even superior—content from another source
  3. If feels wrong and misleading to get a Google Image Search result rather than an actual site because:
    • The search doesn’t include the original image
    • Considering how poor Image Search results are normally, it feels like a poor experience
  4. If Google is moving toward licensing as much content as possible, then it could make the Knowledge Graph infinitely more complicated when there is a “mistake” or something unflattering. How could one go about changing what Google shows about them?

Google is objectively becoming subjective

It is clear that Google is attempting to create databases of information, including lyrics stored in Google Play, photos, and, previously, facts in Freebase (which is now Wikidata and not owned by Google).

I am not normally one to point my finger and accuse Google of wrongdoing. But this really strikes me as an odd move, one bordering on a clear bias to direct users to stay within the search engine. The fact is, we trust Google with a heck of a lot of information with our searches. In return, I believe we should expect Google to return an array of relevant information for searchers to decide what they like best. The example cited above seems harmless, but what about determining which is the right religion? Or even who the prettiest girl in the world is?

Religion-and-beauty-queries.png

Questions such as these, which Google is returning credited answers for, could return results that are perceived as facts.

Should we next expect Google to decide who is objectively the best service provider (e.g., pizza chain, painter, or accountant), then feature them in an un-credited answer box? The direction Google is moving right now, it feels like we should be calling into question their objectivity.

But that’s only my (subjective) opinion.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Reblogged 3 years ago from tracking.feedpress.it