Darryl, the man behind dotmailer’s Custom Technical Solutions team

Why did you decide to come to dotmailer?

I first got to know dotmailer when the company was just a bunch of young enthusiastic web developers called Ellipsis Media back in 1999. I was introduced by one of my suppliers and we decided to bring them on board to build a recruitment website for one of our clients. That client was Amnesty International and the job role was Secretary General. Not bad for a Croydon company whose biggest client before that was Scobles the plumber’s merchants. So, I was probably dotmailer’s first ever corporate client! After that, I used dotmailer at each company I worked for and then one day they approached a colleague and me and asked us if we wanted to work for them. That was 2013.  We grabbed the opportunity with both hands and haven’t looked back since.

Tell us a bit about your role

I’m the Global Head of Technical Solutions which actually gives me responsibility for 2 teams. First, Custom Technical Solutions (CTS), who build bespoke applications and tools for customers that allow them to integrate more closely with dotmailer and make life easier. Second, Technical Pre-sales, which spans our 3 territories (EMEA, US and APAC) and works with prospective and existing clients to figure out the best solution and fit within dotmailer.

What accomplishments are you most proud of from your dotmailer time so far?

I would say so far it has to be helping to turn the CTS team from just 2 people into a group of 7 highly skilled and dedicated men and women who have become an intrinsic and valued part of the dotmailer organization. Also I really enjoy being part of the Senior Technical Management team. Here we have the ability to influence the direction and structure of the platform on a daily basis.

Meet Darryl Clark – the cheese and peanut butter sandwich lover

Can you speak a bit about your background and that of your team? What experience and expertise is required to join this team?

My background is quite diverse from a stint in the Army, through design college, web development, business analysis to heading up my current teams. I would say the most valuable skill that I have is being highly analytical. I love nothing more than listening to a client’s requirements and digging deep to work out how we can answer these if not exceed them.

As a team, we love nothing more than brainstorming our ideas. Every member has a valid input and we listen. Everyone has the opportunity to influence what we do and our motto is “there is no such thing as a stupid question.”

To work in my teams you have to be analytical but open minded to the fact that other people may have a better answer than you. Embrace other people’s input and use it to give our clients the best possible solution. We are hugely detail conscious, but have to be acutely aware that we need to tailor what we say to our audience so being able to talk to anyone at any level is hugely valuable.

How much of the dotmailer platform is easily customizable and when does it cross over into something that requires your team’s expertise? How much time is spent on these custom solutions one-time or ongoing?

I’ll let you in on a little secret here. We don’t actually do anything that our customers can’t do with dotmailer given the right knowledge and resources. This is because we build all of our solutions using the dotmailer public API. The API has hundreds of methods in both SOAP and REST versions, which allows you to do a huge amount with the dotmailer platform. We do have a vast amount of experience and knowledge in the team so we may well be able to build a solution quicker than our customers. We are more than happy to help them and their development teams build a solution using us on a consultancy basis to lessen the steepness of the learning curve.

Our aim when building a solution for a customer is that it runs silently in the background and does what it should without any fuss.

What are your plans for the Custom Tech Solutions team going forward?

The great thing about Custom Technical Solutions is you never know what is around the corner as our customers have very diverse needs. What we are concentrating on at the moment is refining our processes to ensure that they are as streamlined as possible and allow us to give as much information to the customer as we can. We are also always looking at the technology and coding approaches that we use to make sure that we build the most innovative and robust solutions.

We are also looking at our external marketing and sharing our knowledge through blogs so keep an eye on the website for our insights.

What are the most common questions that you get when speaking to a prospective customer?

Most questions seem to revolve around reassurance such as “Have you done this before?”, “How safe is my data?”, “What about security?”, “Can you talk to my developers?”, “Do I need to do anything?”.  In most instances, we are the ones asking the questions as we need to find out information as soon as possible so that we can analyse it to ensure that we have the right detail to provide the right solution.

Can you tell us about the dotmailer differentiators you highlight when speaking to prospective customers that seem to really resonate?

We talk a lot about working with best of breed. For example, a customer can use our Channel Extensions in automation programs to fire out an SMS to a contact using their existing provider. We don’t force customers down one route; we like to let them decide for themselves.

Also, I really like to emphasize the fact that there is always more than one way to do something within the dotmailer platform. This means we can usually find a way to do something that works for a client within the platform. If not, then we call in CTS to work out if there is a way that we can build something that will — whether this is automating uploads for a small client or mass sending from thousands of child accounts for an enterprise level one.

What do you see as the future of marketing automation technology?  Will one size ever fit all? Or more customization going forward?

The 64 million dollar question. One size will never fit all. Companies and their systems are too organic for that. There isn’t one car that suits every driver or one racquet that suits every sport. Working with a top drawer partner network and building our system to be as open as possible from an integration perspective means that our customers can make dotmailer mold to their business and not the other way round…and adding to that the fact that we are building lots of features in the platform that will blow your socks off.

Tell us a bit about yourself – favorite sports team, favorite food, guilty pleasure, favorite band, favorite vacation spot?

I’m a dyed-in-the-wool Gooner (aka Arsenal Football Club fan) thanks to my Grandfather leading me down the right path as a child. If you are still reading this after that bombshell, then food-wise I pretty much like everything apart from coriander, which as far as I’m concerned is the Devil’s own spawn. I don’t really have a favorite band, but am partial to a bit of Level 42 and Kings of Leon, and you will also find me listening to 90s drum and bass and proper old school hip hop. My favorite holiday destination is any decent villa that I can relax in and spend time with my family. I also went to Paris recently and loved that. Guilty pleasure: well, that probably has to be confessing to liking Coldplay or the fact that my favorite sandwich is peanut butter, cheese and salad cream. Go on, try it, you’ll love it.

Want to meet more of the dotmailer team? Say hi to Darren Hockley, Global Head of Support, and Dan Morris, EVP for North America.

Reblogged 3 years ago from blog.dotmailer.com

dotmailer becomes EU-U.S. Privacy Shield certified

On 12 August we were accepted for the U.S. Department of Commerce’s voluntary privacy certification program. The news is a great milestone for dotmailer, because it recognizes the years of work we’ve put into protecting our customers’ data and privacy. For instance, just look at our comprehensive trust center and involvement in both the International Association of Privacy Professionals (IAPP) and Email Sender & Provider Coalition (ESPC).

To become certified, our Chief Privacy Officer, James Koons, made the application to the U.S. Department of Commerce, who audited dotmailer’s privacy statement. (Interesting fact: James actually completed the application process while on vacation climbing Mt. Rainier in Washington state!)

Because we have self-certified and agreed to the Privacy Shield Principles, our commitment is enforceable by the Federal Trade Commission (FTC).

What does it mean for you (our customers)?

As we continue to expand globally, this certification is one more important privacy precedent. The EU-U.S. Privacy Shield, which was recently finalized, aims to provide businesses with stronger protection for the exchange of transatlantic data. If you haven’t seen it already, you might be interested in reading about the recent email privacy war between Microsoft and the U.S. government.

As a certified company, we must provide you with adequate privacy protection, which is a requirement for the transfer of personal data outside of the European Union under the EU Data Protection Directive. Each year, we must self-certify to the U.S. Department of Commerce’s International Trade Administration (ITA) to confirm that we adhere to the Privacy Shield Principles.

What does our Chief Privacy Officer think?

James Koons, who has 20 years’ experience in the information systems and security industry, explained why he’s pleased about the news: “I am delighted that dotmailer has been recognized as a good steward of data through the Privacy Shield Certification.

“As a company that has a culture of privacy and security as its core, I believe the certification simply highlights the great work we have already been doing.”

What happened to the Safe Harbour agreement?

The EU-U.S. Privacy Shield replaces the former Safe Harbour agreement for transatlantic data transfers.

Want to know more about what the Privacy Shield means?

You can check out the official Privacy Shield website here, which gives a more detailed overview of the program and requirements for participating organizations.

Reblogged 3 years ago from blog.dotmailer.com

Big Data, Big Problems: 4 Major Link Indexes Compared

Posted by russangular

Given this blog’s readership, chances are good you will spend some time this week looking at backlinks in one of the growing number of link data tools. We know backlinks continue to be one of, if not the most important, parts of Google’s ranking algorithm. We tend to take these link data sets at face value, though, in part because they are all we have. But when your rankings are on the line, is there a better way to get at which data set is the best? How should we go about assessing these different link indexes like Moz, Majestic, Ahrefs and SEMrush for quality? Historically, there have been 4 common approaches to this question of index quality…

  • Breadth: We might choose to look at the number of linking root domains any given service reports. We know
    that the number of referring domains correlates strongly with search rankings, so it makes sense to judge a link index by how many unique
    domains it has discovered and indexed.
  • Depth: We also might choose to look at how deep the web has been crawled, looking more at the total number of URLs
    in the index, rather than the diversity of referring domains.
  • Link Overlap: A more sophisticated approach might count the number of links an index has in common with Google Webmaster
    Tools.
  • Freshness: Finally, we might choose to look at the freshness of the index. What percentage of links in the index are
    still live?

There are a number of really good studies (some newer than others) using these techniques that are worth checking out when you get a chance:

  • BuiltVisible analysis of Moz, Majestic, GWT, Ahrefs and Search Metrics
  • SEOBook comparison of Moz, Majestic, Ahrefs, and Ayima
  • MatthewWoodward study of Ahrefs, Majestic, Moz, Raven and SEO Spyglass
  • Marketing Signals analysis of Moz, Majestic, Ahrefs, and GWT
  • RankAbove comparison of Moz, Majestic, Ahrefs and Link Research Tools
  • StoneTemple study of Moz and Majestic

While these are all excellent at addressing the methodologies above, there is a particular limitation with all of them. They miss one of the most important metrics we need to determine the value of a link index: proportional representation to Google’s link graph. So here at Angular Marketing, we decided to take a closer look.

Proportional representation to Google Search Console data

So, why is it important to determine proportional representation? Many of the most important and valued metrics we use are built on proportional
models. PageRank, MozRank, CitationFlow and Ahrefs Rank are proportional in nature. The score of any one URL in the data set is relative to the
other URLs in the data set. If the data set is biased, the results are biased.

A Visualization

Link graphs are biased by their crawl prioritization. Because there is no full representation of the Internet, every link graph, even Google’s,
is a biased sample of the web. Imagine for a second that the picture below is of the web. Each dot represents a page on the Internet,
and the dots surrounded by green represent a fictitious index by Google of certain sections of the web.

Of course, Google isn’t the only organization that crawls the web. Other organizations like Moz,
Majestic, Ahrefs, and SEMrush
have their own crawl prioritizations which result in different link indexes.

In the example above, you can see different link providers trying to index the web like Google. Link data provider 1 (purple) does a good job
of building a model that is similar to Google. It isn’t very big, but it is proportional. Link data provider 2 (blue) has a much larger index,
and likely has more links in common with Google than link data provider 1, but it is highly disproportional. So, how would we go about measuring
this proportionality? And which data set is the most proportional to Google?

Methodology

The first step is to determine a measurement of relativity for analysis. Google doesn’t give us very much information about their link graph. All we have is what is in Google Search Console. The best source we can use is referring domain counts. In particular, we want to look at what we call referring domain link pairs. A referring domain link pair would be something like ask.com->mlb.com: 9,444, which means that ask.com links to mlb.com 9,444 times.

Steps

  1. Determine the root linking domain pairs and values to 100+ sites in Google Search Console
  2. Determine the same for Ahrefs, Moz, Majestic Fresh, Majestic Historic, SEMrush
  3. Compare the referring domain link pairs of each data set to Google, assuming a Poisson distribution
  4. Run simulations of each data set’s performance against each other (e.g., Moz vs Majestic, Ahrefs vs SEMrush, Moz vs SEMrush, and so on)
  5. Analyze the results
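
The post does not spell out how the Poisson comparison in step 3 was computed, so the following is only a minimal sketch under stated assumptions: each provider's pair counts are rescaled to Google's total and then treated as Poisson rates for the counts Google reports. The function names, the floor rate for missing pairs, and every number other than the ask.com->mlb.com example are made up for illustration.

```python
import math

def poisson_log_pmf(k, lam):
    """Log of the Poisson probability of observing k events at rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def proportionality_score(google, provider, floor=0.5):
    """Log-likelihood of Google's pair counts under rates derived from a provider.

    Provider counts are rescaled so their total matches Google's, which makes
    the score depend on proportions rather than raw index size. Pairs missing
    from the provider get a small floor rate (an assumption) to avoid log(0).
    """
    scale = sum(google.values()) / sum(provider.values())
    score = 0.0
    for pair, observed in google.items():
        rate = max(provider.get(pair, 0) * scale, floor)
        score += poisson_log_pmf(observed, rate)
    return score

# Referring domain link pairs: "source->target" mapped to a link count.
google = {"ask.com->mlb.com": 9444, "a.com->mlb.com": 500, "b.com->mlb.com": 56}
small_but_proportional = {"ask.com->mlb.com": 944, "a.com->mlb.com": 50, "b.com->mlb.com": 6}
large_but_skewed = {"ask.com->mlb.com": 20000, "a.com->mlb.com": 30000, "b.com->mlb.com": 50000}

score_small = proportionality_score(google, small_but_proportional)
score_large = proportionality_score(google, large_but_skewed)
```

A higher (less negative) score means the provider's proportions sit closer to Google's. In this toy data the smaller index scores better than the larger one because its counts keep the same ratios.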

Results

When placed head-to-head, there seem to be some clear winners at first glance. In head-to-head, Moz edges out Ahrefs, but across the board, Moz and Ahrefs fare quite evenly. Moz, Ahrefs and SEMrush seem to be far better than Majestic Fresh and Majestic Historic. Is that really the case? And why?

It turns out there is an inversely proportional relationship between index size and proportional relevancy. This might seem counterintuitive.
Shouldn’t the bigger indexes be closer to Google? Not exactly.

What does this mean?

Each organization has to create a crawl prioritization strategy. When you discover millions of links, you have to prioritize which ones you
might crawl next. Google has a crawl prioritization, and so do Moz, Majestic, Ahrefs and SEMrush. There are lots of different things you might
choose to prioritize…

  • You might prioritize link discovery. If you want to build a very large index, you could prioritize crawling pages on sites that
    have historically provided new links.
  • You might prioritize content uniqueness. If you want to build a search engine, you might prioritize finding pages that are unlike
    any you have seen before. You could choose to crawl domains that historically provide unique data and little duplicate content.
  • You might prioritize content freshness. If you want to keep your search engine recent, you might prioritize crawling pages that
    change frequently.
  • You might prioritize content value, crawling the most important URLs first based on the number of inbound links to that page.
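
To make the idea of crawl prioritization concrete, here is a minimal sketch of a crawl frontier with a pluggable priority function. It is not how any of the named providers actually crawl. The toy link graph, the inbound link counts, and the use of outbound link count as a stand-in for "link discovery" are all assumptions for illustration.

```python
import heapq

def crawl(graph, seed, priority, budget):
    """Visit up to `budget` pages, always expanding the highest-priority URL."""
    frontier = [(-priority(seed), seed)]  # negate: heapq pops the smallest item
    seen = {seed}
    visited = []
    while frontier and len(visited) < budget:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        for link in graph.get(url, []):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-priority(link), link))
    return visited

# Toy web: page -> outbound links (all names are made up).
graph = {
    "home": ["news", "shop"],
    "news": ["story1", "story2", "story3"],
    "shop": ["item1"],
    "story1": [], "story2": [], "story3": [], "item1": ["home"],
}

# Hypothetical signals a crawler might have collected so far.
inbound_links = {"home": 50, "news": 5, "shop": 20, "item1": 10,
                 "story1": 1, "story2": 1, "story3": 1}
outbound_links = {url: len(links) for url, links in graph.items()}

by_value = crawl(graph, "home", lambda u: inbound_links[u], budget=3)
by_discovery = crawl(graph, "home", lambda u: outbound_links[u], budget=3)
```

Both crawls start at the same page, but with a budget of only three pages they already cover different parts of the toy web, which is the effect that grows as real indexes get larger.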

Chances are, an organization’s crawl priority will blend some of these features, but it’s difficult to design one exactly like Google. Imagine
for a moment that instead of crawling the web, you want to climb a tree. You have to come up with a tree climbing strategy.

  • You decide to climb the longest branch you see at each intersection.
  • One friend of yours decides to climb the first new branch he reaches, regardless of how long it is.
  • Your other friend decides to climb the first new branch she reaches only if she sees another branch coming off of it.

Despite having different climb strategies, everyone chooses the same first branch, and everyone chooses the same second branch. There are only
so many different options early on.

But as the climbers go further and further along, their choices eventually produce differing results. This is exactly the same for web crawlers
like Google, Moz, Majestic, Ahrefs and SEMrush. The bigger the crawl, the more the crawl prioritization will cause disparities. This is not a
deficiency; this is just the nature of the beast. However, we aren’t completely lost. Once we know how index size is related to disparity, we
can make some inferences about how similar a crawl priority may be to Google.

Unfortunately, we have to be careful in our conclusions. We only have a few data points with which to work, so it is very difficult to be
certain regarding this part of the analysis. In particular, it seems strange that Majestic would get better relative to its index size as it grows,
unless Google holds on to old data (which might be an important discovery in and of itself). It is most likely that at this point we can’t make
this level of conclusion.

So what do we do?

Let’s say you have a list of domains or URLs for which you would like to know their relative values. Your process might look something like
this…

  • Check Open Site Explorer to see if all URLs are in their index. If so, you are looking at the metrics most likely to be proportional to Google’s link graph.
  • If any of the links do not occur in the index, move to Ahrefs and use their Ahrefs ranking if all you need is a single PageRank-like metric.
  • If any of the links are missing from Ahrefs’s index, or you need something related to trust, move on to Majestic Fresh.
  • Finally, use Majestic Historic for (by leaps and bounds) the largest coverage available.
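
The fallback order above can be sketched as a simple lookup. This is only an illustration: the URL sets are placeholders (in practice each check would be an export or API lookup against the provider), and `pick_index` is a hypothetical helper name.

```python
def pick_index(urls, indexes):
    """Return the name of the first index that contains every URL, else None."""
    for name, known_urls in indexes:
        if all(url in known_urls for url in urls):
            return name
    return None

# Ordered from most proportional to broadest, following the list above.
# The URL sets are placeholders standing in for real index lookups.
indexes = [
    ("Open Site Explorer", {"a.com/", "b.com/"}),
    ("Ahrefs", {"a.com/", "b.com/", "c.com/"}),
    ("Majestic Fresh", {"a.com/", "b.com/", "c.com/", "d.com/"}),
    ("Majestic Historic", {"a.com/", "b.com/", "c.com/", "d.com/", "e.com/"}),
]

choice = pick_index(["a.com/", "c.com/"], indexes)
```

Here `choice` is the first index in the list that covers both URLs. A result of `None` would mean that no single index covers the whole list, in which case the metrics would have to come from more than one source.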

It is important to point out that the likelihood that all the URLs you want to check are in a single index increases as the accuracy of the metric decreases. Considering the size of Majestic’s data, you can’t ignore them because you are less likely to get null value answers from their data than the others. If anything rings true, it is that once again it makes sense to get data from as many sources as possible. You won’t get the most proportional data without Moz, the broadest data without Majestic, or everything in-between without Ahrefs.

What about SEMrush? They are making progress, but they don’t publish any relative statistics that would be useful in this particular
case. Maybe we can hope to see more from them soon given their already promising index!

Recommendations for the link graphing industry

All we hear about these days is big data; we almost never hear about good data. I know that the teams at Moz, Majestic, Ahrefs, SEMrush and others are interested in mimicking Google, but I would love to see some organization stand up against the allure of more data in favor of better data: data more like Google’s. It could begin with testing various crawl strategies to see if they produce a result more similar to that of data shared in Google Search Console. Having the most Google-like data is certainly a crown worth winning.

Credits

Thanks to Diana Carter at Angular for assistance with data acquisition and to Andrew Cron for help with statistical analysis. Thanks also to the representatives from Moz, Majestic, Ahrefs, and SEMrush for answering questions about their indices.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Reblogged 4 years ago from tracking.feedpress.it

How to Rid Your Website of Six Common Google Analytics Headaches

Posted by amandaecking

I’ve been in and out of Google Analytics (GA) for the past five or so years agency-side. I’ve seen three different code libraries, dozens of new features and reports roll out, IP addresses stop being reported, and keywords not-so-subtly phased out of the free platform.

Analytics has been a focus of mine for the past year or so—mainly, making sure clients get their data right. Right now, our new focus is closed loop tracking, but that’s a topic for another day. If you’re using Google Analytics, and only Google Analytics for the majority of your website stats, or it’s your primary vehicle for analysis, you need to make sure it’s accurate.

Not having data pulling in or reporting properly is like building a house on a shaky foundation: It doesn’t end well. Usually there are tears.

For some reason, a lot of people, including many of my clients, assume everything is tracking properly in Google Analytics… because Google. But it’s not Google who sets up your analytics. People do that. And people are prone to make mistakes.

I’m going to go through six scenarios where issues are commonly encountered with Google Analytics.

I’ll outline the remedy for each issue, and in the process, show you how to move forward with a diagnosis or resolution.

1. Self-referrals

This is probably one of the areas we’re all familiar with. If you’re seeing a lot of traffic from your own domain, there’s likely a problem somewhere—or you need to extend the default session length in Google Analytics. (For example, if you have a lot of long videos or music clips and don’t use event tracking; a website like TEDx or SoundCloud would be a good equivalent.)

Typically one of the first things I’ll do to help diagnose the problem is include an advanced filter to show the full referrer string. You do this by creating a filter, as shown below:

Filter Type: Custom filter > Advanced
Field A: Hostname
Extract A: (.*)
Field B: Request URI
Extract B: (.*)
Output To: Request URI
Constructor: $A1$B1

You’ll then start seeing the subdomains pulling in. Experience has shown me that if you have a separate subdomain hosted in another location (say, if you work with a separate company and they host and run your mobile site or your shopping cart), it gets treated by Google Analytics as a separate domain. Thus, you’ll need to implement cross domain tracking. This way, you can narrow down whether or not it’s one particular subdomain that’s creating the self-referrals.
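
To see what the advanced filter described above does to each hit, here is a small sketch that mimics it outside of Google Analytics. The hostnames and paths are made up, and `apply_full_url_filter` is a hypothetical helper, not part of any GA library.

```python
import re

def apply_full_url_filter(hostname, request_uri):
    """Mimic the advanced filter: capture both fields and join them."""
    a = re.match(r"(.*)", hostname)      # Field A / Extract A
    b = re.match(r"(.*)", request_uri)   # Field B / Extract B
    return a.group(1) + b.group(1)       # Constructor: $A1$B1

# Two hits on the same path, served from different subdomains.
hits = [
    ("www.example.com", "/rooms"),
    ("booking.example.com", "/rooms"),
]
rewritten = [apply_full_url_filter(host, uri) for host, uri in hits]
```

Without the filter both hits would show up as /rooms. With it, the Request URI carries the hostname, so the two subdomains can be told apart in the reports.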

In this example below, we can see all the revenue is being reported to the booking engine (which ended up being cross domain issues) and their own site is the fourth largest traffic source:

It’s also a good idea to check the browser and device reports to start narrowing down whether the issue is specific to a particular element. If it’s not, keep digging. Look at pages pulling the self-referrals and go through the code with a fine-tooth comb, drilling down as much as you can.

2. Unusually low bounce rate

If you have a crazy-low bounce rate, it could be too good to be true. Unfortunately. An unusually low bounce rate could (and probably does) mean that at least some pages of your website have the same Google Analytics tracking code installed twice.

Take a look at your source code, or use Google Tag Assistant (though it does have known bugs) to see if you’ve got GA tracking code installed twice.

While I tell clients having Google Analytics installed on the same page can lead to double the pageviews, I’ve not actually encountered that—I usually just say it to scare them into removing the duplicate implementation more quickly. Don’t tell on me.
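
If you would rather check the source code with a script, a rough sketch might look like the following. It assumes Universal Analytics style property IDs (UA-XXXXX-Y) and simply counts how often each one appears in a page’s HTML. A property ID can legitimately appear more than once, so treat a hit as a prompt to look closer rather than proof of a duplicate install. The sample page and the helper name are made up.

```python
import re
from collections import Counter

# Universal Analytics property IDs look like UA-12345-1.
PROPERTY_ID = re.compile(r"\bUA-\d+-\d+\b")

def duplicated_property_ids(html):
    """Return property IDs that appear more than once in the page source."""
    counts = Counter(PROPERTY_ID.findall(html))
    return {pid: n for pid, n in counts.items() if n > 1}

page = """
<script>ga('create', 'UA-12345-1', 'auto'); ga('send', 'pageview');</script>
<p>content</p>
<script>ga('create', 'UA-12345-1', 'auto'); ga('send', 'pageview');</script>
"""
duplicates = duplicated_property_ids(page)
```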

3. Iframes anywhere

I’ve heard directly from Google engineers and Google Analytics evangelists that Google Analytics does not play well with iframes, and that it never will play nice with this dinosaur technology.

If you track the iframe, you inflate your pageviews, plus you still aren’t tracking everything with 100% clarity.

If you don’t track across iframes, you lose the source/medium attribution and everything becomes a self-referral.

Damned if you do; damned if you don’t.

My advice: Stop using iframes. They’re Netscape-era technology anyway, with rainbow marquees and Comic Sans on top. Interestingly, and unfortunately, a number of booking engines (for hotels) and third-party carts (for ecommerce) still use iframes.

If you have any clients in those verticals, or if you’re in the vertical yourself, check with your provider to see if they use iframes. Or you can check for yourself, by right-clicking as close as you can to the actual booking element:

iframe-booking.png

There is no neat and tidy way to address iframes with Google Analytics, and usually iframes are not the only complicated element of setup you’ll encounter. I spent eight months dealing with a website on a subfolder, which used iframes and had a cross domain booking system, and the best visibility I was able to get was about 80% on a good day.

Typically, I’d approach diagnosing iframes (if, for some reason, I had absolutely no access to viewing a website or talking to the techs) similarly to diagnosing self-referrals, as self-referrals are one of the biggest symptoms of iframe use.

4. Massive traffic jumps

Massive jumps in traffic don’t typically just happen. (Unless, maybe, you’re Geraldine.) There’s always an explanation—a new campaign launched, you just turned on paid ads for the first time, you’re using content amplification platforms, you’re getting a ton of referrals from that recent press in The New York Times. And if you think it just happened, it’s probably a technical glitch.

I’ve seen everything from inflated pageviews caused by tracking on iframes and unnecessary virtual pageviews, to people not realizing the tracking code was installed on other microsites for the same property. Oops.

Usually I’ve seen this happen when the tracking code was somewhere it shouldn’t be, so if you’re investigating a situation of this nature, first confirm the Google Analytics code is only in the places it needs to be. Tools like Google Tag Assistant and Screaming Frog can be your BFFs in helping you figure this out.

Also, I suggest bribing the IT department with sugar (or booze) to see if they’ve changed anything lately.

5. Cross-domain tracking

I wish cross-domain tracking with Google Analytics out of the box didn’t require any additional setup. But it does.

If you don’t have it set up properly, things break down quickly, and can be quite difficult to untangle.

The older the GA library you’re using, the harder it is. The easiest setup, by far, is Google Tag Manager with Universal Analytics. Hard-coded universal analytics is a bit more difficult because you have to implement autoLink manually and decorate forms, if you’re using them (and you probably are). Beyond that, rather than try and deal with it, I say update your Google Analytics code. Then we can talk.

Where I’ve seen the most murkiness with tracking is when parts of cross domain tracking are implemented, but not all. For some reason, if allowLinker isn’t included, or you forget to decorate all the forms, the cookies aren’t passed between domains.

The absolute first place I would start with this would be confirming the cookies are all passing properly at all the right points, forms, links, and smoke signals. I’ll usually use a combination of the Real Time report in Google Analytics, Google Tag Assistant, and GA debug to start testing this. Any debug tool you use will mean you’re playing in the console, so get friendly with it.

6. Internal use of UTM strings

I’ve saved the best for last. Internal use of campaign tagging. We may think, oh, I use Google to tag my campaigns externally, and we’ve got this new promotion on site which we’re using a banner ad for. That’s a campaign. Why don’t I tag it with a UTM string?

Step away from the keyboard now. Please.

When you tag internal links with UTM strings, you override the original source/medium. So that visitor who came in through your paid ad and then who clicks on the campaign banner has now been manually tagged. You lose the ability to track that they came through on the ad the moment they click on the tagged internal link. Their source and medium is now your internal campaign, not that paid ad you’re spending gobs of money on and have to justify to your manager. See the problem?

I’ve seen at least three pretty spectacular instances of this in the past year, and a number of smaller instances of it. Annie Cushing also talks about the evils of internal UTM tags and the odd prevalence of it. (Oh, and if you haven’t explored her blog, and the amazing spreadsheets she shares, please do.)

One clothing company I worked with tagged all of their homepage offers with UTM strings, which resulted in the loss of visibility for one-third of their audience: One million visits over the course of a year, and $2.1 million in lost revenue.

Let me say that again. One million visits, and $2.1 million. That couldn’t be attributed to an external source/campaign/spend.

Another client I audited included campaign tagging on nearly every navigational element on their website. It still gives me nightmares.

If you want to see if you have any internal UTM strings, head straight to the Campaigns report in Acquisition in Google Analytics, and look for anything like “home” or “navigation” or any language you may use internally to refer to your website structure.

And if you want to see how users are moving through your website, go to the Flow reports. Or if you really, really, really want to know how many people click on that sidebar link, use event tracking. But please, for the love of all things holy (and to keep us analytics lovers from throwing our computers across the room), stop using UTM tagging on your internal links.
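
Besides checking the Campaigns report, you could scan your own templates for the problem. The sketch below is one possible approach, not an official tool: it parses a page’s HTML and flags links that stay on the same hostname yet carry utm_ parameters. The URLs are placeholders, and links to other subdomains are treated as external here, which is a simplifying assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, parse_qs

class InternalUtmFinder(HTMLParser):
    """Collect links that stay on the site yet carry utm_* parameters."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.site_host = urlparse(page_url).hostname
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL before comparing hosts.
        target = urlparse(urljoin(self.page_url, href))
        has_utm = any(key.startswith("utm_") for key in parse_qs(target.query))
        if target.hostname == self.site_host and has_utm:
            self.flagged.append(href)

html = """
<a href="/sale?utm_source=homepage_banner">Sale</a>
<a href="/about">About</a>
<a href="https://partner.example.org/?utm_source=example">Partner</a>
"""
finder = InternalUtmFinder("https://www.example.com/")
finder.feed(html)
```

After the run, `finder.flagged` holds only the internal sale link. The tagged link to the partner site is left alone because campaign tagging on outbound links does not overwrite your own source/medium.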

Now breathe and smile

Odds are, your Google Analytics setup is fine. If you are seeing any of these issues, though, you have somewhere to start in diagnosing and addressing the data.

We’ve looked at six of the most common points of friction I’ve encountered with Google Analytics and how to start investigating them: self-referrals, bounce rate, iframes, traffic jumps, cross domain tracking and internal campaign tagging.

What common data integrity issues have you encountered with Google Analytics? What are your favorite tools to investigate?


Has Google Gone Too Far with the Bias Toward Its Own Content?

Posted by ajfried

Since the beginning of SEO time, practitioners have been trying to crack the Google algorithm. Every once in a while, the industry gets a glimpse into how the search giant works and we have an opportunity to deconstruct it. We don’t get many of these opportunities, but when we do (assuming we spot them in time) we try to take advantage of them so we can “fix the Internet.”

On Feb. 16, 2015, news started to circulate that NBC would start removing images and references of Brian Williams from its website.

This was it!

A golden opportunity.

This was our chance to learn more about the Knowledge Graph.

Expectation vs. reality

Often it’s difficult to predict what Google is truly going to do. We expect something to happen, but in reality it’s nothing like we imagined.

Expectation

What we expected to see was that Google would change the source of the image. Typically, if you hover over the image in the Knowledge Graph, it reveals the location of the image.

Keanu-Reeves-Image-Location.gif

This would mean that if the image disappeared from its original source, then the image displayed in the Knowledge Graph would likely change or even disappear entirely.

Reality (February 2015)

The only problem was, there was no official source (this changed, as you will soon see) and identifying where the image was coming from proved extremely challenging. In fact, when you clicked on the image, it took you to an image search result that didn’t even include the image.

Could it be? Had Google started its own database of owned or licensed images and was giving it priority over any other sources?

In order to find the source, we tried taking the image from the Knowledge Graph and “search by image” in images.google.com to find others like it. For the NBC Nightly News image, Google failed to even locate a match to the image it was actually using anywhere on the Internet. For other television programs, it was successful. Here is an example of what happened for Morning Joe:

Morning_Joe_image_search.png

So we found the potential source. In fact, we found three potential sources. Seemed kind of strange, but this seemed to be the discovery we were looking for.

This looks like Google is using someone else’s content and not referencing it. These images have a source, but Google is choosing not to show it.

Then Google pulled the ol’ switcheroo.

New reality (March 2015)

Now things changed and Google decided to put a source to their images. Unfortunately, I mistakenly assumed that hovering over an image showed the same thing as the file path at the bottom, but I was wrong. The URL you see when you hover over an image in the Knowledge Graph is actually nothing more than the title. The source is different.

Morning_Joe_Source.png

Luckily, I still had two screenshots saved on my desktop from when I first saw this. Success. One screen capture was from NBC Nightly News, and the other was from the news show Morning Joe (see above), showing that the source had changed.

NBC-nightly-news-crop.png

(NBC Nightly News screenshot.)

The source is a Google-owned property: gstatic.com. You can clearly see the difference in the source change. What started as a hypothesis is now a fact. Google is certainly creating a database of images.

If this is the direction Google is moving, then it is creating all kinds of potential risks for brands and individuals. The implication is a loss of control for any brand that is looking to optimize its Knowledge Graph results. It also seems to pose a conflict of interest for Google, whose mission is to organize the world’s information, not license and prioritize it.

How do we think Google is supposed to work?

Google is an information-retrieval system tasked with sourcing information from across the web and supplying the most relevant results to users’ searches. In recent months, the search giant has taken a more direct approach by answering questions and assumed questions in the Answer Box, some of which come from un-credited sources. Google has clearly demonstrated that it is building a knowledge base of facts that it uses as the basis for its Answer Boxes. When it sources information from that knowledge base, it doesn’t necessarily reference or credit any source.

However, I would argue there is a difference between an un-credited Answer Box and an un-credited image. An un-credited Answer Box provides a fact that is indisputable, part of the public domain, and unlikely to change (e.g., what year was Abraham Lincoln shot? How long is the George Washington Bridge?). Answer Boxes that offer more than just a basic fact (an opinion, instructions, etc.) always credit their sources.

There are four possibilities when it comes to Google referencing content:

  • Option 1: It credits the content because someone else owns the rights to it
  • Option 2: It doesn’t credit the content because it’s part of the public domain, as seen in some Answer Box results
  • Option 3: It doesn’t reference it because it owns or has licensed the content. If you search for “Chicken Pox” or other diseases, Google appears to be using images from licensed medical illustrators. The same goes for song lyrics, which Eric Enge discusses here: Google providing credit for content. This adds to the speculation that Google is giving preference to its own content by displaying it over everything else.
  • Option 4: It doesn’t credit the content, but neither does it necessarily own the rights to the content. This is a very gray area, and is where Google seemed to be back in February. If this were the case, it would imply that Google is “stealing” content—which I find hard to believe, but felt was necessary to include in this post for the sake of completeness.

Is this an isolated incident?

At Five Blocks, whenever we see these anomalies in search results, we try to compare the term in question against others like it. This is a categorization concept we use to bucket individuals or companies into similar groups. When we do this, we uncover some incredible trends that help us determine what a search result “should” look like for a given group. For example, when looking at searches for a group of people or companies in an industry, this grouping gives us a sense of how much social media presence the group has on average or how much media coverage it typically gets.

Upon further investigation of terms similar to NBC Nightly News (other news shows), we noticed the un-credited image scenario appeared to be a trend in February, but now all of the images are being hosted on gstatic.com. When we broadened the categories further to TV shows and movies, the trend persisted. Rather than show an image in the Knowledge Graph and reference the actual source, Google tends to show an image and reference its own database of stored images as the source.

And just to ensure this wasn’t a case of tunnel vision, we researched other categories, including sports teams, actors and video games, in addition to spot-checking other genres.

Unlike terms for specific TV shows and movies, terms in each of these other groups all link to the actual source in the Knowledge Graph.
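For anyone who wants to repeat this kind of category comparison, here is a rough sketch of the bucketing idea. It assumes you have recorded, by hand or otherwise, the image source URL shown in the Knowledge Graph for each search term. The function names and the sample rows are invented placeholders, not the data we collected for this post.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def is_google_hosted(image_url):
    """True when the image source host is gstatic.com or one of its subdomains."""
    host = (urlsplit(image_url).hostname or "").lower()
    return host == "gstatic.com" or host.endswith(".gstatic.com")


def google_hosted_share(observations):
    """Map each category to the fraction of its images sourced from gstatic.com."""
    totals = defaultdict(int)
    hosted = defaultdict(int)
    for category, image_url in observations:
        totals[category] += 1
        if is_google_hosted(image_url):
            hosted[category] += 1
    return {category: hosted[category] / totals[category] for category in totals}


# Placeholder observations: (category, image source URL). Not real measurements.
observations = [
    ("news shows", "https://encrypted-tbn0.gstatic.com/images?q=placeholder-1"),
    ("news shows", "https://encrypted-tbn1.gstatic.com/images?q=placeholder-2"),
    ("sports teams", "https://www.example.com/team-logo.png"),
    ("sports teams", "https://www.example.org/crest.png"),
]

for category, share in google_hosted_share(observations).items():
    print(f"{category}: {share:.0%} of images sourced from gstatic.com")
```

A share near 100% for one bucket and near 0% for another is the kind of split that suggests a category-level pattern rather than a one-off.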

Immediate implications

It’s easy to ignore this and say “Well, it’s Google. They are always doing something.” However, there are some serious implications to these actions:

  1. The TV shows/movies aren’t receiving their due credit because, from within the Knowledge Graph, there is no actual reference to the show’s official site
  2. The more Google moves toward licensing and then retrieving their own information, the more biased they become, preferring their own content over the equivalent—or possibly even superior—content from another source
  3. It feels wrong and misleading to get a Google Image Search result rather than an actual site because:
    • The search doesn’t include the original image
    • Considering how poor Image Search results are normally, it feels like a poor experience
  4. If Google is moving toward licensing as much content as possible, then it could make the Knowledge Graph infinitely more complicated when there is a “mistake” or something unflattering. How could one go about changing what Google shows about them?

Google is objectively becoming subjective

It is clear that Google is attempting to create databases of information, including lyrics stored in Google Play, photos, and, previously, facts in Freebase (which is now Wikidata and not owned by Google).

I am not normally one to point my finger and accuse Google of wrongdoing. But this really strikes me as an odd move, one bordering on a clear bias to direct users to stay within the search engine. The fact is, we trust Google with a heck of a lot of information with our searches. In return, I believe we should expect Google to return an array of relevant information for searchers to decide what they like best. The example cited above seems harmless, but what about determining which is the right religion? Or even who the prettiest girl in the world is?

Religion-and-beauty-queries.png

Questions such as these, which Google is returning credited answers for, could return results that are perceived as facts.

Should we next expect Google to decide who is objectively the best service provider (e.g., pizza chain, painter, or accountant), then feature them in an un-credited Answer Box? Given the direction Google is moving right now, it feels like we should be calling its objectivity into question.

But that’s only my (subjective) opinion.


Reblogged 4 years ago from tracking.feedpress.it

Hire SEO expert (Covetus.com)

Covetus.com is a leading SEO and internet Marketing company in Dallas. Dallas best seo provider is another name of Covetus.com. Covetus.com is name where peo…

Reblogged 4 years ago from www.youtube.com