SEO Blog - Internet marketing news and views  

Google SEO Report Card

Written by David Harry   
Monday, 19 April 2010 13:09

How a search engine does self analysis

A while back Google did a self analysis from an SEO perspective and Matt (Cutts) even did a session on it at a recent SMX (video at the end). I thought it might be an interesting exercise to actually go through it, point by point, to see exactly what they were considering important elements. It bears noting that they obviously aren't giving away the farm here, nor is this a Google how-to for SEO.

At the end of the day, much of this is second nature to search geeks... but there were a few interesting tidbits. It does make for some interesting reading and I felt it was worth noting... Enjoy!


Search Result Presentation

Title tag format and length

With this one, a known area of importance to SEO, they mention it being valuable to users (of course) as well as search engines. From a Geek point of view, I haven’t seen a lot of papers/patents on using and weighting the html TITLE element. At least not specifically.

The eventual TITLE format they advised has (approx) 60 chars as “search engines may give less weight to words after a certain point” and is formatted;

<TITLE> Product Name: Product keywords</TITLE>

This was preferred over the product title alone, and over one with a non-descriptive tag line appended as well. They also stated that it should contain “information like what the product does, who it targets, or what its main features are.”

Take Away;

  1. Keep TITLE under 60 chars
  2. Speak to the Focus of the page
  3. Use Product/Service name where applicable
  4. Say what it does, who it targets or main features
  5. Value the real estate
  6. Do your KW research
  7. Use colon separators?
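
For those who like to automate this sort of thing, here is a minimal Python sketch of checking a TITLE against the points above. The function name `check_title` and the regular expression are my own assumptions for illustration, not anything from the Google document. The 60 character limit is the approximate figure they quote, and "Acme Widget" is a made-up product.

```python
import re

MAX_TITLE_CHARS = 60  # approximate guideline quoted in the report card


def check_title(title, max_chars=MAX_TITLE_CHARS):
    """Return a list of human-readable problems found in a TITLE string."""
    problems = []
    text = " ".join(title.split())  # collapse stray whitespace
    if not text:
        return ["title is empty"]
    if len(text) > max_chars:
        problems.append(f"title is {len(text)} chars, over the {max_chars} guideline")
    # Expect "Product Name: Product keywords" (a name, a colon, a descriptive part).
    if not re.match(r"^[^:]+:\s*\S.*$", text):
        problems.append("title does not follow 'Product Name: Product keywords'")
    return problems


if __name__ == "__main__":
    for candidate in ("Acme Widget: durable widgets for home workshops", "Acme Widget"):
        print(candidate, "->", check_title(candidate) or "looks fine")
```

It only checks length and shape. Whether the words after the colon are the right ones is still down to your KW research.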


Description meta tag use

Next up is the meta-d. Conventional wisdom over the last while says that they are more a factor in click-through rate than a ranking signal. Even in the document they state; “description meta tags don't count in Google's ranking”. Good enough for me as it’s a noisy signal that is too easily abused.

In their example they do have the target term prominent in the meta description, but I am not personally putting too much stock in that.

What is likely more important in modern SEO, with Google at least, is query analysis in a tighter personalized setting: enticing meta-descriptions could help generate the above-threshold interactions and click data that such a system might leverage. Much like content, it needs to be a good balance between engines and users.

They also note that the meta-d is “sometimes shown in the snippet”. This is important as I have many times, foolin’ about on blogs, left them out to see what Google will do. Something I shall do with this post just to see what Google uses for the snippet.

Oh and there was an interesting point about maxing out 2 lines of the snippet to keep competitors at bay. Nice one…. Here;

“By not having a description meta tag that fills both lines of the snippet, you're giving the results underneath you a slight advantage because they're now higher in the user's line of sight.”

There was also a note on diagnosing empty snippets. That one really should go without saying. If there is NO bloody snippet, it is time to start sorting out why. Don’t forget to check the robots.txt.

Take Away;

  1. Aren’t a ranking signal
  2. Continue theme
  3. Consider CTR (entice)
  4. Remember NOODP as needed
  5. Not always used as snippet
  6. Less content, more likely to be used
  7. Light snippet raises competitor.
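
If you want to spot pages with a missing or light meta description before Google does, a rough sketch like the following can help. It uses only Python's standard `html.parser`. The function name `audit_description` and the 70 character minimum are assumptions of mine, an arbitrary stand-in for "fills both lines of the snippet" and not a figure from the document.

```python
from html.parser import HTMLParser


class MetaDescriptionFinder(HTMLParser):
    """Collect the content of every <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.descriptions = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "description":
            self.descriptions.append((attr.get("content") or "").strip())


def audit_description(html, min_chars=70):
    """Classify a page as 'missing', 'duplicate', 'short' or 'ok'."""
    finder = MetaDescriptionFinder()
    finder.feed(html)
    if not finder.descriptions:
        return "missing"
    if len(finder.descriptions) > 1:
        return "duplicate"
    if len(finder.descriptions[0]) < min_chars:
        return "short"  # min_chars is an arbitrary, assumed threshold
    return "ok"
```

Tune `min_chars` to whatever you see actually filling two snippet lines for your own results.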


Google site-link triggering

Ok, so they next started getting into the site links element. This is said to give users a way to find information easier. Uh huh. Seems I’ve heard this before. Huh, strange. From what I know, named entities (brands, people, places) generally tend to be the ones that get site links.

While the black box won’t tell us how, they hinted that we could “optimize (our) site's organization and internal linking” for greater chances of achieving them.

One of my mates believes there is an element of click data from the SERPs at play. This came to mind when I read that one of the benefits is to “help users find the content they want faster”. I’m not going there quite yet.

Take Away;

  1. Webmasters can’t choose how/when
  2. Use a hierarchical site structure
  3. Use descriptive anchor text for links pointing to internal pages
  4. Avoid deep nesting of content behind many subdirectories
  5. Obviously improves search visibility


Appealing Google sitelinks

After looking at the triggers, they moved on to management. You can go to Google Webmaster Tools, should you dare, and block unappealing or unwanted site links. Going back to my mate’s curiosity of activity signals, they mention in this section;

“Occasionally, Google might choose some sitelinks that lead to popular and relevant content on your site”

Damn. That tin foil hat might actually work. Now, to keep some perspective, ‘popular’ could also mean links, social signals… and more. The main point is there is some control over these via GWT. Always handy to know.

Take Away;

  1. Nuke unwanted sitelinks from GWT
  2. Tin foil hats may work as advertised (ha!)


Clear main page result on Google for entity/product/service

All this was about was having a simple, canonical result for a given query. One doesn’t want multiple canonical entry points in the SERP that may confuse users. They advise using 301 redirects or the canonical tag for these situations.

Many people ask me about handling these situations via the NOINDEX route, which can be problematic. Dynamic re-writes, 301s or, when possible, the canonical tag are the best way to go.

And just as a reminder, in some cases these canonical issues, ones relating to duplicate content, can also end up diluting the over-all link equity of a given page.

Take Away;

  1. Have logical paths
  2. Determine canonical page
  3. Consolidate URLs
  4. Potential link equity dilution
  5. Don’t block 301 via Robots.txt
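
To check what a page actually declares as its canonical, something along these lines works as a starting point. It is a sketch using Python's standard `html.parser`. The names `CanonicalFinder` and `declared_canonical` are hypothetical, and treating conflicting declarations as "no canonical" is my assumption, not documented Google behaviour.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of each <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def declared_canonical(html):
    """Return the single declared canonical URL, or None if absent or conflicting."""
    finder = CanonicalFinder()
    finder.feed(html)
    unique = set(finder.canonicals)
    return unique.pop() if len(unique) == 1 else None
```

Run it over every URL variant of a page. If they don't all report the same canonical, you have found your multiple entry points.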


URLs and Redirects


Since we’re following the trail here, the next stop makes sense. Right away they get into how malformed URLs can cause some canonical issues;

“ is considered a different URL than by search engines”

So, right away we have to consider trailing slash issues. As with all canonical issues, one of the more potentially troubling areas is that internal/external link equity may be diluted. Then we can also consider users that type in the wrong url. It is something worth considering.

As discussed above, the preferred methods are 301 or the canonical tag. Since we do know there is some minor loss with 301, when possible, I’d look at the canonical route. Of course, I haven’t tested if there is any loss there, thus can’t be definitive.

  1. Choose the easiest to remember form of the URL as the canonical (likely
  2. Be consistent with this canonical form across all products
  3. Think of the most common URL forms visitors may try and 301 redirect these to the preferred/canonical URL or use the rel="canonical" link element if you cannot redirect
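
The third point can be sketched as a small lookup: given a requested path, work out which canonical path (if any) it should 301 to. This is an illustration only. The function name `redirect_target`, the set of variants tried (trailing slash, index file, letter case) and the example paths are my assumptions, and in practice this logic usually lives in server config and not in application code.

```python
def redirect_target(path, canonical_paths):
    """Return the canonical path to 301 to, or None if no redirect applies.

    None is returned both when `path` is already canonical and when no
    known canonical form matches (the caller would serve a 404 then).
    """
    if path in canonical_paths:
        return None
    candidates = []
    # Variant 1: trailing slash removed or added.
    candidates.append(path.rstrip("/") or "/")
    candidates.append(path if path.endswith("/") else path + "/")
    # Variant 2: explicit index file on a directory URL.
    for index in ("index.html", "index.htm"):
        if path.endswith("/" + index):
            candidates.append(path[: -len(index)])
    # Variant 3: the same forms with wrong letter case corrected.
    candidates.extend(c.lower() for c in list(candidates))
    for candidate in candidates:
        if candidate in canonical_paths:
            return candidate
    return None


if __name__ == "__main__":
    canonical = {"/products/"}  # hypothetical canonical form
    for requested in ("/products", "/products/index.html", "/Products", "/nope"):
        print(requested, "->", redirect_target(requested, canonical))
```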


URL Format

In this section they start looking at the various issues they had with canonical elements, or in this case, the lack thereof. Of most interest is that they right away went back to the trailing slash issue. They were throwing up not only duplicates, but also some 404s where the server wasn’t handling it. They dealt with this via re-directs.

I find it curious there is still no mention of the fact that 301s lose some juice. Is this merely the easiest route? Or does the canonical also lose some value when being passed? Can’t say.

They sort of also touch on the ‘index.html’ issues, so it also bears mention here. Most SEOs are familiar with this one and its presence keeps it in the toolbox. That being said, there was no mention of the ‘www’ and ‘non-www’.

The only other noteworthy bits were on watching for canonical issues with https and http sections of a website. Once more, it can be seen as duplication and dilute the equity/reputation.

Take Away;

  1. Watch for trailing slash
  2. Keep an eye on 404s
  3. 301 or canonical tags suggested
  4. Consolidate reputation (PR and Authority?)
  5. 302 holds reputation on original page
  6. Watch for https and http
  7. Try and keep internal/external links consistent
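
One way to hunt for these duplicates in a crawl or a log export is to reduce every URL to a comparison key and see which ones collide. The sketch below uses Python's `urllib.parse`. The names `normalize` and `find_duplicates` are my own, and the choices it makes (ignore scheme, trailing slash, index files, query strings) are assumptions to adjust for your site. It does not touch the www/non-www question.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to a comparison key that ignores common duplicate forms."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"
    # Treat /dir/index.html and /dir/ as the same document.
    for index in ("index.html", "index.htm"):
        if path.endswith("/" + index):
            path = path[: -len(index)]
    # Ignore the trailing slash, except on the root path.
    path = path.rstrip("/") or "/"
    # Scheme is dropped on purpose so http and https variants collide.
    # Query strings and fragments are ignored in this sketch.
    return host + path


def find_duplicates(urls):
    """Map each comparison key to the URL variants sharing it (two or more)."""
    groups = defaultdict(set)
    for url in urls:
        groups[normalize(url)].add(url)
    return {key: sorted(v) for key, v in groups.items() if len(v) > 1}
```

Any key that comes back with more than one variant is a candidate for a 301 or a canonical tag.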

On Page Elements

And so what do they consider to be on-page elements? Well, nothing earth shattering here…

Content keywords – Heading tags – Internal links (anchor text) – Image alt – Image file names.

As one would expect, this is all about helping search engines ‘better understand’ the page, which is also great for users. There is no direct mention of ranking signals. Not at all surprising there.

Heading Tags

On heading tags they discuss it being a good element to set out the document structure. This is somewhat interesting as there are far reaching implications for this. I have linked up a few posts on page segmentation at the end of this piece.

The main point conveyed was that H1 tags tend to have more value and are used to further develop the theme/concept of the page.

Take Away;

  1. Heading tags for document structure
  2. Heading can segment topics/concepts
  3. Don’t over-do terms in headers
  4. H1 for main focus (h2-3 for segmenting)
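
To see the document structure the way a parser might, you can pull the heading outline out of a page and look for oddities. This is a rough sketch with Python's standard `html.parser`. The names are hypothetical, and the two rules it applies (exactly one H1, no skipped levels) are my own reading of "H1 for main focus", not rules stated in the Google document.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every heading in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            text = " ".join("".join(self._buffer).split())
            self.outline.append((self._level, text))
            self._level = None


def heading_problems(html):
    """Return a list of structural problems found in the heading outline."""
    parser = HeadingOutline()
    parser.feed(html)
    levels = [level for level, _ in parser.outline]
    problems = []
    if levels.count(1) != 1:  # assumed rule: one H1 per page
        problems.append(f"expected exactly one h1, found {levels.count(1)}")
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:  # assumed rule: do not jump levels downward
            problems.append(f"h{prev} is followed by h{cur}, skipping a level")
    return problems
```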


Moving along, they next look at images. Of note right away is to ensure that the logo (in most header areas) is linked to the canonical version of the home page (or main product page) of the site in question. Once more, this is generally a no-brainer, but worth at least mentioning here.

This goes back to the ‘index.html’ type issues. Once more, there is no mention of the www to non-www.

After that it’s on to the image alternative text, which should be “brief and descriptive”. If the image shows an entity (person, product, place), then we want to ensure that entity is named in the alt. They also note that when the image is a link, the alt text is essentially treated as the anchor text of the link.

  1. Ensure linked images point to canonical
  2. Use descriptive text (containing entity)
  3. Alt text is treated as anchor text when linked
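
A quick way to find images that fall short of this is to scan the markup for missing alt text, noting which ones sit inside a link (where the alt is doing double duty as anchor text). The sketch below is an illustration using Python's standard `html.parser`, with hypothetical names. Note that an empty alt can be perfectly legitimate for purely decorative images, so treat the output as a list to review, not a list of errors.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Flag images without alt text, noting whether each sits inside a link."""

    def __init__(self):
        super().__init__()
        self.findings = []  # (src, issue) tuples
        self._link_depth = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a":
            self._link_depth += 1
        elif tag == "img":
            alt = (attr.get("alt") or "").strip()
            if not alt:
                where = "linked image" if self._link_depth else "image"
                self.findings.append((attr.get("src") or "", f"{where} has no alt text"))

    def handle_endtag(self, tag):
        if tag == "a" and self._link_depth:
            self._link_depth -= 1


def audit_images(html):
    """Return (src, issue) tuples for every image lacking alt text."""
    parser = ImageAudit()
    parser.feed(html)
    return parser.findings
```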


Descriptive internal anchor text

This one surely must be one we all should know. Or at least, the power of anchor texts in general, over the last 5 years. If there is one thing that’s certain, if links are important, the anchor text is also paramount. This applies to external links and internal ones.

And since Google wouldn’t advise us to get targeted external link texts (manipulation), we can at least take to heart some of the advice geared towards internal anchor text best practices.

As with many things along this ride, we’re still theme building. The anchor text should be a consistent representation of what is on the destination page.

  1. Use descriptive anchors
  2. Avoid ‘click here’ , ‘learn more’ anchors
  3. Don’t link an entire sentence, a phrase works better
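
These three points are easy enough to check mechanically. Here is a minimal sketch, again with Python's standard `html.parser`. The generic phrases are the two named above, while the 8 word cut-off for "an entire sentence" and all of the names are assumptions I picked for illustration.

```python
from html.parser import HTMLParser

GENERIC_ANCHORS = {"click here", "learn more"}  # extend as needed
MAX_ANCHOR_WORDS = 8  # arbitrary, assumed cut-off for "an entire sentence"


class AnchorAudit(HTMLParser):
    """Collect (href, text) for every link in the page."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._buffer = []

    def handle_data(self, data):
        if self._href is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._buffer).split())
            self.anchors.append((self._href, text))
            self._href = None


def weak_anchors(html):
    """Return (href, text, reason) for anchors that are generic or too long."""
    parser = AnchorAudit()
    parser.feed(html)
    flagged = []
    for href, text in parser.anchors:
        if text.lower().rstrip(".!") in GENERIC_ANCHORS:
            flagged.append((href, text, "generic"))
        elif len(text.split()) > MAX_ANCHOR_WORDS:
            flagged.append((href, text, "too long"))
    return flagged
```

It won't tell you whether an anchor matches the theme of the destination page. That part still needs a human.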

Did the earth move?

As we had imagined, or would have expected, there are no real ‘Ah Ha!’ moments here. But it does make a good refresher course. It also speaks to those that often ignore some of the benefits of being anal about the on-site and technical elements of an SEO program. For me, this just scratches the surface of what I look at with site audits, but it's interesting nonetheless.

We also had hints in here of page segmentation, confirmations on things like meta-d and even a few open questions as far as 301s and canonicals go. Next time? They should let ME do an audit...hehe... wussies. Anyway, thanks for riding along, I hope ye enjoyed it.

Here's the SMX video Matt did related to all of this;


Now…some related reading;



Title and Meta Description;


Creating themes;


Site Links & Anchors;


301 and Canonical tag


Named Entities


Ranking factors


Google Stuff;
