The other day I was thinking about what signals we consider when trying to understand search engines and how they rank pages. More than a few folks have taken a stab at the elusive 200-plus ranking factors over the last few years (not to mention local rankings); why not give it a go? For me, it is likely a utopian fantasy, a journey into supposition, more than it is any real insight... but let's have a go at it anyway.
Now let me start by saying I am not a fan, as many of you know, of making definitive statements about search algos as... well... we just don't know. We're all used to the catch phrases Google and Algorithm, but let's not go there. The main reason is that there are likely a multitude of algorithms that serve different purposes, so this isn't about any single algo. Furthermore, I am not talking specifics such as Google per se, but more of an amalgam of the Big 3 and various approaches that may be employed by search engines in 2009 and beyond. OK? With me?
Let's give it a go; we're here to have some fun, right?
Factors affecting search engine rankings
In truth, we could call this Ranking and re-ranking factors, because there is a great deal of re-ranking going on depending on which filters/methods are applied. There are some cross-over areas.
This list is certainly nowhere near the 200+ ranking factors that search engines such as Google always talk about, because we're generalizing. Each of the points we're going to look at is made of filters and elements that could each be expanded. To be honest, we'd likely come up with well over 200 if we broke them down.
And so, without further ado: my search ranking factors for 2009
Link related factors
The granddaddy of all modern indexing and retrieval approaches, links are still one of the more dominant signals out there. The interesting part is that a straight PageRank approach is not the only show in town. From historical ranking factors to phrase based or even Personalized PageRank, there are a few ways to valuate links in the ranking process.
- PageRank (or relative nodal link valuation)
- Link text (internal and external)
- Link relevance (global and page)
- Also see Temporal, Personalized PageRank and Phrase factors.
Thoughts; when it comes to links (internal, external or inbound) we want to build upon the theme of our core target terms. Understanding the core concepts of nodal graph link systems (such as PageRank) is mandatory to performing SEO; learn it well. Also, never forget that themes can be built with internal link texts; internal link structures are key.
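For those who want to see the nodal idea in motion, here is a minimal sketch of the textbook PageRank power iteration. It is not how any engine actually scores links today; the four-page site, the function name and the 0.85 damping value are assumptions chosen purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to a score; scores sum to 1.
    """
    # Collect every page, including ones that only appear as link targets.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical internal link structure for a tiny site.
site = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "about"],
    "orphan": ["home"],  # links out, but nothing links to it
}
ranks = pagerank(site)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page that everything points at ends up on top and the page nobody links to ends up at the bottom, which is the whole point of getting your internal link structure right.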
These are the elements related to the tags in your page header code. While the Big 3 all use them, the degree and weighting varies.
- Page TITLE tag
- Meta-description tag
While page TITLEs can still be seen as carrying some reasonable weight, the description tag may be more a function of CTR and the world of behavioural metrics; its actual ranking value is suspect in modern search. As far as the meta-keyword tag? I ain't even going there.
These are also some of the more time tested signals in the IR world. All of the major engines have papers relating to historical factors, and a Googler once coined the term Query Deserves Freshness to describe how they try to establish which types of content are more affected by the passage of time. There are also concepts relating to link velocity, and links in general can be analyzed through the hourglass of temporal factors.
- Document inception/age data
- Link velocity
- Link age
- Viral/Current news (QDF)
- Time of year (niche trends)
- Content update rate
Thoughts; never discount the value of these. You should monitor competitors and top ranking pages to establish the mean temporal averages of links, content changes and seasonal anomalies. You should also be sure to have content updates planned for important pages and capitalize on breaking news/events. Here are some interesting temporal ranking patents and some tips on link velocity.
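If you want to put a number on link velocity while monitoring a page, one rough way is to count newly discovered inbound links per month and take the mean. The sketch below assumes you already have a list of dates on which each link was first seen (from whatever backlink tool you use); the function names and sample dates are made up for illustration.

```python
from collections import Counter
from datetime import date


def monthly_link_velocity(first_seen_dates):
    """Count new inbound links per calendar month.

    Months with no new links between the first and last month are
    included with a count of zero, so quiet periods are not hidden.
    """
    if not first_seen_dates:
        return {}
    counts = Counter((d.year, d.month) for d in first_seen_dates)
    year, month = min(counts)
    last = max(counts)
    velocity = {}
    while (year, month) <= last:
        velocity[(year, month)] = counts.get((year, month), 0)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return velocity


def mean_velocity(velocity):
    """Average number of new links per month over the observed span."""
    if not velocity:
        return 0.0
    return sum(velocity.values()) / len(velocity)


# Hypothetical discovery dates for links pointing at one page.
discovered = [
    date(2008, 11, 3), date(2008, 11, 20),
    date(2009, 1, 5), date(2009, 1, 15), date(2009, 1, 28),
    date(2009, 2, 10),
]
velocity = monthly_link_velocity(discovered)
average = mean_velocity(velocity)
for (year, month), new_links in velocity.items():
    print(f"{year}-{month:02d}: {new_links} new links")
print(f"mean: {average} links per month")
```

Run the same numbers for the top ranking pages in your niche and you have a baseline to compare against; a month far above the mean is the kind of spike worth a second look.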
Trust related factors
- Domain history
- Inbound links (global)
- Outbound links
- Named entities (products, brand, author)
- Contact information (also important for geographic signals)
Thoughts; you should ALWAYS be careful whom you link to, and watch any other outbound links on your site (such as blog comments). If you're actively seeking links, try to establish links from authority pages. Also be sure to list proper contact information on your site to encourage a sense of legitimacy. In short, you're building a profile of trust and apparent respectability. If you have brand names, trademarks or authors, be sure to get them cited around the web. Of interest, you could read up on Yahoo's HarmonicRank.
Geo-location and local search
One thing we do know these days is that search engines are getting better at, and are more and more interested in, geographic triggers. Whether it's personalization, geo-triggered or even device-centric, they're looking at it.
- Location of client device
- Location of webpage hosting
- Contact / location information
- Inbound/outbound link geo-factors
- Linguistic indicators (language and nuances)
Thoughts; ensure that you create the proper profile for the markets you're trying to reach. This is most important for web sites/pages that are targeting local markets. Furthermore, it wouldn't hurt to learn more about SEO for mobile in the years ahead. For more see the recent geo-targeting checklist.
While not a huge factor, prominence factors are a good idea in any SEO program. These are elements such as heading tags, bold and so on. It should be noted that of the 3 main search engines, Yahoo tends to value these a little more than the others.
- Heading (H1-5)
- Font attributes (size, color)
Thoughts; be sure to use these goodies in your content because, regardless of actual weight, they are important in developing concepts and themes for your page. Do NOT simply stuff these elements with primary keywords; be more natural and highlight primary and secondary concepts. Good page formatting also encourages better user engagement; an added bonus.
Phrase based or semantic concepts (relevance)
This is more of a grey area, as we're not entirely sure which methods are being employed these days, but they are worthy of consideration. Much of the work for creating themes and concepts can be found in these signals.
- Related phrase ratios
- Categorization of content (clusters)
- Occurrences (probabilistic)
- Duplication dampening (filters)
- Personalization (phrase based)
- Link analysis (inbound)
- Global site relevance
- Term proximity (for multi-term queries)
- Image tagging (in content segment/related terms)
Thoughts; what is important to learn here is that you need to build around core targets. Whether it's on page or off (link building), you should always look to not only use key targets, but also keep your mind's eye on related phrases and concepts that establish the theme around your target terms. When doing keyword research, build out lists of semantically related phrases as well as the main terms being targeted in the program. Read more about problems with LSI; we also have stuff on Phrase Based IR.
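Term proximity from the list above is one of the few signals here you can measure yourself. As a rough sketch (an illustration of the general idea, not any engine's actual scoring), the snippet below finds the smallest gap, in words, between two query terms in a piece of copy. The tokenizer, the function names and the sample sentence are all assumptions of mine.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def min_term_distance(text, term_a, term_b):
    """Smallest number of word positions between two terms.

    Returns 1 when the terms sit side by side, and None when either
    term never appears in the text.
    """
    term_a, term_b = term_a.lower(), term_b.lower()
    last_a = last_b = None
    best = None
    for position, token in enumerate(tokenize(text)):
        if token == term_a:
            last_a = position
        if token == term_b:
            last_b = position
        if last_a is not None and last_b is not None:
            gap = abs(last_a - last_b)
            if best is None or gap < best:
                best = gap
    return best


# Hypothetical page copy for a two-term query.
copy = "Cheap hotel deals in Paris: find a Paris hotel near the Louvre."
closest = min_term_distance(copy, "paris", "hotel")
print(closest)
```

A small gap simply means the terms appear close together somewhere on the page; it says nothing about how much weight, if any, a given engine puts on that.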
Behavioural or personalization signals
Yet another dicey area, as we're not altogether sure which signals are being used and what weight they get. What we do know is that, at this point, they aren't a huge factor, but they're on the rise.
- Search History
- Web history (pages/sites we visit)
- Query revision (and analysis)
- Search intent (informational, navigational)
- Explicit data (favourites, reader, wiki)
- Interaction with advertising
- Surfing frequency/ time of day
- Personalized PageRank (yahoo and google)
- SERP and document interactions
Thoughts; behavioural signals all come back to user engagement. Understand your demographic and ensure that you're tailoring content that will appeal to them. You should also craft page TITLEs and Meta-Descriptions for high CTR, not just keywords. Also of importance (once again) are themes and concepts, which can play into user categorizations; have a clear definition of concepts for your page. I would also consider researching common terms and revisions to capitalize on the query analysis that Google is currently fond of.
Considering much of this area is about (perceived) happy end users, it's simply good business skills; making an engaging site should always be the call of the day. At the same time one must weigh the potential (small) gain before focusing resources. Most recently, I've had a thing with Google's Personalized PageRank (although Yahoo looked at it as well).
Secondary factors affecting search rankings
We've looked at the various potential ranking signals so far, but there are also ways one can damage ranking potential. These aren't outright penalties as much as devaluation or dampening elements.
- Duplicate issues (structural/content)
- Link devaluations (segmentation, link text, recips)
- Poor architecture/coding
- Reviewer penalties
- Redundant meta-data (such as meta-descriptions)
- Canonical / URL issues
- Server reliability (can be de-indexed)
Thoughts; obviously we don't have the space here to get into each of those, so do some homework if needed. Essentially we want to ensure that we're not shooting ourselves in the foot with all of our positive work. Be sure to set yourself up for success by working with the conventional wisdom on the above areas.
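Some of the items above are easy to check for yourself. Redundant meta-descriptions, for instance, can be spotted by grouping pages that share the same description text. This is only a sketch: it assumes you have already crawled your site into a mapping of URL to description, and the function name and sample URLs are hypothetical.

```python
from collections import defaultdict


def find_redundant_descriptions(pages):
    """Group URLs that share the same meta-description.

    pages: dict mapping URL -> meta-description text.
    Descriptions are compared after lowercasing and collapsing
    whitespace, so trivial formatting differences do not hide repeats.
    Returns only the descriptions used by more than one URL.
    """
    groups = defaultdict(list)
    for url, description in pages.items():
        key = " ".join(description.lower().split())
        groups[key].append(url)
    return {key: sorted(urls) for key, urls in groups.items() if len(urls) > 1}


# Hypothetical crawl output.
crawl = {
    "/": "Widgets for every budget.",
    "/red-widgets": "Widgets for every  budget.",
    "/blue-widgets": "Blue widgets, shipped daily.",
    "/about": "WIDGETS for every budget.",
}
redundant = find_redundant_descriptions(crawl)
for description, urls in redundant.items():
    print(f"{len(urls)} pages share: {description!r} -> {urls}")
```

Anything this turns up is a candidate for a rewrite, so that each page gets a description reflecting its own theme.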
Web Spam detection
Whether by design or by ignorance, getting whacked with a penalty can be devastating. Search engines have a variety of ways of detecting web spam and they can ALL lead to much more than a loss of rankings. Some methods include;
- Phrase based detection
- Domain history
- Query analysis
- Network proxy detection
- Link based (link spam and excessive recips)
Thoughts; unless you're playing with a throw-away domain, stay as far away from the edge as possible. Sure, we all slip up now and again; fortunately, most times you'd have to satisfy a few spam penalties to cause serious harm. Play safe... know where the boundaries are. For some reading try these posts on Link Spam and Spam detection via temporal factors.
Did you know there are even technology-driven considerations in SEO? Search engines often deliver content differently depending on the device the user is on and the client they're using.
- Client type (browser, mobile)
- Toolbars and browser (Google Suggest, web history)
- Application focus (email, instant messenger, RSS etc..)
Thoughts; one example is that Google Suggest acts differently when it detects a mobile device. Another may be the amount of personalization data being collected, dependent on browser and toolbar set up. These all have subtle indications of where one's search optimization tasks may need to evolve. I'm just putting it out here so you know. Ahight? For some reading, get into 'Application Focus Signals'.
And you ain't heard nothing yet
Why? Because this is merely the generalized main index approach; we haven't touched upon vertical/universal search. The verticals are the more specific engines such as;
- Video search
- Image search
- Blog Search
- Product search
- Map search
This allows for a whole new set of potential ranking functions and cross-pollination potential for you. All we've done is look at some of the ways search engines rank documents in the regular index... there's so much more. (I told ya SEO wasn't dead)
Once more, this wasn't meant to be some definitive list of ways to find ranking nirvana; it is merely me musing. To be drop dead honest with you, I found myself learning along the way, as I hadn't ever mapped out the various processes before. It was entertaining and cathartic. Next up? Maybe we can start to valuate them some... and break into on-site / off-site approaches. Maybe the verticals? Not sure... so many directions to go... Until next time; play safe!
- All additions, comments or bitches shall be duly noted; do let me know what U think of the list -
In honour of those that manage to drag themselves all the way through some of these mind numbing search geeky posts, I've made you a cool badge. Feel free to take this badge and display it proudly, print it on a Frisbee (boomerang, flying disk or equivalent), or put it on a hockey puck (eh). It doesn't matter really; I just had to have one.