Finding relevance through User Performance Metrics
Imagine a world of Google Search where each person or computer received a different set of results, or where sets of query results were fluid and more alive. No longer could the search optimizer simply use tried-and-true techniques to rank a given site; identifying probabilistic models, themes and demographics would become a talent to be learned. The SERPs (search engine results pages) would be a more fluid environment, and rankings would be a moving target. All of a sudden, end user performance metrics (bounce rates, conversions, frequency) would play a role in the ranking of documents for individual users or sets of user groups, and (potentially) in the main organic index as well.
Now don't go getting all excited just yet. These concepts are nothing new; you need look no further than Personalized Search to get an idea of what such a world may look like. What is even more interesting are the implications such data could have on overall organic rankings and SEO targeting in general. You see, there are many ways to aggregate data beyond a logged-in Google Account user: the Google Toolbar, IP addresses, cookies and, more recently, the Google computer and Google Mobile (dubbed Android) possibilities. This proliferation of Google services embedded in such devices means even more access to conversion and performance data relating to natural search results.
Patent Bending 101
The crux of this piece centers on broadening your understanding of some of the aspects used in probabilistic models and user performance metrics. A while ago, I had a look at three patents that Señor Slawski had passed along during our adventure into the murky waters of metrics and search; I have broken the summaries down for easier consumption in my Knowledge Base. Before we continue, be sure to understand the concepts of Patents and the SEO Magic Bullet. Also, while there are many paths along the journey into patent land, and one could write a book on any given filing, ultimately we will be looking at a variety of approaches to the retrieval, scoring and ranking of search results (and PPC placements) based upon user performance metrics, at how sessions and click data help refine future queries, and at probabilistic scoring concepts. Sounds like fun, huh?
User Performance Metrics
In essence, the actions you take when searching and surfing can affect how future results (and ads) are served back to you. From query tightening within a single user session, to data used for future queries across common searcher types, tracking of search result and ad performance data can tell the search engine or ad server which sets of results are most probable (likely) to be relevant in a given situation. The actions you take say much about the relevance of the results.
Query results can be ranked and re-ranked depending on what the search engine believes you are looking for, based on current or past searches. This can be addressed at the single-user level (via a logged-in account or other tracking mechanisms) or across a large base of users and how they react.
- What are common documents that are selected?
- What are the bounce rates for selected documents?
- What are common user query revisions?
- Which user query revisions performed best?
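To make those questions concrete, here's a toy sketch of tallying clicks and bounce rates per document from session logs. This is purely illustrative: the data, the bounce threshold and the function names are my own invention, not any engine's actual pipeline.

```python
from collections import defaultdict

# Hypothetical session log: (query, clicked_doc, dwell_seconds)
SESSIONS = [
    ("laptop reviews", "doc_a", 4),    # quick bounce back to the SERP
    ("laptop reviews", "doc_a", 90),
    ("laptop reviews", "doc_b", 120),
    ("best laptops",   "doc_b", 200),
]

BOUNCE_THRESHOLD = 10  # seconds; an arbitrary "dial", not a known value


def doc_metrics(sessions):
    """Aggregate click counts and bounce rate per document."""
    clicks = defaultdict(int)
    bounces = defaultdict(int)
    for _query, doc, dwell in sessions:
        clicks[doc] += 1
        if dwell < BOUNCE_THRESHOLD:
            bounces[doc] += 1
    return {d: {"clicks": clicks[d], "bounce_rate": bounces[d] / clicks[d]}
            for d in clicks}


metrics = doc_metrics(SESSIONS)
# doc_a: 2 clicks with a 0.5 bounce rate; doc_b: 2 clicks, no bounces
```

A real system would obviously work over billions of sessions and far richer signals, but the shape of the aggregation is the same.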
By using a variety of data relating to user performance metrics, the search engine tries to better define relevance in the results. For more on these factors, see the patent analyses later in this post.
I think one of the important things to touch on is that much of this relates to using historical data, in the form of user sessions and performance data, to establish rules and methods for ranking and re-ranking results. The underlying theme is to build probabilistic models that serve more relevant results to the end user, in organic search as well as in ad serving (such as Google AdSense). Whether, or how, these methods are implemented, we certainly can't ascertain. I describe this phenomenon as turning the dials: depending on where the thresholds are set for a given node, results and presentation can vary widely.
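A quick, entirely hypothetical sketch of what turning the dials could look like: a single weight blends a base relevance score with a user-performance signal, and moving that dial reorders the results. The numbers and names below are made up for illustration.

```python
def rerank(docs, dial=0.2):
    """Blend a base relevance score with a user-performance score.

    `dial` is the hypothetical knob: at 0.0 the user signal is ignored;
    raising it lets performance data reorder the results.
    """
    scored = [(base * (1 - dial) + perf * dial, doc)
              for doc, base, perf in docs]
    return [doc for _score, doc in sorted(scored, reverse=True)]


# (doc, base_relevance, performance_signal) -- all made-up numbers
docs = [("doc_a", 0.9, 0.1), ("doc_b", 0.8, 0.9)]

print(rerank(docs, dial=0.0))  # ['doc_a', 'doc_b']  base ranking wins
print(rerank(docs, dial=0.5))  # ['doc_b', 'doc_a']  user signal flips it
```

Same documents, same data, two different orderings; that is exactly the "moving target" problem for the optimizer.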
User data and relevance
What I find interesting is that with a scoring layer such as this, search results would vary in ranking output depending on user history (long or short term). The results returned on one computer can be totally different from those on another, as one can witness when logged into a Google Account with Personalized Search. We have always had to consider regional issues and variances across data centers, but such methods would introduce a more granular layer to the overall fluidity of SERP rankings. Conversion aspects would also start to play a larger role in SEO. This would likely give social media experts a leg up, as compelling titles, snippets and content would mean strong click-through and low bounce rates, which theoretically would add to the overall scoring of that particular document.
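As a rough illustration of such a scoring layer (again, hypothetical names and numbers, not Google's actual method), the same shared result set can re-rank differently per user depending on each user's click history:

```python
def personalized_rank(results, user_history, boost=0.3):
    """Re-rank shared results per user: documents whose topic appears
    in this user's click history get a hypothetical score boost."""
    def score(doc):
        base = doc["score"]
        if doc["topic"] in user_history:
            base += boost
        return base
    return sorted(results, key=score, reverse=True)


results = [{"url": "a.com", "topic": "travel", "score": 0.7},
           {"url": "b.com", "topic": "seo",    "score": 0.6}]

# Two users, same query, different histories -> different orderings
print([d["url"] for d in personalized_rank(results, {"travel"})])
print([d["url"] for d in personalized_rank(results, {"seo"})])
```

The first user sees a.com on top, the second sees b.com, from the identical candidate set.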
The data and ranking mechanisms can also be used to refine results for a single user, a set of user types, or large-scale user-type data scenarios (user type grouping). It need not be confined to a specific area such as personalized search. Larger sets of end user data and performance metrics can also be used to relate relevance back to the core system; the rankings of even the regular organic index could ultimately be affected. Our friend Matt (Cutts) has rightly said that it would be a very weak signal, but he certainly never said it wasn't used at all. How much weight these types of methods carry would depend on the thresholds set in the overall ranking process.
To further illustrate some of the concepts surrounding performance metrics and probabilistic models, you can continue on from here to read the analysis of some patents that started me into this area (thanks for the fun, Mr. B). I actually posted these over in my Knowledge Base, as I try to keep the dry stuff off my blog (plus it gives me an excuse to send you over there :0). So here's some further reading:
The first patent analysis, Method and apparatus for learning a probabilistic generative model for text: this deals with a method of teaching the system to look at user data, in the form of user query sessions, to try to understand the relevance of a given search. What people search for, how they search (query refinements) and which results they choose can be used to teach the system which types of queries and results have performed best in past instances.
The second, Ranking documents based on large data sets: basically a follow-up to the first, but this time it moves into how the system can score and rank the documents once they have been identified by a system such as the one in the first patent we explored.
The third, Using concepts for Ad targeting: this one also deals with using end user data (among other things) in an attempt to glean relevance, as well as similarity scoring and ad placement/ranking. Also of interest here is the use of performance data to see which ads are getting conversions for a given search term or page topic.
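To ground the session-data idea from the first patent, here is a toy example of picking the historically best-performing refinement of a query. The logs and the logic are entirely hypothetical; real systems would weigh far more signals than a single conversion flag.

```python
from collections import defaultdict

# Hypothetical session logs: (original_query, refined_query, converted)
REFINEMENTS = [
    ("jaguar", "jaguar car", True),
    ("jaguar", "jaguar car", True),
    ("jaguar", "jaguar animal", False),
]


def best_refinement(logs, query):
    """Return the refinement of `query` with the most past conversions."""
    wins = defaultdict(int)
    for orig, refined, converted in logs:
        if orig == query and converted:
            wins[refined] += 1
    return max(wins, key=wins.get) if wins else None


print(best_refinement(REFINEMENTS, "jaguar"))  # jaguar car
```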
Journey of a thousand miles
So there you have it; this is merely the first step. I hope this post at least starts you thinking along the lines of user engagement and how it can potentially affect your SEO efforts in the years to come. I do encourage you to start getting your head around it as soon as possible. The end game of all of this is simply to be aware of the potential scoring factors and methods of collection that search engines have at their disposal. Absorb it and move along.
If you missed it, my recent foray into Personalized Search contains some of the finer details on working with user performance metric concepts. Be sure to check out: What Every SEO Should Know About Personalized Search
Until next time.
Related Patents:
- Predicting Ad Quality
- Using estimated ad qualities for ad filtering, ranking and promotion
- Estimating ad quality from observed user behavior