Shhhhhh be very quiet, we’re hunting information retrievers
Let's bounce off in a new direction, begin anew as it were, and look at what resides in the craniums that index the world’s information. I would like to introduce to you a concept (much like page segmentation) that isn’t a new one. It has been in front of you all this time, but like a black hatter at a dollar domain bazaar, you were too busy to notice.
As the regular Trail riders would know, we’ve gone from extreme interest in behavioural metrics and personalized search to more tempered views of potential usage. Yes, it’s true, I seem to have a bit of a personality disorder and waffling seems the call of the day…. to the uninitiated that is.

You see, it is the regular index, where implicit feedback signals seem more difficult to grasp than tumbleweed in a tornado. Time and time again this wayward web wanderer has mused that it is far more likely these signals could find value within personalized search than out in the wild.
And what do we know from our counterparts at Google? We know they’ve called them noisy and spammable, a claim that some research done here gives credence to. We also know the mantra for 09 (and recent years past) has been ‘personalization’. OK, this makes some sense and may be worth delving into deeper. But where do we look?
Time travelling SEO style
Let’s take a journey back in time… waaaay back, in tech years at least, to 2003. At the time the little engine that could purchased a company named Kaltix, (interestingly a few months after the Applied Semantics purchase – of LSI fame).
At the time it was but a 3 month old operation put together by a few Stanford geeks that Larry Page (of PageRank fame lol) noted as, “working on a number of compelling search technologies, and Google is the ideal vehicle for the continued development of these advancements” – of particular interest was that they were “developing personalized and context-sensitive search technologies”. (Google’s press release).
Now, you see, Kaltix was headed up by an uber-smart fellow named Sepandar Kamvar (Sep). By no small coincidence Sep is now Google’s go-to guy (technical lead) in the personalized search department and works with iGoogle home pages as well. That all makes this Gypsy curious indeed… here’s a snippet from their Stanford PageRank Project:
“Ideally, each user should be able to define his own notion of importance for each individual query. While in principle a personalized version of the PageRank algorithm can achieve this task, its naive implementation requires computing resources far beyond the realm of feasibility. In the past couple of years, we have developed algorithms and techniques towards the goal of scalable, online personalized web search. Our focus is on the efficient computation of personalized variants of PageRank.” - Stanford PageRank Project

Enter Personalized PageRank
With this in hand we now look a little deeper and sure enough Sep had worked on more than a few papers, including this one on his personal site: An Analytical Comparison of Approaches to Personalizing PageRank
This is an interesting read that looks into the problems associated with a pure Personalized PageRank as far as trying to calculate a more granular flavour of PR. As you can imagine calculating a personalized PageRank over the billions of people on the web is a massive and resource heavy endeavour that simply isn’t feasible in large scale implementations such as Google. Thus toying with the give and take of quality to functionality is the call of the day it would seem.
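To make the idea a little more concrete, here is a minimal sketch of textbook personalized PageRank in Python. It is not Google's or Kaltix's implementation; the function name, the toy link graph and the preference weights are all made up for illustration. The only change from plain PageRank is that the random jump lands on pages the user cares about instead of on any page at random, and that is exactly why a separate run per user gets so expensive at web scale.

```python
def personalized_pagerank(links, preference, damping=0.85, iterations=50):
    """Power iteration with a user-specific jump (teleport) vector.

    links:      dict mapping a page to the list of pages it links to
    preference: dict mapping a page to a non-negative weight for this user
    (both are hypothetical inputs, used only for this sketch)
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    total = sum(preference.get(p, 0.0) for p in pages)
    if total <= 0:
        raise ValueError("preference must put weight on at least one page")

    # Normalise the user's preferences into a probability distribution.
    teleport = {p: preference.get(p, 0.0) / total for p in pages}
    rank = dict(teleport)

    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a preferred page.
        nxt = {p: (1.0 - damping) * teleport[p] for p in pages}
        for p in pages:
            out = links.get(p, [])
            if out:
                share = damping * rank[p] / len(out)
                for q in out:
                    nxt[q] += share
            else:
                # Dangling page: hand its rank back via the jump vector.
                for q in pages:
                    nxt[q] += damping * rank[p] * teleport[q]
        rank = nxt
    return rank


# Toy web, purely illustrative.
toy_web = {
    "news": ["sports", "tech"],
    "sports": ["news"],
    "tech": ["news"],
    "blog": ["tech"],
}

sports_fan = personalized_pagerank(toy_web, {"sports": 1.0})
tech_fan = personalized_pagerank(toy_web, {"tech": 1.0})
```

Same link graph, two users, two different orderings: the sports fan sees "sports" outrank "tech" and the tech fan sees the reverse. Now imagine running that loop once per user over billions of pages and the feasibility problem is plain to see.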
While I am sure things have evolved, let’s look at some of the approaches they talk about to help deliver acceptable quality while not overburdening the resource pool.
- Topic Sensitive PageRank: while not a direct personalization, it can be used to adapt rankings based on query topics and context. A set of topic-biased PageRank vectors would be calculated ahead of time and then combined at the time of a search using context elements of a given query.
- Modular PageRank: this aspect essentially restricts the random jumps in the walk to a set of more authoritative/trusted documents. So in a round-about way, not so random a journey ultimately.
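The query-time half of the topic sensitive approach can be sketched in a few lines of Python. This assumes the per-topic scores have already been computed offline; the function name, the example pages and every number below are invented for illustration, and how an engine actually estimates the topic mix of a query is not covered here.

```python
def blend_topic_ranks(topic_ranks, query_topic_weights):
    """Combine precomputed per-topic scores using the query's topic mix.

    topic_ranks:         dict topic -> {page: score}, computed ahead of time
    query_topic_weights: dict topic -> how strongly the query (or its
                         context) appears to belong to that topic
    (both structures are hypothetical, used only for this sketch)
    """
    total = sum(query_topic_weights.values())
    if total <= 0:
        raise ValueError("query must have weight on at least one topic")

    blended = {}
    for topic, weight in query_topic_weights.items():
        for page, score in topic_ranks[topic].items():
            blended[page] = blended.get(page, 0.0) + (weight / total) * score
    return blended


# Made-up offline scores for two topics.
topic_ranks = {
    "sports": {"espn.example": 0.6, "news.example": 0.3, "wired.example": 0.1},
    "tech": {"espn.example": 0.1, "news.example": 0.3, "wired.example": 0.6},
}

# A query judged to be mostly about sports.
scores = blend_topic_ranks(topic_ranks, {"sports": 0.8, "tech": 0.2})
ordering = sorted(scores, key=scores.get, reverse=True)
```

The heavy lifting (one PageRank run per topic) happens once, offline. At search time all that is left is a cheap weighted sum, which is the quality-versus-resources trade-off in a nutshell.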
Going beyond the work at Stanford, he also has worked on a patent a while back on personalizing anchor text scores in a search engine – (filed May 2004 and assigned August 2007) – which you can find well covered by Bill as well as my old chum Aaron (aka the MadHat).
This patent deals with a variety of link analysis factors, including the use of user profiles (User information database) and personalization (Page importance ranking). Depending on the query and related user profile information, various documents in the results set can be given a boost (or demoted accordingly). Other factors may include past searches and selections from the user’s search history. It also discusses more weight being given based on anchor texts (link analysis), which weren’t really dealt with in traditional PageRank.
And Sep has also worked on the following patents:
Methods for ranking nodes in large directed graphs – filed August 2003 and assigned May 2007
Adaptive computation of ranking – filed August 2004 and assigned April 2006
Query boosting based on classification – filed Nov. 2004 and assigned Oct. 2008
It’s not just a Google thing
While this is an interesting trail to follow, it isn’t actually limited to Google (although Google do seem to take the lead in personalization). The folks at Yahoo have also looked at personalized PageRank as noted in this patent: User Sensitive PageRank (filed June 2006 and assigned Jan. 2008). The authors of that one also worked on Anchor-based Proximity Measures (PDF) – 2007, which discusses both Personalized PageRank (PPR) and Harmonic Rank (HR - I prefer HarmonyRank). This particular flavour also brings behavioural data into the mix:
“The present invention relates to techniques for computing authority of documents on the World Wide Web and, in particular, to techniques for taking user behaviour into account when computing PageRank.”
As with the Google offerings, demographic (user profiles), behavioural and location data can be used in calculating a more granular user sensitive PageRank. They also discuss blocks as we noted earlier with some of Sep’s research and patent filing endeavours. The Yahoo papers even include a temporal factor (somewhat like Google’s – query deserves freshness). As with the Google approach, they also discuss anchor text scoring as added layers of relevance.
Why am I telling you this?
You see my wayward SEO web wanderers, given all the interest in behavioural metrics and Google’s pimping of personalized search, this is definitely a path worth travelling. As I have mentioned many times, implicit user feedback makes far more sense in a personalized setting and so we must begin to look in this direction. Let the journey begin…
Considering what we know about the fascination with personalization and behavioural data, we might be best served by understanding more about how search engines are going about such systems. This is an important area of interest for SEOs in 2009 and a far better use of pixels than debating the use of singular implicit aspects such as bounce rates – I’m just sayin’
Stay tuned as next time out we’ll look at interesting tidbits relating to personalized search, how it works and potential data collection points.
To start your own journey also read:
Personalization gurus Sep Kamvar and Marissa Mayer – SearchEngineLand
Interview with Sep Kamvar – StoneTemple
Comments
I should be getting back to this topic next week at some point...
Thanks for dropping in tho, nice to know at least a few folks were taking a ride into the history aspects... :silly:
I just wanted to say you've been officially added to my RSS feed. B)
I bookmarked your 'SEO higher learning' post but I won't have much time to read it all but I will be making time for it over the next few months.
I learned a lot of SEO through the gurus blogs (Moz, Boykin, Wall, Ward, etc) and honestly, none of them are covering these new dramatic changes. I love them all but they are dropping the ball on new SEO developments.
Google isn't going to tell us their inner-workings, nor is Yahoo/MSN. Their newer patents are going to make things more complicated than decent content + links = ranking.
Keep up the great work. People are listening. :cheer: It's not our job to get people to learn, it's our job to speak up and let our voice be heard.
Continue to blog about these things. It's helpful to many of us.
he he.. seriously tho, it's not a matter of IF with this one - more a matter of what flavour and how PPR has changed since the Stanford days. Google has talked about the Kaltix purchase and that Sep's work on speeding up PageRank is responsible for their being able to speed up the process of serving results. We put that in context with Sep being the technical lead for personalized search and well… as they say… this isn’t rocket science.
As far as ‘cutting edge’ – this isn’t really new, much like the page segmentation stuff we talked about on the Trail recently – this one is more about connecting the dots. I shall be making some follow up posts on personalized search and PPR in the coming weeks – glad to have U along for the ride
@DB – hey dude, I haven’t seen U around in a while, hope things are well.
As I mentioned above one way or another it’s certainly alive and well at Google, and potentially Yahoo as well.
We did some research into personalized re-rankings last fall and published a paper on it – we’re looking to do even more detailed work in this area in the coming weeks. While SERPs do vary, it’s not as massive as we’d first thought they’d be.
I’m glad you enjoyed it as I believe the history and connecting the dots is AS important if not more so, than actually getting actionable information about personalized pagerank concepts.
Drop by more often bro… good to C U again!!
Do you really think computer scientists sit around year after year with their thumbs up their asses? Or they are somehow beneath would-be SEOs that spew ignorance?
If you've got some research that shows rankings aren't a metric of value, I'd be thrilled to report on it - if not, I submit that those on both sides of the 'rankings are dead' debate are nothing more than blowhards that enjoy the sound of their own verbiage.
Yes, behavioral metrics are certainly noisy and problematic, but to bring down mounds of IR research into the topic as a glorified PR move or a smoke screen is once again ignorant IMO. Direct Hit??? Do you have any idea how long ago that was in IR and computer science terms?? The times they are a changing my friend. You've even written about topic sensitive PR approaches which are well aligned with G's personalized search work.
Next time, skip the adolescent name calling and publish some research that makes your case - this industry is growing beyond the old school prognosticators and into a world of mutual research and understanding.
Unfortunately Terry, every time I see you out and about of late, you are tossing barbs and otherwise proving discourteous. It is unfortunate that civil, tempered discussions are elusive. Being a fellow Canadian, it does bring me sorrow.
None-the-less, thanks for stopping by and all the best to you and yours - And spunk up would U? :woohoo:
My article on topic sensitive PR was in response to the Florida update not personalization and beyond the author working on both I see no relationship between personalized search and topic sensitive PR. I may be wrong but I haven't spent a lot of time researching Personalized Search because I don't see it as a big factor because I have never, and will never, use rankings as a metric to measure SEO success. My main point is that it is only in play when people with a Google account search with the feature turned on. So... how many searches is it really affecting. The fact Google has been Mum on that Number makes me believe it is not a big number or worthy of all the noise about it killing SEO!
My point about Direct Hit was that they used many of those factors and were gamed big time. IMO, they were gamed by a small industry; now that industry is much larger and there are far fewer negative repercussions for spammers. In the old days spammers were ridden out of Dodge on a rail... now... they are touted as blackhats and revered by many.
I can provide no research that proves rankings have no value, however, valuable metrics can be used to improve conversion, usability and host of other things whereas rankings don't tell you anything beyond your position in a SERP. It tells you there is a problem but provides nothing of value to act on. Most valuable metrics will provide that in spades. You took my point out of context because I was saying using rankings as a measure of SEO success was idiotic not the metric itself.
As to my use of the word idiot... would you prefer knowledge challenged? Personally, political correctness is a waste of time and energy. I'll call a spade a spade and live with the consequences. Only those I call idiots seem to care. I think I may have inferred that in a post on SEL. I was pissed because you called your Google API SEOPros Directory. Obviously you are not an idiot trying to poach traffic which was why I was pissed. For that I extend a real heartfelt apology. I was wrong and I will take responsibility for that.
Rankings - I do agree that they aren't a KPI in SEO, to me at least, but they do provide anecdotal evidence when one is looking to see which terms convert better. We tend to measure search traffic levels and related conversions, thus I understand where you're coming from.
Personalization/behavioral - I have been one to beat back the hype over these recently as there is little evidence that they can be, or are being, used by search engines at this point. Personalization also isn't as big a deal as some have touted as there are generally only 1-2 added results to a given PS result (which in themselves require a user to search related topics more than once).
Interestingly there is still ranking flux for those not even signed in to G accounts which might be application focus or simply DC syncro issues... For the next round of testing we have a base of 70 or so peeps (all US for DC limiting). There are some curious anomalies, but nothing worth changing SEO practices for at this point.
Anyhoo, I have talked to Jeff about ye a few times and he assures me that, while excitable like yours truly here, yer a good egg - Maybe next time he drops into the Big Smoke to see you, I might tag along.
As for 'idiots', while I may use that term to describe some folks in this biz, I have decided to be a better community member in '09 and lay off the barbs. Not because it's politically correct, but because it is the right thing to do and does little but make me look like one of them when I spew venom :whistle:
I really like your style of writing. I read a lot of blogs about blogging, SEO and affiliate marketing, but I find most of them to be fairly dry and technical. Enjoyed yours very much.
humor+useful info= a delightful post
-Caroline
BTW the comments below are icing on the cake