SEO Blog - Internet marketing news and views  

SEO Magic Bullet: 2010 Edition

Written by David Harry   
Wednesday, 05 May 2010 12:48

Phrase based WTF?

Often one finds oneself looking in the rear view mirror at topics that just won’t go away. One such shadow for me is a set of patents produced in whole, or in part, by Anna Patterson (former Googler, was to Cuil for ‘em): the now (in)famous Phrase Based IR offerings from Google. I’ve lost count of how many times I’ve discussed or written about them over the years. They just won’t seem to go away (we’ll get back to that shortly).

The other topic that keeps coming back? Well, that’s what is best known as the SEO Magic Bullet.

There are those that seem to believe in the magic bullet. Then, there are some sane people that passed on the tooth fairy long ago. For the purpose of today’s discussion, we’re going to go back to 2007: The Magic Bullet - A chat with Bill Slawski

the SEO Magic Bullet - Bill Slawski

It was an email conversation with grand master Slawski that turned into a post. The gist of the gig was that we shouldn’t treat patents/papers as gospel. Absorb them. Here’s some wisdom from that post:


“The main benefit from looking at patents isn't necessarily seeing the methods that they describe, but rather being able to view the assumptions and the mindsets that they uncover. We can be so absorbed in looking at things from the perspective of marketers, and make up our own folklore and mythology (sandbox, anyone?) that having this other perspective can be really helpful.

“Patent filings and white papers from search engineers don't necessarily provide a magic bullet, but they do provide the chance to look at information that comes directly from people working in search. To ignore those documents means not taking advantage of publicly available information that gives us a glimpse what those search engines find valuable enough to protect as intellectual property.

“There are trade secrets that will likely never be disclosed in patent applications. And, the descriptions of processes in patent filings are only examples, and illustrations, that describe enough to protect the intellectual property behind the documents, while not disclosing enough so that they can be easily reverse engineered.” - Bill Slawski


Ok? Get the idea here or what? While it is a great exercise, learning about search engines, some perspective is required. Remember, this ain’t rocket science, it’s computer science.


Patent Pending

One thing that we need to remember is that when a patent hits the streets, that’s simply the award date. We can have a patent awarded today that was submitted back in 2004. Does that mean the search engine was waiting around and WHAM… started implementing it today? …erm… of course not. It has simply been in patent pending status.

This means it’s quite likely it was at least in some semblance of beta when the patent was written, then implemented, morphed and evolved in the years that passed. On the other hand, it may never have been used, or may have been used and abandoned. But, either way, they weren’t waiting around for the award to start implementing the technology/methods.

Which all brings me back to my first redundant shadow; Phrase Based IR.

 

...Meh...

Was all one o’ me mates had to say when confronted with a recent spate of misconceptions I came across. Phrase based IR was one of the first technologies that captured my fire and forever geekified me. Monsieur Slawski introduced us and it was love at first site.

What’s odd though, is that it gets mentioned/rediscovered from time to time and people start to spark it up as an explanation for the oddest things. Witness;


“I'm wondering if Google has made a change in their phrase-based indexing approach - something that the new Caffeine infrastructure makes feasible. Recently there has been more patent activity in that area.” – Google MAYDAY update - Ted via WMW


Hmmmm. Well, the patent in question was filed more than 3 years ago. We also know they had an interest way back in 2004. Obviously bringing the author, Anna, into the fold meant there was great interest. We can also note that in the later one, there were multiple authors (Anna was on the way to Cuil street?)

Would Caffeine help? Sure, if it is, as advertised, an infrastructure update. But that could be said for a lot of things (Open HTMM? PLSA? See? I can guess too...sigh). That’s not the point. What happens next is we see;


 “I'm still trying to get a handle on some of the odd fluctuations in site metrics attributed to what are undoubtedly bits and pieces of the Caffeine implementation. If you follow Google's patent activity, there's been some interesting recent activity in the area of phrase-based indexing.” – Dave Cosper via SEG


Awww…. Crap. See? This is how it happens. We’ve been down the LSI trail a time or two as well dontcha know. And who can forget the bounce rate fun? This stuff is mis-reported and entirely unprovable at the end of the day. But it does get around. But ok, it’s a few posts, albeit in authority locales; it’s not that bad… I mean, it’s not like there’s widespread insanity over the phrase based stuff, right?


Phrase Based Information Retrieval

Dammit! Dammit! Dammit! DAMMIT!!!! Here we go again...


Slow down the ride! I wanna get off!

This, my weary web wanderer, is where the need to understand the magic bullet theory comes into play. These patents and papers are nothing more than insight. Even if we knew Google used it. Even if we knew there were no other signals. We’d still be lost, as we don’t know the weights/thresholds/dampeners in place.

But alas, there are far more signals that we can’t account for in the mix. This makes isolation of any one signal next to impossible for mere mortals. Let us not do the chicken (little) dance, running about stating what Caffeine (an infrastructure thang big daddy) is being driven by. Nor blame the poor algo for wrecking Tom, Dick and Harry’s rankings. It’s grasping at straws. Ok? Thanks. I hope we don't have to do this again (however unlikely).
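
For the code-minded, here’s a toy sketch of why that isolation is so hard. Everything in it is made up for illustration: the three signal names, the page scores and both sets of weights are assumptions, and it says nothing about how any real engine scores pages. All it shows is two wildly different guesses about a "phrases" signal producing the exact same ordering.

```python
# Toy illustration only: signal names, page scores and weights are invented.
pages = {
    "page_a": {"links": 0.9, "phrases": 0.8, "freshness": 0.3},
    "page_b": {"links": 0.6, "phrases": 0.5, "freshness": 0.6},
    "page_c": {"links": 0.2, "phrases": 0.3, "freshness": 0.4},
}


def rank(pages, weights):
    """Order pages by a weighted sum of their signal scores, best first."""
    def score(signals):
        return sum(weights[name] * value for name, value in signals.items())

    return sorted(pages, key=lambda page: score(pages[page]), reverse=True)


# Guess 1: the engine ignores phrases completely.
ignores_phrases = {"links": 0.8, "phrases": 0.0, "freshness": 0.2}
# Guess 2: the engine is driven almost entirely by phrases.
loves_phrases = {"links": 0.1, "phrases": 0.8, "freshness": 0.1}

print(rank(pages, ignores_phrases))
print(rank(pages, loves_phrases))
```

Both guesses return the same order, so watching the rankings alone can’t tell you which one (if either) is closer to the truth. And that’s with three signals, not hundreds.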

Oh, along the way, I also discovered what was surely the catalyst – that Bill guy again. Whaddya know. Well, at least we know that he doesn’t believe in magic bullets. Do you?

Until next time… play safe!


More reading

Here are some other PaIR posts not mentioned here for those interested in learning more;

Blog Posts;

 

Related Patents;

Phrase Identification in an Information Retrieval System,
Filed on Jul. 26, 2004;
Assigned; Jan 26 2006

Phrase-Based Generation of Document Descriptions,
Filed on Jul. 26, 2004;
Assigned; Jan 26 2006

Phrase-Based Searching in an Information Retrieval System,
Filed on Jul. 26, 2004;
Assigned; Feb 09 2006

Automatic Taxonomy Generation in Search Results Using Phrases,
Filed on Jul. 26, 2004;
Assigned; Sept 16 2008

Phrase-Based Indexing in an Information Retrieval System,
Filed on Jul. 26, 2004;

Phrase-Based Personalization of Searches in an Information Retrieval System,
Filed on Jul. 26, 2004;
Assigned; Aug 25 2009

