SEO Blog - Internet marketing news and views  

Real time search engines; should SEOs care?

Written by David Harry   
Thursday, 02 July 2009 11:41

The hype and the reality

(Update: more rambling on the topic in my latest post on real time social search)

One of the more popular buzzwords in search over the last while is ‘real time search’. For starters, that’s a bit of a misnomer; there are NO actual ‘real time’ engines… that’s simply not possible (even Twitter search updates in intervals, not entirely in ‘real time’). Regardless, people keep getting worked up about this particular area to the point of wondering how it will affect SEO efforts.

Last week the folks at Media Post contacted me for some quotable quips for a piece on real-time search and (One Riot’s) PulseRank. They wanted to know if SEOs were considering this the ‘wave of the future’ or something new to the lexicon.

For their part, One Riot has said, "We believe PulseRank will replace PageRank over time for the real-time Web. The reasons are clear. PageRank is based on the number of links to a page or a specific URL builds over time, as people link to pages. It provides the searcher with the 'authoritative answer'." These are some bold statements, and yet another in a long line of self-professed ‘Google Killers’ – but is there credence? We’re going to look at the world of real time search and see if there really is anything to be looking at (as SEOs)… care to come along?

What is real time search?

Ok, for starters, what I often see is that the commonly known ‘real time’ search engines are simply seeking out social mentions, not really crawling the web and indexing it. This, to me, is the first part of the problem… are they really search engines? Or simply buzz monitoring tools?

Real-time search is one of the more difficult areas for search engineers to deal with. Much of the problem lies in establishing the most authoritative answers without being bogged down by spam. Dealing with web spam requires a great many signals that are hard to come by in 'real time'. At the end of the day, if 'real time' search were an effective approach, Google and others would (likely) be doing it already.

There are more than a few problems associated with real time search including;

  • Spam – this would be the most problematic area for any search engine. It would be nearly impossible to combat web spam in a large scale environment. This is why most of the major engines haven’t moved beyond ‘almost real time’ (such as Google’s ‘query deserves freshness’ approach).
  • Ranking – as with the QDF, there needs to be some type of evaluation of quality and authority to make ranking of documents effective. Some of the real time engines use domain and (social) user authority, but this approach does kill some of the democratic nature of the web and the rich get richer. What about lesser known domains and new users/content?
  • Social dependencies – most of these real time engines are reliant on social signals, which really does limit their actual abilities as search engines. They are NOT indexing pages that aren’t getting social luvin’. And despite popular belief, not ALL content on the web is social worthy and this makes such search engines limited in scope.

If we consider the above, there is a lot of work to be done if such real time search approaches are to ever be of value. But

Rubber meets the road

In the testing we ran so far there is no real sense to the ranking mechanisms. In looking at some of the more general queries (in this case; ‘PageRank’) it seems that merely being the most recent citation is all that matters. Sadly, the post itself is devoid of any real content relating to PageRank and thus shouldn’t really be ranking. But it did… see the problem here?
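To make the problem concrete, here is a minimal sketch in Python. None of this is how One Riot or anyone else actually ranks. The `Mention` class, the sample items and both ranking functions are made up for illustration. It simply contrasts sorting by recency alone with counting how often the query shows up and using recency only as a tie-breaker.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    title: str
    text: str
    age_minutes: int  # how long ago the item was posted (hypothetical field)


def rank_by_recency(mentions, query):
    """Newest first; any item that mentions the query at all qualifies."""
    q = query.lower()
    hits = [m for m in mentions if q in m.text.lower()]
    return sorted(hits, key=lambda m: m.age_minutes)


def rank_with_relevance(mentions, query):
    """Order by how often the query appears; recency only breaks ties."""
    q = query.lower()
    scored = [(m.text.lower().count(q), m) for m in mentions]
    hits = [(score, m) for score, m in scored if score > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1].age_minutes))
    return [m for _, m in hits]


# Made-up sample data: a fresh throwaway mention and an older, on-topic post.
mentions = [
    Mention("Passing mention", "Lunch was great. Oh, and pagerank, whatever.", 2),
    Mention(
        "In-depth post",
        "PageRank explained: how PageRank counts links and why PageRank matters.",
        300,
    ),
]

recency_order = [m.title for m in rank_by_recency(mentions, "PageRank")]
relevance_order = [m.title for m in rank_with_relevance(mentions, "PageRank")]
print(recency_order)
print(relevance_order)
```

With recency alone the throwaway mention wins because it is two minutes old. Once even a crude relevance count is in the mix, the older on-topic post comes out on top. A real engine would need far more signals than a term count, which is rather the point.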

One Riot - This is the application with the vaunted ‘PulseRank’ which claims to be a superior method to Google’s PageRank. Our early testing showed it to be generally inferior to Collecta, but in this test it did OK comparatively.

Test time; 49 Minutes - this was only AFTER it was Tweeted by someone that had my blog RSS hooked up to a Twitter account. What is interesting is this was the first one to list the actual blog post, not the Tweets.

One Riot test results

Scoopler - This one also tends to rely on Twitter, but it does pull out the most popular links, which do list the actual blog post. This is a nice feature and of the social search engines, this one would be my fav….

Test time; 48 minutes; as with the above, this is obviously not REAL TIME and really only a social mention aggregator - though the actual post links are a nice touch.

Scoopler

Whos Talking - For the record, my man Joe has never claimed to be a real-time search engine or even a search engine for that matter. But considering many of these apps tend to be more buzz monitoring than search/indexing, I decided to include it in the mix.

Test Time; 50 minutes – and once again, the Twitter feeds are what it has picked up.

Whos Talkin

Crowd Eye - This is nothing more than a Twitter monitoring tool and unless you have content Tweeted, it won’t show up. Thus, once more, this is not really a traditional search engine, just a limited buzz monitoring tool. And really, it’s nothing more than a pretty Twitter search… soooo…. FAIL.

Test Time; 56 Minutes - and it was the Twitter account with my feed in it. Furthermore, it only picked up 1 of the 3 Tweets for the post.

Crowd Eye

Collecta - Test Time; 49 minutes - This is another one that seems to be dominated by Twitter results more than anything else, although there are some blog results as well (for the record they claim; ‘We draw from the web at large - not just social networks.’). While this may be true, it would seem much of the ‘real time’ aspects relate to social mentions.

Collecta

Google – Test Time; 3hrs 15 minutes - Of the big 3, only Google managed to actually get the post indexed. And while it wasn’t as fast as the social engines, I’d still say that 3hrs is pretty good considering that there are more involved ranking methods and a better quality of results. Obviously more popular sites that are crawled more frequently would be even faster.

Google

Google Blog Search – Test Time; 15hrs - Strangely, the blog search took considerably longer to index the page. Once more though, the sorting/ranking abilities make this a more usable search engine. I am going to do some testing with more popular blogs to see how short a time frame there can be with this one.

Google Blog Search

Yahoo – Test Time; 19hrs - Not only was Yahoo slower, but it never did get the right page and was picking up the title from my side bar links, not the actual page…. Have to give this one a FAIL.

Yahoo

Bing – Test Time; 23hrs - Much like Yahoo, not only did it take a fair length of time to index, but it also got the wrong page by picking up side bar links. Also a FAIL.

Bing

What is obvious is that the current crop of RT engines is wired for social. What that means is there is likely a great deal of content on the web that isn't getting a social mention and so wouldn't get picked up by them (unlike with a traditional engine such as Google). This is a serious problem, and it makes these look more like buzz tools than actual search engines that crawl the web and make decisions on levels of indexation for a given site/page. We also did some minor testing with pages on sites not getting social traction and, as you may imagine, the RT engines failed miserably.
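For those who like the numbers side by side, here is a small Python sketch that converts the test times reported above into minutes and sorts them. The times come straight from this post. The split into 'social' and 'crawler' groups is my own rough labelling for illustration, not something the engines themselves claim.

```python
# Test times as reported in this post, converted to minutes.
test_times = {
    "Scoopler": 48,
    "One Riot": 49,
    "Collecta": 49,
    "Whos Talkin": 50,
    "Crowd Eye": 56,
    "Google": 3 * 60 + 15,
    "Google Blog Search": 15 * 60,
    "Yahoo": 19 * 60,
    "Bing": 23 * 60,
}

# Rough grouping (an assumption): engines that surfaced the post via social mentions.
social = {"Scoopler", "One Riot", "Collecta", "Whos Talkin", "Crowd Eye"}


def fmt(minutes):
    """Format a minute count as hours and minutes, e.g. 195 becomes '3h 15m'."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}h {mins:02d}m"


# Print the engines from fastest to slowest with their group label.
for name, minutes in sorted(test_times.items(), key=lambda kv: kv[1]):
    kind = "social" if name in social else "crawler"
    print(f"{name:<20}{fmt(minutes):>8}  {kind}")

slowest_social = max(test_times[name] for name in social)
fastest_crawler = min(t for name, t in test_times.items() if name not in social)
gap = fastest_crawler - slowest_social
print(f"Gap between slowest social and fastest crawler: {fmt(gap)}")
```

Keep in mind this is a single post on a single blog, so the gap says more about how each type of engine discovers content than about any engine's typical speed.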

 

Nothing more than social regurgitation

What is really happening with the current stock of RT engines? For the most part they are doing social mention regurgitation rather than any type of formal crawling/indexation. That is certainly NOT what a search engine is or does…

And that's essentially what most of this so-called real time search is. It is not as much about real time indexation and ranking as it is about buzz monitoring or glorified Twitter search applications.


What ARE the big three doing?

They seem to be looking at social structures as spheres of influence. Google has a few patents on a system that has been dubbed 'FriendRank' and 'InfluencerRank' (though there is no wording as such in the patents), which could hint at a social structure and ranking system (read more here). Microsoft is doing the same with a related system (here).

Now, these approaches are primarily for ad targeting and recommendation engines, but they do seem to show the direction search engines may take, in concert with traditional search approaches, to develop some type of 'social search' aspect. Integrating social signals into the existing approaches does seem the logical route ultimately. Sure, it's more social search than real-time, but it does help solve some of the problems we outlined earlier.



The verdict

Real time search is still something that hasn't been effectively conquered from a technical and critical mass vantage point. As such, it shouldn't be a serious consideration for any SEO beyond the potential for buzz monitoring. In many ways these are barely what we would even call a search engine traditionally. If the big players are to ever broach this realm, much work would need to be done (especially with web spam).

Verdict? Should SEOs be concerned with real time search optimization? Not at this point… It's a passing fancy and until a serious engine (that drives traffic) embarks on a real-time adventure, SEOs are best to watch from the wings. And to those that tout Twitter search as a RT search engine, I submit that it is nothing more than a site search... not really a search engine.

 

More reading;

Real time search off - TechCrunch

Twitter’s real-time spam problem - Search Engine Land

Race is on for best real-time search engine - Seattle PI

Who rules real time search? -  Venture Beat

Bing Keeps Its Foot On The Gas, Adds Tweets To Results - TechCrunch

 

Comments  

 
0 # TheMadHat 2009-07-02 12:30
No.

Seriously though, with all the testing you do, when do you actually work?
 
0 # Dave 2009-07-02 12:48
hehe... well bro it's simple - I cloned my ass!!

Actually many times I take things to the Dojo community and others help out. In reality there are far more things I'd like to test, but time constraints tend to kill most of them.

This particular one I'd been ranting about to peeps for a while now since 'Twitter the search engine' concepts were first floated. Generally speaking a 'search engine' seeks out content and then decides if it should be indexed/ranked. Most of these RT engines simply regurgitate info and are quite prone to manipulation. Not really what I'd call a search engine.... 'real time' is just another catch phrase peeps are latching onto...
 
0 # John Carcutt 2009-07-02 13:10
"And to those that tout Twitter search as a RT search engine, I submit that it is nothing more than a site search... not really a search engine."

Right On Brother!
 
0 # Dave 2009-07-02 13:30
thanks John, I hear U R working on a RT Search post... am def interested in reading that.

A search engine, to me, denotes a system of 'discovery/indexation/ranking' decisions... many of the current set of RT engines really aren't that complicated... never mind spam detection etc....

Google currently does index major sites fairly fast and even deals with ranking aspects via QDF approaches. Once they start to add more social signals (such as noted above) they might get closer... but 'real time' is a problematic issue for sure...
 
0 # Greg Martin 2009-07-02 15:42
Enjoyed your post and your test results. Using the standard search engines for social media results gives users more of the same linear results just filtered by time. Real solutions are coming from semantic search engines built with algorithms able to process natural language. The only place I know where you can get results capturing individual and group sentiment, opinions, and experiences from content of various sorts, including messages on Twitter, is TipTop (http://feeltiptop.com/). You get more synthesized, fresh, and applicable results via http://feeltiptop.com/real%20time%20search/, with the exception of queries that people are not talking about (archived data). I'd love to hear your take on TipTop results.
 
+1 # Dave 2009-07-02 17:39
Well, much of what I was getting at really is that many of these aren't really search engines to me... they're more buzz monitoring applications..

They tend to be massively based upon social signals... and not much else. They also tend not to do well with ranking and spam...

Feel Tip Top seems to be ok, but is also not what I'd use for transactional and other searches (tried looking for Pizza locally and others, it kinda failed). It did do well for buzz stuff tho... and therein lies the problem.

I think the market should be 'social search' not RT as it really doesn't describe it well. If one took traditional crawling/indexing/ranking methods, added a level of the social profiling I mentioned and THEN a layer of social discovery, and social voting, you might be getting closer to what the future of social search holds.
 
0 # Dudibob 2009-07-03 03:34
Sweet post Dave and you've pretty much hit the nail on the head. Real time search is more of a gimmick as very rarely do people need real time info (Michael Jackson showed why it would be useful but how often does that happen?).

I'd much rather find a tutorial for something that has stood the test of time than some new tutorial that's probably rubbish. Also with RT search, would an older (but probs better) article be punished for being 'old'? That's why RT won't work
 
0 # Dave 2009-07-03 11:54
..yea, does seem to be a bit of a curiosity more than a serious search/research tool.

After penning the post I played with some traditional searches (informational/transactional) and, while minty fresh, it's not the greatest way to navigate the web IMO. I don't see great value in any of the current crop really...

Maybe one day when a large scale engine includes explicit user feedback with social signals (layered on standard index) we will see a truly social search implementation...
 
0 # Terry Van Horne 2009-07-04 12:14
Spam and the fact it's just regurgitated social nonsense make real time search just another topic that people use to fill the WoDS (Waste o' Disk Space) that sadly makes up 90% of the SEO blogosphere. The fact you called it right and got beyond the hype that is propagated by the WoDS puts you one up on them. I don't always agree with you but... I do respect your candor and ability to see things generally for what they are and could be, and in this case you're bang on. Only a fool believes that spam won't be all over real time search like flies to... simply because they could literally overwhelm it... that IMO, is not fixable unless you can change human nature... one person at a time!
 
0 # Dave 2009-07-04 16:29
Hey there Terry thanks for dropping in. Certainly was nice that we're seeing eye-to-eye on this one... wonders never cease!!

This one is much like implicit user feedback signals (behavioral) in that spam would be a serious problem in any type of major implementation - that's for sure. The reason none of the current crop are being heavily smacked is that they aren't referring much traffic and thus aren't worth the effort... good lord it would be a mess if a major engine tried it....

.. definitely a WoDs candidate at this point...
