OK.. Bill Slawski put out a shout for some backup with a couple of patents relating to PageRank. I had a few minutes to spare and decided to jump in. Apparently Google put up two patents on PageRank and ol' Bill was wondering if there were any major differences, so why not?
See Bill's original Post for more - New Stanford PageRank Patent
Documents - Patents of discussion; Original Patent - Follow Up Patent -
Let's see what we can find...
I guess we can simply start from the beginning, oui? Right away the Abstract wording has been modified. The first passage below is the old wording and the second is the new;
A method is presented for scoring documents stored in a network. The method includes identifying links from linking documents to linked documents in the network and determining an importance of the identified links. The method further includes weighting the identified links based on the determined importance and scoring the linked documents based on the weighted links.
A method assigns importance ranks to nodes in a linked database, such as any database of documents containing citations, the world wide web or any other hypermedia database. The rank assigned to a document is calculated from the ranks of documents citing it. In addition, the rank of a document is calculated from a constant representing the probability that a browser through the database will randomly jump to the document.
Nothing terribly exciting, though the concept of the probability that a browser through the database will randomly jump to the document is an interesting addition.
The only difference in the Cited References is the addition of the earlier patent to the list.
The Claims section was all but bare in the original patent, with only one claim, which states;
A computer implemented method for scoring documents, at least some of the documents containing links to other ones of the documents, the method comprising: determining a probability that a searcher will access each of the documents after following a number of the links; and scoring each of the documents based on the determined probability.
The second patent has 14 points that more closely define the parameters. It's pretty standard fare though, such as;
the importance rank of each of the backlinked web page documents is weighted in dependence upon the total number of links in the backlinked web page document
There are a few references to probability that seem to support the changes to the original Abstract;
wherein the matrix A is chosen so that an importance rank of a web page document is calculated, in part, from a constant α representing the probability that a surfer will randomly jump to the web page document.
Outside of that nothing really jumps out at me.
Background of the invention -
Early on nothing was changed except this;
The well known idea of citation counting is a simple method for determining the importance of a document by counting its number of citations, or backlinks. The citation rank r(A) of a document which has n backlink pages is simply
This was changed to;
The well known idea of citation counting is a simple method for determining the importance of a document by counting its number of citations, or backlinks. The citation rank r(A) of a document which has n backlink pages is simply r(A)=n.
Seems merely like an omission that was corrected.
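Just to make the citation counting idea concrete, here is a rough Python sketch of r(A)=n. The link graph and the function name citation_rank are made up for illustration; nothing here comes from either patent beyond the idea of counting backlink pages.

```python
from collections import Counter

# Hypothetical toy link graph (not from the patents): page -> pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def citation_rank(links):
    """Return r(page) = number of distinct pages that link to `page`."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):      # count each linking page only once
            if target != source:         # assumption: self-links are ignored
                counts[target] += 1
    # Pages nobody links to still get an entry, with a rank of 0.
    return {page: counts[page] for page in set(links) | set(counts)}


ranks = citation_rank(links)
```

Every backlink counts the same here, which is exactly the weakness the rest of the patent text sets out to address.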
There are considerable changes in this area of the patent. The new additions are mostly computational additions furthering what was added in the Claims section. From what I can tell they are looking to protect the patent with a tighter definition.
This paragraph was removed in the second incarnation;
One aspect of the present invention is directed to taking advantage of the linked structure of a database to assign a rank to each document in the database, where the document rank is a measure of the importance of a document. Rather than determining relevance only from the intrinsic content of a document, or from the anchor text of backlinks to the document, a method consistent with the invention determines importance from the extrinsic relationships between documents. Intuitively, a document should be important (regardless of its content) if it is highly cited by other documents. Not all citations, however, are necessarily of equal significance. A citation from an important document is more important than a citation from a relatively unimportant document. Thus, the importance of a page, and hence the rank assigned to it, should depend not just on the number of citations it has, but on the importance of the citing documents as well. This implies a recursive definition of rank: the rank of a document is a function of the ranks of the documents which cite it. The ranks of documents may be calculated by an iterative procedure on a linked database.
There is a small addition to the probability aspect with;
In addition, the importance rank of a node is calculated, in part, from a constant α representing the probability that a surfer will randomly jump to the node. The importance rank of a node can also be calculated, in part, from a measure of distances between the node and backlink nodes of the node. The initial N-dimensional vector p_0 may be selected to represent a uniform probability distribution, or a non-uniform probability distribution which gives weight to a predetermined set of nodes.
Take from that what you will; nothing is really jumping out at me.
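For what it's worth, the two choices for the initial vector mentioned in that passage could look something like the sketch below. The function names, the favored_mass parameter and the way the extra weight is split are all my own assumptions; the patent text only says the distribution may be uniform or may give weight to a predetermined set of nodes.

```python
def uniform_p0(nodes):
    """Every node gets the same starting probability, 1/N."""
    n = len(nodes)
    return {node: 1.0 / n for node in nodes}


def weighted_p0(nodes, favored, favored_mass=0.5):
    """Non-uniform start: `favored_mass` is shared by the favored nodes,
    and the remainder is spread evenly over all nodes (an assumed scheme)."""
    favored = set(favored) & set(nodes)
    if not favored:
        return uniform_p0(nodes)
    base = (1.0 - favored_mass) / len(nodes)
    bonus = favored_mass / len(favored)
    return {node: base + (bonus if node in favored else 0.0) for node in nodes}


nodes = ["A", "B", "C", "D"]          # hypothetical node names
p_uniform = uniform_p0(nodes)
p_weighted = weighted_p0(nodes, favored=["A"])
```

Either way the entries add up to 1, so both are valid probability distributions to start from.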
Detailed Description - Once again not much has really changed here except for the 4th paragraph which has been modified with some of the new computational models introduced. This seems to be a common theme at this point. Where the original;
where B_1, ..., B_n are the backlink pages of A, r(B_1), ..., r(B_n) are their ranks, |B_1|, ..., |B_n| are their numbers of forward links, and α is a constant in the interval [0,1], and N is the total number of pages in the web.
..type of model has been replaced with
r(A) = α/N + (1 - α)(r(B_1)/|B_1| + ... + r(B_n)/|B_n|), where B_1, ..., B_n are the backlink pages of A, r(B_1), ..., r(B_n) are their ranks, |B_1|, ..., |B_n| are their numbers of forward links, and α is a constant in the interval [0,1], and N is the total number of pages in the web.
After that there are some more minor computational additions/modifications, but nothing earth shaking that I can see.
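If you want to see the rank formula quoted above in action (a jump term of α/N, plus (1 - α) times the sum of each backlink page's rank divided by its number of forward links), here is a rough Python sketch that just applies it over and over. The toy graph, the function name pagerank, alpha=0.15 and the fixed 50 passes are my own assumptions, not values from the patents. It also assumes every page has at least one forward link, so no rank leaks away.

```python
def pagerank(links, alpha=0.15, iterations=50):
    """Repeatedly apply r(A) = alpha/N + (1 - alpha) * sum(r(B_i) / |B_i|)."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)

    # Forward-link counts |B| and backlink lists, built once up front.
    out_degree = {page: len(set(links.get(page, []))) for page in pages}
    backlinks = {page: [] for page in pages}
    for source, targets in links.items():
        for target in set(targets):
            backlinks[target].append(source)

    ranks = {page: 1.0 / n for page in pages}   # uniform starting ranks
    for _ in range(iterations):
        ranks = {
            page: alpha / n
            + (1 - alpha) * sum(ranks[b] / out_degree[b] for b in backlinks[page])
            for page in pages
        }
    return ranks


# Hypothetical toy graph: page -> pages it links to (every page links somewhere).
toy_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
toy_ranks = pagerank(toy_links)
```

Note that in this formula α is the chance of a random jump, so a small α means the surfer mostly follows links. A page with no backlinks at all, like D here, ends up with just the jump term α/N.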
This was also removed from the original version;
The present method of determining the rank of a document can also be used to enhance the display of documents. In particular, each link in a document can be annotated with an icon, text, or other indicator of the rank of the document that each link points to. Anyone viewing the document can then easily see the relative importance of various links in the document.
..and that's all folks.
So, in the end I can't see this being more than a tightening up of the definitions and the addition of some minor edits. I don't see any major changes to the original patent that would warrant much attention from where I am sitting.
Happy Trails Bill
.. All is as it was in Google Land.