Cleaning out my closet part II
Ok my festive search geeks and freaks – more patents. This time out I have rounded up some of the patents in my notes that never made it to publication here on the trail. If you missed it, be sure to also check out Part I – (Google search patents 2008)
Worth noting are their continued focus on social and even a version of page segmentation.
Once more, it cannot be stressed enough that following search patents should be part of any fastidious SEO's ongoing education. They not only help us understand search engines, but can also give a glimpse of potential future processes.
Now let’s get on with the show.
General Indexing and Retrieval
When categories are assigned to pieces of information, a search can be focused based on the categories. In an online forum, information is categorized by topic, and a search can be focused on the topic by adding additional search terms or restrictions to a search query, where the additional search terms or restrictions are based upon the categories. The restrictions may restrict the search to a particular web site that is determined based upon the category. In an online forum for answering questions, where the questions are categorized by topic, information related to a question may be located by performing a Web search for search terms extracted from the question. The search can be focused on relevant web sites restricting the search to sites that are related to the question's category. The results of the search may be displayed as related links alongside the question in the online forum.
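To make the idea concrete, here is a rough sketch of how a forum question could be turned into a category-focused query. The category-to-site mapping, the stopword list, and the function names are all made up for illustration; the abstract does not say how terms are extracted or how sites are chosen.

```python
# Hypothetical mapping from a forum category to a related site (illustration only).
CATEGORY_SITES = {
    "cooking": "recipes.example",
    "cars": "autos.example",
}

# Tiny stand-in stopword list; a real system would use something far richer.
STOPWORDS = {"how", "do", "i", "a", "the", "to", "what", "is", "my", "in"}


def extract_terms(question):
    """Pull candidate search terms out of a question by dropping stopwords."""
    words = [w.strip("?.,!").lower() for w in question.split()]
    return [w for w in words if w and w not in STOPWORDS]


def focused_query(question, category):
    """Build a query from the question and restrict it to the category's site."""
    query = " ".join(extract_terms(question))
    site = CATEGORY_SITES.get(category)
    if site:
        # The restriction is expressed here as a site: operator (an assumption).
        query += f" site:{site}"
    return query


print(focused_query("How do I bake a sourdough loaf?", "cooking"))
```

An unknown category simply falls back to the unrestricted query, which seems like the safest default for a sketch.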
System and method for detecting a web page template (page segmentation)
An improved system and method is provided for detecting a web page template. A web page template detector may be provided for performing page-level template detection on a web page. In general, the web page template classifier may be trained using automatically generated training data, and then the web page template classifier may be applied to web pages to identify web page templates. A web page template may be detected by classifying segments of a web page as template structures, by assigning classification scores to the segments of the web page classified as template structures, and then by smoothing the classification scores assigned to the segments of the web page. Generalized isotonic regression may be applied for smoothing scores associated with the nodes of a hierarchy by minimizing an optimization function using dynamic programming.
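For a feel of what the smoothing step does, below is a minimal sketch of plain one-dimensional isotonic regression (the classic pool-adjacent-violators method) applied to scores along a single root-to-leaf path. This is not the generalized, tree-wide dynamic programming version the abstract describes; it is a simpler stand-in, and the assumption that scores should never decrease along the path is mine, for illustration.

```python
def smooth_scores(scores):
    """Return the closest (least squares) non-decreasing sequence to `scores`.

    Uses pool-adjacent-violators: whenever a block's mean is higher than the
    mean of the block after it, the two are merged and share one mean.
    """
    blocks = []  # each block is [sum_of_scores, count]
    for s in scores:
        blocks.append([s, 1])
        # Merge backwards while the ordering constraint is violated.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    smoothed = []
    for total, count in blocks:
        smoothed.extend([total / count] * count)
    return smoothed


# Raw classifier scores along one path; the third node dips below the second.
print(smooth_scores([0.25, 0.75, 0.5, 1.0]))  # [0.25, 0.625, 0.625, 1.0]
```

The out-of-order pair is replaced by its average, which is the basic behaviour the tree version generalizes across a whole hierarchy.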
System for generating query suggestions by integrating valuable query suggestions with experimental query suggestions using a network of users and advertisers. (4 in the series)
A system is described for generating query suggestions by integrating valuable query suggestions with experimental query suggestions using a network of users and advertisers. The system may include a memory, an interface, and a processor. The memory may store a historical dataset, a plurality of query suggestions, a plurality of query suggestion values, a query exploit set, a query explore set, and data describing a network. The processor may identify the plurality of query suggestions in the historical dataset and generate data describing the network based on the historical dataset. The processor may calculate the query suggestion value for each query suggestion and may rank the query suggestions based on the query suggestion values. The processor may generate an exploit set comprising the top ranked query suggestions and an explore set comprising the remainder. The processor may suggest the query suggestions in the exploit set and the explore set.
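The exploit/explore split at the end of that abstract is easy to picture in code. The sketch below assumes the suggestion values have already been calculated (how they are calculated is not covered here), and the function name and cutoff parameter are hypothetical.

```python
def split_suggestions(values, exploit_size):
    """Rank suggestions by value, then split them into exploit and explore sets.

    `values` maps each query suggestion to its (precomputed) value.
    """
    ranked = sorted(values, key=lambda s: values[s], reverse=True)
    exploit = ranked[:exploit_size]   # top ranked, known to perform
    explore = ranked[exploit_size:]   # the remainder, still being tested
    return exploit, explore


values = {"cheap flights": 0.9, "flight deals": 0.4, "airfare": 0.7}
exploit, explore = split_suggestions(values, exploit_size=2)
print(exploit)  # ['cheap flights', 'airfare']
print(explore)  # ['flight deals']
```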
Regression framework for learning ranking functions using relative preferences
Web search engines typically employ a ranking function to determine the relevance of the search results. Thus, ranking functions are at the core of search engines and they directly influence the relevance of the search results and users' search experience. Many models and methods for designing ranking functions have been proposed, including vector space models, probabilistic models and language modeling-based methodologies. In particular, using machine learning to determine ranking functions has attracted much interest.
Automatic generation of taxonomies for categorizing queries and search query processing using taxonomies
Embodiments of the present invention provide systems and methods for processing search requests, including analyzing received queries in order to provide a more sophisticated understanding of the information being sought.
Implicit name searching
Techniques and tools described herein provide mechanisms for displaying information that is contextually related to a search query. Using these techniques and tools, a user can lookup and discover a person or other entity from contextually related information. For example, if the user submits a search query on the title of a song (e.g., "Janie's got a gun"), then, in addition to a variety of documents related to the title of the song, the user may be presented with information about a related entity such as "Aerosmith" (e.g., the band that sings the song). In this way, the techniques and tools provide mechanisms that identify information that is not directly related to the search query, but that is information the user may find useful or interesting based on context of the search query.
It’s all semantics
System and method for determining semantically related terms using an active learning framework
Systems and methods for determining semantically related terms using an active learning framework such as Transductive Experimental Design are disclosed. Generally, to enhance a keyword suggestion tool, an active learning module trains a model to predict whether a term is relevant to a user. The model is then used to present the user with terms that have been determined to be relevant based on the model so that an online advertisement service provider may more efficiently provide a user with terms that are semantically related to a seed set.
System and method for determining semantically related terms
Systems and methods for determining semantically related terms are disclosed. Generally, a semantically related term tool receives a seed set and identifies a plurality of terms that constitute the seed set. For each term of the seed set, the semantically related term tool identifies concept terms associated with terms of the seed set other than the term being processed, joins the term being processed with each of the identified concept terms, and adds the resulting terms to a plurality of semantically related terms. The semantically related term tool removes invalid terms from the plurality of semantically related terms based on a language model and ranks at least a portion of the remaining terms of the plurality of semantically related terms based on a metric indicating a degree of semantical relationship between a term of the plurality of semantically related terms and one or more terms of the seed set.
System and method for revising natural language parse tree
An improved system and method for revising natural language parse trees is provided. A revision dependency parser may learn a set of transformation rules that may be applied to dependency parse trees generated by a base parser for revising the dependency parse trees. A corpus of natural language sentences and a set of correct dependency parse trees may be used to train a revision dependency parser to correct dependency parse trees generated by the base parser. A revision engine may compare the dependency parse trees produced by the base parser with the correct ones present in the training data to produce an observation-rule pair for each dependency. A rule may specify a transformation on the predicted dependency parse tree generated by the base parser to replace an incorrect dependency with a corrected dependency or may change the type of dependency expressed for the grammatical function of the dependent word.
Enabling searching of user ratings and reviews using user profile location and social networks
A system and method are directed towards a free-form search query of user reviews using user profile, location information, and/or social networks, to obtain a result having an associated universal aggregated rating. The user may enter in free-form a search query that may then be transparently modified using the user's profile, social network, and/or current physical location. The search results may then be presented to the user along with aggregated weighted ratings. The user may also enter products and/or services into a data store, including comments, and a universal rating. In one embodiment, the user may provide a tag to another reviewer's comments that may be useable to aggregate ratings. In one embodiment, the user's profile, location, and/or social networking information may be used to further annotate the user's inputs.
Hot in my communities (MyBlogLog?)
Embodiments of the invention are directed to identifying network resources or other topics that are of interest to members of multiple online communities to which a user belongs. Online communities include blogs, websites, games, e-commerce systems, messaging systems, wikis, etc. For each online community, click activity or other client behaviors are tracked and analyzed to determine statistical metrics about community activity, such as which articles, links, services, or other network resources are popular in the online community. At least some of the tracking or analysis can be performed by clients that access the online communities, by a server of each online community, and/or by a central tracking system. The results for each community may be further analyzed relative to each other. The results are provided for all communities with which a given user is associated. For example, a list of the most popular links in the user's selected online communities may be provided.
Hot with my readers (MyBlogLog?)
Embodiments of the invention are directed to identifying topics that are of interest to users belonging to a selected online community, across multiple online communities visited by the users. Online communities include blogs, websites, wikis, etc. For each online community, click activity or other client behaviors are tracked and analyzed to determine statistical metrics about community activity, such as which articles, links, services, or other network resources are popular in the online community. At least some of the analysis can be performed by clients that access the online communities, by a server of each online community, and/or by a central tracking system. The results for each community may be further analyzed relative to each other. The results are filtered for the selected community and provided for the selected community. For example, a list of the most popular links for all users belonging to the authored community may be provided.
Social networking for mobile devices
A mobile device, system, and method are directed towards enabling an integrated display of live views. The integrated live views are generated by employing social networking information, including moods of a person, avatars, status of a member's activities including whether they are in an IM session, or the like. Integrated live views may include a live contact list, a group view, a friend view, an activity oriented view, a list of content, or the like, based on the mobile user's social networking information. By providing the mobile user with integrated live views of their social network, the mobile user may be able to communicate with other members within the mobile social networking context, to obtain and respond to invites from a social network member, to provide opportunities for activities to other members, to grow their social network, and to consume content that is displayed relative to their social network.
Contextual mobile local search based on Social Network vitality information
A system, apparatus, and method are directed to managing contextual based mobile searches. A context oriented user interface interprets inputs from a mobile user based on vitality information. In one embodiment, the input may be interpreted as a request to perform a context-based search over a network using at least some of the vitality information. Vitality information may include a location of the mobile device, a time of day, an event, information from the mobile user's calendar, past behavior of the mobile user, weather, social networking data, aggregate behaviors, or even information about proximity of a social contact. By employing vitality information to perform a mobile search, better search results and a richer user experience may be provided that includes a sense of community and a sense of presence (e.g., a sense of "here-ness"). In one embodiment, the mobile user may provide comments to others regarding the search results.
Search pogosticking benchmarks
Disclosed are apparatus and methods for quantifying how much searchers select other search results, instead of a particular search result. In example embodiments, the number of times that other search results are selected before a particular search result is selected (referred to as pre-pogosticking) is tracked, and the number of times that other search results are selected after a particular search result is selected (referred to as post-pogosticking) is also tracked. This pogosticking information may be used to improve search result ranking as produced by a search algorithm or to provide metrics to potential or current buyers of particular search terms.
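Counting pre- and post-pogosticking is simple enough to sketch. The version below assumes each search session is logged as an ordered list of clicked result IDs, which is my own simplification; the function name is hypothetical too.

```python
def pogostick_counts(sessions, target):
    """Count clicks on other results before and after the first click on `target`.

    `sessions` is a list of click sequences, each an ordered list of result IDs.
    Sessions where `target` was never clicked are ignored.
    """
    pre = post = 0
    for clicks in sessions:
        if target not in clicks:
            continue
        first = clicks.index(target)
        # Other results selected before the target: pre-pogosticking.
        pre += sum(1 for c in clicks[:first] if c != target)
        # Other results selected after the target: post-pogosticking.
        post += sum(1 for c in clicks[first + 1:] if c != target)
    return pre, post


sessions = [["a", "b", "c"], ["b"], ["c", "b", "a", "d"], ["a", "c"]]
print(pogostick_counts(sessions, "b"))  # (2, 3)
```

A high post count for a result suggests searchers bounced back and kept looking, which is the kind of signal the abstract says could feed into ranking.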
System for providing geographically relevant content to a search query with local intent
A system and method are disclosed for utilizing local intent to provide geographically relevant information in response to a search query. The search query results and advertisements may be chosen based at least in part on the local intent and geographic range of the search query. The search query may be assigned a location identifier based on the local intent that is used to expand the geographic range for ranking and selecting relevant content and advertisements.
System for determining local intent in a search query
A system and method are disclosed for determining local intent. Local intent may reflect whether a search query should receive results and advertisements that are geographically specific. The local intent may be determined using probabilistic models that analyze historical searches to determine which search terms tend to have local intent.
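One very simple way to picture a probabilistic model built from historical searches is sketched below. It assumes each past query carries a flag saying whether it turned out to be local (for example, because a local result was clicked); that flag, the threshold, and the "strongest term wins" rule are all assumptions for illustration, not the patent's actual model.

```python
from collections import Counter


def term_local_rates(history):
    """Estimate, per term, the share of historical queries that were local.

    `history` is a list of (query, was_local) pairs.
    """
    seen, local = Counter(), Counter()
    for query, was_local in history:
        for term in set(query.lower().split()):
            seen[term] += 1
            if was_local:
                local[term] += 1
    return {term: local[term] / seen[term] for term in seen}


def has_local_intent(query, rates, threshold=0.5):
    """Flag a query as local if its strongest term meets the threshold."""
    scores = [rates.get(term, 0.0) for term in query.lower().split()]
    return max(scores, default=0.0) >= threshold


history = [
    ("pizza delivery", True),
    ("pizza recipe", False),
    ("plumber near me", True),
    ("python tutorial", False),
]
rates = term_local_rates(history)
print(has_local_intent("pizza delivery", rates))  # True
print(has_local_intent("python recipe", rates))   # False
```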
Real-time search term popularity determination, by search origin geographic location
Information is generated indicative of frequency of search terms presented to at least one online search service. As event indications, indicative of user interaction generally with front end servers, are being provided for persistent storage, ones of the event indications that are indicative of search events are detected. The detected ones of the search event indications are processed and it is determined, based at least in part thereon, by location, frequency data indicative of a frequency of each of a plurality of search terms presented to the at least one online search service. An indication of at least some of the frequency data is caused to be associated with indications of locations to which the frequency data corresponds. For example, the frequency data may be displayed superimposed on a map.
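Stripped of the patent language, this boils down to filtering search events out of a general event stream and counting terms per location. Here is a small sketch; the event dictionaries and their field names (`type`, `location`, `term`) are invented for the example.

```python
from collections import Counter, defaultdict


def term_frequency_by_location(events):
    """Count search terms per location, skipping non-search events."""
    freq = defaultdict(Counter)
    for event in events:
        if event.get("type") != "search":
            continue  # e.g. page views or logins in the same event stream
        freq[event["location"]][event["term"]] += 1
    return freq


events = [
    {"type": "search", "location": "Austin", "term": "tacos"},
    {"type": "pageview", "location": "Austin"},
    {"type": "search", "location": "Austin", "term": "tacos"},
    {"type": "search", "location": "Boston", "term": "chowder"},
]
freq = term_frequency_by_location(events)
print(freq["Austin"].most_common(1))  # [('tacos', 2)]
```

From here, the per-location counts are what you would plot on a map.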
Discovering and determining characteristics of network proxies
A device, system, and method are directed towards determining network information. A network address is determined for a possible proxy. A determination is made whether a port on the possible proxy is open and/or if the port supports a HyperText Transfer Protocol (HTTP) proxy request. A request is sent to the possible proxy over the port, the request being configured to be forwarded to a network device. A type of the possible proxy is determined based in part on a behavior of the network device. The behavior may indicate whether the request is received by the network device, or whether the possible proxy obscures an origin of the request. The proxy type may include whether the possible proxy is a non-proxy, an anonymous-proxy, a controlled-proxy, and/or an open-proxy. Various types of network analysis may then be performed using the possible proxy and the determined proxy type.
And that's it for my Yahoo leftovers. Next time we'll look at some of the patents the fine folks at Microsoft had to offer this year that we didn't look at.
(also see our list of 1st quarter search patents)