
Google’s New Patent

Google has a new patent application pending. The application should give insight into the direction Google is headed in terms of ranking sites. Whether Google has implemented the changes described by the application is unknown, but the fact that the company is taking the time to submit the application indicates that it either intends to implement, or has already implemented, many of the ideas the application describes.

Historical Data for Ranking

The patent describes using historical data for ranking. This data includes: document inception dates; document content updates and changes; search query analysis; link-based criteria; anchor text content and fluctuations; traffic; user behavior; domain information; ranking history; user maintained/generated data; unique words, bigrams, and phrases in anchor text; linkage of independent peers; and document topics.

The document is focused on explaining how these factors could be used to find spamming domains, and how they could be used to determine the freshness or staleness of a page and the freshness or staleness of the criteria Google uses to rank the page.

Freshness and Staleness

Depending on the topic, fresher information may be preferred over stale information, or older, established sources may be preferred over newer ones. Google’s application describes methods for determining whether a document is new or old, fresh or stale.

Fresher documents may show an upward trend in the number of back links over short periods of time, have links from other fresh documents, contain a large percentage of fresh links, or be viewed by users for the same amount of time for similar queries over time.
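To make one of these signals concrete, here is a minimal sketch of a "percentage of fresh links" measure. The patent application does not publish a formula, so the function name, the 90-day window, and the sample dates are all assumptions made for illustration.

```python
from datetime import date, timedelta


def fresh_link_fraction(link_dates, today, window_days=90):
    """Return the share of back links first seen within the last window_days.

    Hypothetical helper: the window length is an assumed parameter,
    not a value taken from the patent application.
    """
    if not link_dates:
        return 0.0
    cutoff = today - timedelta(days=window_days)
    fresh = sum(1 for first_seen in link_dates if first_seen >= cutoff)
    return fresh / len(link_dates)


# Invented sample data: dates on which each back link was first seen.
today = date(2005, 6, 1)
links = [date(2005, 5, 20), date(2005, 4, 2), date(2003, 1, 15), date(2002, 7, 9)]
print(fresh_link_fraction(links, today))  # 0.5
```

Two of the four sample links fall inside the window, so half of this page's links count as fresh. A real system would presumably combine a measure like this with the other signals listed above.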

The freshness of a document is partially determined by the freshness of its anchor text and back links, and in turn, the relevance of links and anchor text to a document is determined by their freshness relative to the document. If anchor text changes when a document goes through a major update, the anchor text of other links may be considered stale, as the focus of the page seems to have changed. Pages that have stale anchor text may have links from certain dates discounted, so that the change in the page’s topic is reflected in the search engine results.
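The discounting idea can be sketched as a weighted count of anchor text, where links first seen before the page's major update count for less. This is only an illustration: the function name, the 0.25 discount, and the sample links are assumptions, and the application does not say how large a discount would be.

```python
from datetime import date


def weighted_anchor_counts(links, update_date, stale_weight=0.25):
    """Sum link weights per anchor text, discounting links older than the update.

    links is a list of (anchor_text, first_seen_date) pairs.
    stale_weight is an assumed discount, not a value from the patent application.
    """
    counts = {}
    for anchor, first_seen in links:
        weight = stale_weight if first_seen < update_date else 1.0
        counts[anchor] = counts.get(anchor, 0.0) + weight
    return counts


# Invented sample: the page changed topic in a major update on 2004-06-01.
sample_links = [
    ("cheap flights", date(2003, 1, 1)),
    ("cheap flights", date(2003, 2, 1)),
    ("hotel reviews", date(2005, 3, 1)),
]
print(weighted_anchor_counts(sample_links, date(2004, 6, 1)))
# {'cheap flights': 0.5, 'hotel reviews': 1.0}
```

In this sample, a single link with the new anchor text outweighs two older links with the stale anchor text, which is how a change in the page's topic could surface in the results.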

The main concern here is to keep results current and accurate, to identify patterns in search relevance over time, and to keep up with the topics in changing web sites. Simply stated, Google is making a concerted effort to provide the most relevant results as Web sites and user interests fluctuate over time.

Finding Spammers

Google apparently plans on using, or is using, historical data to identify spammers. If a Web page shows a spiky rate of growth in back links, shows up in Google for a large number of queries relevant to different topics, acquires back links early on at a fast rate, jumps suddenly in the results, shows spiky growth for a topic that is not mentioned in recent news or user groups, shows spiky growth in back link anchor text words, phrases, or bigrams, or has a large growth in links from independent peers, the page may be considered spam.

If a domain is registered for no more than a year at a time, has false address information or information that changes frequently in the DNS record, changes hosts or name servers frequently, or is associated with an unreliable name server, the domain may be considered a spam domain.
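As a rough illustration of what "spiky growth in back links" could mean in practice, the sketch below flags any period whose count of new links far exceeds the median of the periods before it. The function name, the factor of 5, and the minimum history length are assumptions chosen for the example; the application does not specify a detection rule.

```python
from statistics import median


def has_link_spike(new_links_per_period, factor=5.0, min_history=3):
    """Return True if any period's new-link count exceeds factor x the prior median.

    Hypothetical rule: factor and min_history are assumed thresholds,
    not values from the patent application.
    """
    for i in range(min_history, len(new_links_per_period)):
        baseline = median(new_links_per_period[:i])
        if baseline > 0 and new_links_per_period[i] > factor * baseline:
            return True
    return False


# Invented monthly counts of newly discovered back links.
print(has_link_spike([10, 12, 11, 13, 200]))  # True
print(has_link_spike([10, 12, 14, 16, 18]))   # False
```

A rule this simple would also flag pages that attract links suddenly for legitimate reasons, so on its own it can only mark a page for closer scrutiny.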

Discussion

Using historical data to determine rankings would help Google thwart a common SEO technique called “Google bombing.” Google bombing means getting a large number of links with anchor text containing a keyword phrase, which gets a site ranked well for that phrase.

By checking the history of linking and the history of link text, Google would be able to find spamming domains: domains that get links in large quantities, with SEO-engineered anchor text and very little editorial discretion. The problem for Google would be sorting out topical phenomena (legitimate events that may make a site relevant for a relatively short time), as pages covering these events would presumably show the same linking pattern as spamming domains. This is part of the reason Google talks about freshness in its patent.

SEOs may consider more gradual linking campaigns that show more variation in anchor text. However, if Google is able to realize the goals of its program, it will prevent domains from blasting through the ranks without earning more legitimate credibility from other sites Google considers important. It should be noted that link exchanges (with unrelated pages), link purchases that are designed to give pages better rankings, and free-for-all linking are being targeted. Exactly what types of link purchases are being considered spam isn’t clear, because, thanks to Google, the primary function of almost all links is to get pages better rankings.

Much of what will continue to be effective in Google will be determined by the details of the patent application’s implementation.
