"I was thinking about you all day today and what a great
person you are."
"I wanted to be #1...After 2 months I reached the top position for my most popular keywords."
How Important is PageRank?
The primary driving factor in Google's relevancy algorithm is getting many keyword rich links from many websites hosted on different servers.
When PageRank first came out it was a large part of Google's ranking algorithm, but over time it has been greatly de-emphasized. As much as anything else, PageRank is a marketing tool used to help make Google synonymous with search.
What is PageRank?
Google PageRank is a ranking system which ranks individual documents based on their empirical placement in the web. PageRank estimates the likelihood that a random web surfer would come across a page.
When a page links to your page, it is viewed as an independent vote of quality for your page by that page's author. Google is on a continual rolling update of the web, ranking the entire web in this manner. Many inbound links point to home pages. The closer your page is to the home page (fewer links away), the more important that document must be to your web site.
If you have a great idea within your site it is possible that the page which that great idea is on will have greater link popularity (and thus more PageRank) than your home page.
Where did PageRank Come From?
The PageRank ranking system was new to the internet when Google introduced it, but PageRank uses an off the web idea from long ago. One of the measures of the importance of a research paper or thesis was how many other research papers referenced it.
For example, many hypertext research papers reference As We May Think by Vannevar Bush. My link just voted for As We May Think in a similar way to how so many thesis papers in the past have. (Vannevar Bush created the concept of the Memex - a fundamental step toward a hyperlinked society).
Why does PageRank Work?
PageRank is an effective way of measuring page popularity for two main reasons:
- For general queries this usually helps return home pages.
- Links are one of the quickest and most accurate feedback loops on the web.
How does PageRank work?
PageRank is passed through links. Each page only has so much PageRank which it can pass to other web documents. A dampening factor (approximately 85%) is set to allow the current page's PageRank to propagate through the rest of the web, while preventing all web pages from being ranked at a 10 or a 0.
The outgoing PageRank is split up between all links on that page (including links to other pages within that site). Therefore a PR 5 link on a page with 10 links is worth far more than a PR 5 link on a page with 100 links. One PR 8 link may be worth more than a thousand PR 3 links.
The value of any given link is hard to determine exactly. Generally, though, fewer links on a page means that more of that page's PageRank will be parceled out to your site. Also, the higher the PageRank of a page, the more PageRank it can pass on to other pages.
A page does not lose PageRank by linking out to other sites (unless it links out to penalized sites), it just has less PageRank to share amongst the other documents it is linking to.
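To put rough numbers on the link-splitting idea above, here is a minimal Python sketch. It assumes the share passed through one link is the dampening factor times the page's PageRank divided by its link count. The function name link_share is made up for illustration, and the scores are treated as raw PageRank values rather than toolbar values.

```python
def link_share(page_rank, num_links, d=0.85):
    """Approximate PageRank passed through ONE link on a page.

    Hypothetical helper: page_rank is a raw score (not the 0-10 toolbar
    value), num_links counts every link on the page, and d is the
    dampening factor.
    """
    if num_links <= 0:
        raise ValueError("a page needs at least one link to pass PageRank")
    return d * page_rank / num_links


few_links = link_share(5.0, 10)    # page with 10 links
many_links = link_share(5.0, 100)  # same page score, but 100 links
print(few_links, many_links, few_links / many_links)
```

Under these assumptions the link on the 10 link page passes roughly ten times as much as the link on the 100 link page, even though both pages have the same score.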
What is the PageRank calculation?
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
d = dampening factor (~0.85)
T1 ... Tn = the pages that link to page A
C(T) = number of outbound links on page T
PR(T1)/C(T1) = PageRank of page T1 divided by the total number of links on page T1 (the transferred PageRank)
In text: for any given page A, the PageRank PR(A) is equal to the sum of the partial PageRank parceled out by each page pointing at it, multiplied by the dampening factor, plus one minus the dampening factor.
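The formula above can be applied over and over until the numbers settle down. The Python sketch below does that for a made-up three page web. The page names, the link graph, the starting value of 1.0 and the fixed 50 passes are all assumptions for illustration, not a description of how Google actually runs the calculation.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(T)/C(T)).

    links maps each page to the list of pages it links to. This sketch
    assumes every page has at least one outbound link and that every
    link target is itself a key in links.
    """
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # assumed starting value
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum the share handed over by every page that links here.
            incoming = sum(
                pr[src] / len(links[src])
                for src in pages
                if page in links[src]
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web page C ends up with the most PageRank because it collects links from both A and B, while B, which only gets half of A's vote, ends up with the least.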
What scale does PageRank use?
PageRank uses a logarithmic scale. This means that it is harder to go from PR 6 to PR 7 than it is to go from PR 3 to PR 4. This is why there are tons of PR 2, PR 3, and PR 4 sites, but few PR 8, PR 9, or PR 10 sites. Only a few sites have a PR 10.
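Google has not published how the displayed number is derived from the underlying score, so the following Python sketch is only a guess at the shape of the scale. The base of 6 and the function name raw_score_needed are assumptions picked purely for illustration.

```python
ASSUMED_BASE = 6  # hypothetical; the real base is not public


def raw_score_needed(toolbar_pr, base=ASSUMED_BASE):
    """Raw score at which a page would first show the given PR value,
    assuming each step up needs `base` times more raw PageRank."""
    return base ** toolbar_pr


# Compare how much raw score separates a low step from a high step.
for low, high in [(3, 4), (6, 7)]:
    gap = raw_score_needed(high) - raw_score_needed(low)
    print(f"PR {low} -> PR {high}: raw gain of {gap}")
```

Whatever the real base is, the pattern is the same: each step up the scale takes a multiple of the link popularity the previous step took, so the gap at the top is far wider than the gap at the bottom.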
Problems with PageRank
While PageRank works well in a controlled academic environment it falls apart on the commercial web.
As with any democratic system there are problems with the PageRank system. PageRank assumes each link is a valid vote for a web site. Some links are not really valid links at all. People spam guest books, though that technique is becoming less effective. A newer technique of PageRank spam is to leave a bogus comment in a blog with a highly optimized link. People have also begun to sell links with high PageRank (this is huge now). These are all blatant PageRank manipulation techniques.
Blogs and PageRank:
In addition, blogs contain a ton of links interacting back and forth. Many argue that this is really destroying the PageRank system, but I disagree. Many of the top blogs are from book authors and other highly opinionated resources. These blogs are linked to a ton of times BECAUSE they have good information in them... that is the whole point of PageRank. In addition, blogs are rarely optimized for search engine rankings the way many other sites are. The usual goal of most blogs is just to express an opinion.
PageRank and New Sites:
Another problem with PageRank is that it presents a bias against new web sites. Currently Google gives new sites a boost to make up for this, but mixing the new web pages in is very hard to do. Obviously the Google algorithm errs on the side of being at least slightly biased toward PageRank; otherwise Google would undermine its entire ranking system.
Solutions to PageRank Problems
While some fields (such as internet marketing) may be over represented in the web as a whole, usually the best ideas in any field are the ones which gain high link popularity. Where PageRank really falls apart is when people rent or sell links across fields.
Slowing the Sale of PageRank:
As far as selling PageRank goes, Google has aimed to fight this off by:
- only showing some backlinks;
- blocking some sites from passing PageRank; and
- showing old or inaccurate PageRank. Google may start updating PageRank quarterly now.
Personalization, Semantics, & Clustering:
To fight the cross topic link selling and manipulation Google may eventually use semantics and clustering to theme similar pages together. Using semantic technologies does not require Google to exactly understand the value or idea of a page, but just to take a rough approximation of the value of it.
This will allow faster computation of PageRank and lessen the ability of sites to manipulate the system.
Some of the Technologies Google may use:
Stopping Blog Spam:
As far as the spam links being stopped ... the spammers are annoying a ton of bloggers with the spam. MovableType already created a comment spam redirect. Jay Allen created MT blacklist, and a hack is on the market which closes out comments after a set period of time.
PageRank is not one of the most important parts of Google's relevancy algorithm but is one of Google's most important marketing elements.
A single high PageRank link can give you a high PageRank. The primary driver for the Google relevancy algorithm is getting many keyword rich inbound links from many locations.
Check Your PageRank
- Download the Google Toolbar
- Google Directory (shows PageRank on a different scale than the toolbar. It shows PageRank on a 1 - 8 type scale and only lists sites which are in the ODP.)
- View PageRank Without the Toolbar
- PageRank for Macs - free PageRank display tool by Digital Point
- PageRank Explained - great PDF by Chris Ridings
- PageRank: a True Commodity - an article by me :)
- Handy Dandy Google PageRank Figuring Guide
- Efactory PageRank Review
- Phil Craven & his PageRank Calculator
- Ian Rogers' PageRank Explanation
SEO Tools Involved with PageRank
- SEO Guy - Offers PageRank by Search Query. You set the PageRank level and make a query. It will return pages which have a PageRank of that level or more.
- OptiLink Link Analysis Software - organizes backlinks in order of PageRank.
- Digital Point Keyword Ranking Tool - tool uses Google's API to track backlinks and PageRank.
- RustyBricks Link Analysis Tool - tool uses Google's API to track backlinks and PageRank.
- PageRank Citation Ranking - Bringing order to the web
- Taher H. Haveliwala - Includes topic specific PageRank, Scaleable Techniques for Clustering the web, and many others