Thanks to this interview, a number of lingering misconceptions about SEO can be laid to rest or dismissed as conjecture, as this is as clear an official statement as we are likely to get from Google regarding the inner workings of their search engine. This is not the interview itself (you can read that here), but rather our take on the topics discussed.
Getting answers directly from the source was a great relief for many advanced SEOs, who now have a baseline (for now) to use as a barometer for the way Google crawls, indexes and interprets pages, duplicate content and the underlying structure of a website.
Using some of the tips he suggested may provide insight into problem areas which, until now, may have been improperly diagnosed. There was plenty for SEOs to dig into, as the tidbits pertain to a wide variety of conditions that must be anticipated, revised and optimized in a website in order to reap the rewards of a top SERP (search engine result page) position.
They touched upon crawl budget, whereby Matt dismissed the notion of a hard index cap (the idea that a website only has an allotted volume of pages that can be indexed); instead, he explained that the importance of a site to their web crawlers is roughly proportionate to the amount of PageRank within that site.
This means if you want more PageRank, then either (1) create more pages (more pages mean more PageRank) or (2) acquire quality links from sites replete with PageRank. More importantly, if your website gains buoyancy as a result of its own content and domain authority, make sure that your pages are not orphaned and have a sufficient number of internal and/or external links to support them (like branches from the base of a tree), so they do not suffer from link attrition.
You will also note that although PageRank is a primary metric governing crawl depth, site authority (the aggregate trust and authority of a site) can also allow PageRank to accumulate, even when there are no inbound links from other sites to individual pages.
We always suggest acquiring a minimum of 5-10 inbound links to the pages that matter for ranking, even if they only serve as a second-level push (pages that link to primary pages). These links effectively act as link equity and/or link insurance for a page as it ages, so it can accrue ranking factor and pass it along to your primary / more significant pages deemed worthy for commerce or conversion.
By the same token, the number of internal or external links (that pass value) required changes based upon the competitiveness of the keyword. Ranking credit does, however, apply to a site that is replete with authority (meaning PageRank can move around and strengthen pages further away from the root).
Also, Matt touched on duplicate content, the importance of site architecture, how to reduce canonical conflict, and off-page considerations such as the server environment and how it may impact indexation (how deep Googlebot crawls) based on how many simultaneous connections and how much bandwidth are allotted to your web property.
Under this assumption, it is not a matter of speed as much as it is bandwidth. Page speed and loading time are factors; whether and how they impact rankings at this point is uncertain. However, if you are hosting your website on a shared server and have noticed a dip in pages indexed, you may want to consider a more aggressive package with more bandwidth.
As addenda to the topics they covered, we would like to touch upon:
- Proper Site Architecture & Page Rank Decay
- Duplicate Content and Canonical Issues
Proper Site Architecture and PageRank Decay: Funneling the appropriate link equity through site architecture is a basic premise in search engine optimization. It utilizes the number of inbound and outbound links on each page to create a relative hierarchy of pages representing the most commercial or prevalent value to users.
The homepage is by default the strongest page in a website; however, internal links, how many links leave each page, deep links, and whether or not pages are built into the primary navigation also play a role in this ordering of PageRank within a site.
Also, keep in mind (based on our testing) that as you move further away from the root folder, PageRank tends to decay, and it becomes harder to make pages nested in subfolders rank unless they are augmented by (a) more internal links or (b) more citations from other websites via inbound links as a result of peer review.
With this in mind, you can use theming and siloing (creating a relevant hierarchy of themed / similar content in their respective folders with keyword-rich naming conventions) to elevate relevance and communicate to both users and search engines the subject matter of those pages.
However, in the event you want to create a prominent catch-all page to consolidate that ranking factor, you should ideally use flat site architecture where possible (with keyword-rich and inclusive naming conventions) to create the preferred landing page and extract link equity from the themed pages through internal linking and co-occurrence.
For example, using the keyword “conversion optimization,” you may have 100 pages of nested content (such as articles and blog posts deeper in the site), but you should place the conversion optimization landing page on the root.
As a testament to the tactic, search for conversion optimization services in Google; the ranking for our site was created in this fashion. The page itself has only 5 external inbound links, but the on-page authority the page garners as a result of sitting at the root of the domain, with a few choice links from the proper areas of the site, is all it needs to maintain this position.
This is easily done by creating a structured format for link flow to move through the site. Wikipedia is a prime example of virtual theming (meaning they use lateral co-occurrence of keywords to cross-link individual keywords and pages).
You can use virtual theming, theming and siloing or deep links and flat site architecture in tandem to dial in a number of specific rankings (based on their monetary value to you).
If you note the destination of any link on any page in Wikipedia, you will see that every time that keyword occurs on another page in the site (a page that is indexed and has ranking factor), it passes link flow back to the primary landing page most appropriate for that keyword.
You can implement the same type of landing pages with flat site architecture (meaning they are in the root or close to it) so that each page is no more than 3 clicks away from the homepage or easily accessible from the primary navigation. Any link built from the homepage (or pages in the root) and/or built into the primary navigation typically carries more weight in how search engines interpret this hierarchy vs. pages further away.
He further explains: “A good example might be to start with your ten best selling products, put those on the front page, and then on those product pages you could have links to your next ten or hundred best selling products. Each product could have ten links, and each of those ten links could point to ten other products that are selling relatively well.” This is SEO at its finest and a perfect instance of doing things properly in the first place, vs. trying to patch a leaky bucket or fix suboptimal site architecture.
In their interview, they covered a lot of topics aside from the ones we rehashed above.
Matt also touched on:
- Using robots.txt to block folders that have ads or affiliate offers, so they don’t get crawled.
- 301 redirects and how there is a slight decay, such that the total link flow may not reach the new destination page.
- PDF files and how they are crawled as pages and much much more.
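To illustrate the robots.txt point above, blocking a folder of ads or affiliate offers from crawling might look like this (the folder names here are hypothetical; substitute your own):

```
# Block all crawlers from the ad and affiliate folders
User-agent: *
Disallow: /ads/
Disallow: /affiliates/
```

Place this in a robots.txt file at the root of the domain; disallowed folders are not crawled, so no link flow is wasted on them.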
Here is a quick SEO tip regarding PDF files: always use nofollow on PDF links, or make sure the PDFs contain links back to your main pages; otherwise they can leak PageRank and link flow.
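Applying nofollow to a PDF link is a one-attribute change in the anchor tag (the file name below is hypothetical):

```
<!-- rel="nofollow" asks search engines not to pass PageRank through this link -->
<a href="/downloads/whitepaper.pdf" rel="nofollow">Download the whitepaper (PDF)</a>
```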
In closing, here is the link to the interview once again, and thanks to both Eric and Matt for covering these topics in such great detail.