Managing crawl rates and ensuring your website has enough internal links or sitemaps to get search engine spiders deep into the site's architecture is a critical component of search engine optimization. For a page on your website to rank, or in this case even matter to a search engine, it needs to be present in the cache as a result of being indexed.
Countless horror stories occur when pages fall out of the index. This can happen from (a) misuse of the .htaccess file, which may inadvertently block spiders from crawling the page, (b) changes to navigation, or (c) changes to file names and/or naming conventions that result in broken links.
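As an illustration of how easily scenario (a) can happen, consider the hypothetical .htaccess rule below. It is aimed at blocking bad bots, but the pattern also matches legitimate crawlers such as Googlebot, whose user-agent string contains "bot":

```apache
# Intended to block scrapers, but the pattern "bot" also matches
# Googlebot, Bingbot, and other legitimate search engine spiders,
# returning 403 Forbidden and eventually dropping pages from the index.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} bot [NC]
RewriteRule .* - [F]
```

A safer rule would name the specific offending user agents rather than a substring shared by the crawlers you depend on.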
Although rankings typically resume when the broken links are fixed and re-crawled, or when the .htaccess file is restored, it can be a harrowing experience for any business.
There are certain protocols you should remember about sitemaps. If you do not have a cron job (script) that monitors the addition or removal of pages on your website and automatically updates the sitemap, then make sure to upload a fresh copy whenever you make changes to the site.
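If you do not have such a script, a minimal sketch of one is below. It assumes a hypothetical document root of static .html files and emits a standard sitemap; a cron job could run it after each site change and write the result to sitemap.xml:

```python
import os
from xml.sax.saxutils import escape

def build_sitemap(root_dir, base_url):
    """Walk root_dir and emit a sitemap <url> entry for every .html file.

    root_dir and base_url are placeholders; point them at your own
    document root and domain. Dynamic sites would instead enumerate
    URLs from their database or routing table.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            if not name.endswith(".html"):
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), root_dir)
            loc = base_url.rstrip("/") + "/" + rel.replace(os.sep, "/")
            entries.append("  <url><loc>%s</loc></url>" % escape(loc))
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            + "\n".join(entries) + "\n</urlset>\n")
```

A crontab line such as `0 3 * * * python build_sitemap.py` (with the output redirected to your web root) would keep the file fresh without manual uploads.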
Google Webmaster Tools allows you to track how frequently your sitemaps are crawled, so it is your first line of defense for making sure your pages get into, and stay in, the index.
Otherwise, one way to expedite crawl rates is to acquire links from sites with a higher crawl frequency. Any website capable of hosting a do-follow link that spiders will traverse is suitable for igniting a fresh crawl.
The larger the site, the less frequent the crawl for pages further from the root folder, particularly on sites with repeated elements such as shopping carts, or pages that do not have enough content to stand on their own as a genuinely unique contribution.
A tiered approach works well if you have hundreds of pages tucked away in subfolders: you simply have a master sitemap that links to subordinate sitemaps. On a site with 10 subfolders, each folder should have its own sitemap, and all 10 should be linked from the main sitemap.
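The master sitemap in this scheme is a standard sitemap index file. A minimal example, with hypothetical subfolder names, might look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://domain.com/products/sitemap.xml</loc></sitemap>
  <sitemap><loc>https://domain.com/articles/sitemap.xml</loc></sitemap>
  <!-- one <sitemap> entry per subfolder sitemap -->
</sitemapindex>
```

You submit only this index file to the search engines; each subordinate sitemap is then discovered through it.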
That way, when one is crawled, the others are included in that crawl and any changes you may have made to any of the pages all make their way into the search engine result pages.
If you find your website has become unruly and has too many sub folders, then flattening site architecture is another solution that can remedy deficiencies in link flow and revitalize languishing pages.
For example, if you have a dynamic site, you could simply copy the contents of a page and reposition it. If the page were domain.com/products/electronics/dvd-players/black-sony-model1234.html, you could create a static page close to the root and flatten the site's subfolders by consolidating the naming conventions found in the entire URL string.
The revised page could be domain.com/electronic-products-black-sony-dvd-player-model1234.html, which puts the page in the root folder (which inherently has more link flow than other areas of the site).
As a result, that page can hold its own and enjoy a more frequent crawl rate (provided it is included in sitemaps, internal links, or navigation). The takeaway is that it is easier for user agents (a.k.a. the search engine spiders that crawl the web) to locate, index, and update changes to the file.
The last step would be to 301 redirect the old file location, domain.com/products/electronics/dvd-players/black-sony-model1234.html, to the new location, domain.com/electronic-products-black-sony-dvd-player-model1234.html, which can often be accomplished with a shopping cart setting.
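If the cart does not offer a redirect setting, the same 301 can usually be declared in .htaccess on an Apache server; the paths below are the hypothetical ones from the example above:

```apache
# Permanently redirect the old deep URL to the new flattened URL
Redirect 301 /products/electronics/dvd-players/black-sony-model1234.html /electronic-products-black-sony-dvd-player-model1234.html
```

The 301 (permanent) status tells spiders to transfer the old URL's equity to the new location rather than treat it as a broken link.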
After that, you still need to get spiders back to the page, since files that far from the root have a longer crawl cycle; by building links to either the old location or the new one, you will trigger the positive SEO effects of the page's recent transition.