In SEO, you often hear about duplicate content and the issues it creates. These "issues" are real: left uncorrected, duplicate content splits your pages' authority and can cost a website a large percentage of its potential ranking ability.
What are some of the types of duplicate content? And what can be done to manage, prevent or minimize their impact on rankings or site performance?
- Canonical variations of the same page, such as http://yoursite.com versus http://www.yoursite.com.
- Shopping carts using hand-me-down product descriptions from manufacturers.
- Pages that use the same template with very little distinction from each other.
- Block-level shingles (content copied from page to page or from other websites).
As you can see from the brief list above, there are multiple species of duplicate content, and each should be eliminated or managed at the level of your content management system, your .htaccess file, or through the uniqueness of the content on each page.
Today we are only addressing the type created within your own website (typically the result of thin content or the indexing of near-identical pages) and ways to avoid or eliminate it.
We often emphasize the importance of uniqueness on a page-by-page level within a website. Having 1,000 pages where 800 of them differ by only 12% (a different title, model number, photo and brief description) doesn't qualify as significant distinction.
Common Duplicate Content Solutions
There are a few different things you can do to eliminate duplicate content, resolve canonical issues, or control which pages are indexed and which are not.
1) Write enough content to shift the collective focus of the page.
2) Personalize data feeds or commonly used databases of manufacturer makes and model numbers for ecommerce products.
3) Add a custom field or segment on your ecommerce pages to create their own digital footprint.
4) Consider using meta robots tags or robots.txt to block or prevent indexing of extremely similar pages (which alleviates the penalty).
5) Use a canonical tag so that pages that are linked like a daisy chain (or breadcrumbs) all link back to the leader as the primary index page. This is very useful for deep categories where getting visitors to the main category page is sufficient from an SEO perspective.
6) Personalize footer links to avoid diffusing each page's topic sitewide by making every page similar.
From the perspective of search engines: if a page has nothing to contribute, and your website already has 20 other pages with the same look, feel, code and context, you can't expect them to waste space in their primary index on cookie-cutter pages.
To briefly address the points above:
For page-to-page canonical issues, use a 301 redirect by adjusting your .htaccess file (provided you are running a Linux/Apache server). More information on 301 redirects and how to implement them is available from Taming the Beast.
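As a minimal sketch of that .htaccess approach (assuming Apache with mod_rewrite enabled, and that www.yoursite.com stands in for your preferred hostname):

```apache
# Permanently (301) redirect all non-www requests to the www version
RewriteEngine On
RewriteCond %{HTTP_HOST} ^yoursite\.com$ [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]
```

If you prefer the non-www version instead, simply invert the condition and target; the important thing is picking one version and redirecting the other to it.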
Once fixed, depending on the type of site-level duplication, your link flow will consolidate within the site, and you can then use internal linking to shore up segments of your website that may be depleted or starved for link flow.
Also, if you have multiple versions of your homepage, make sure to consolidate them using the same page-level redirect method to one specific version: http://www.yoursite.com/ rather than http://www.yoursite.com/index.html or http://www.yoursite.com/default.htm (or .aspx, .shtml, .cfm, etc.).
Also, when linking to those pages (both within your own navigation and from other sites), make sure you use one consistent page / naming convention / URL.
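A hedged sketch of the homepage consolidation, again assuming Apache with mod_rewrite (the file extensions shown are examples; match whatever your server actually uses):

```apache
# Redirect explicit requests for index files back to the root URL.
# Matching against THE_REQUEST (the raw request line) avoids redirect
# loops caused by DirectoryIndex serving index.html internally for "/".
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.(html?|php)\ HTTP [NC]
RewriteRule ^index\.(html?|php)$ http://www.yoursite.com/ [R=301,L]
```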
Solutions for Multiple Pages with Similar Content
There are a few ways to remedy this. You can use a canonical tag to refer back to the original page or manage what gets indexed to avoid duplication.
If you have access to personalize the headers of the page, then you can implement a simple meta tag.
NOINDEX, FOLLOW – invites spiders / user agents to crawl your page but prevents that page from being indexed. Search engines will still follow the links, count them as ranking signals, and discover new pages through them.
This would be the preferred setting for ecommerce pages if you wanted to emphasize the link back to the main category page and apply your SEO efforts to make the category your preferred landing page for rankings and traffic from natural search.
INDEX, NOFOLLOW – tells search engine spiders / user agents to index the page, but not to follow the links on the page or count them in their algorithms.
NOINDEX, NOFOLLOW – informs them that you gave at the office and would rather not be bothered. In other words, the page will not be included in the search engine index, and its links are neither followed nor treated with any significance.
The most common meta tag setting is INDEX, FOLLOW, which is like leaving a plate of cookies out for Santa: it grants full permission and welcomes spiders in to crawl, index, sort and earmark pages for information retrieval. The same can be done at the server level with a robots.txt file. For more information on robots.txt, follow the link.
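For illustration, the meta tag version goes in the head of each near-duplicate page (the NOINDEX, FOLLOW variant from above):

```html
<!-- Crawl this page and pass link value through, but keep it out of the index -->
<meta name="robots" content="noindex, follow">
```

The server-level equivalent blocks crawling of a whole section in robots.txt (the /printer-friendly/ path here is a hypothetical example of a directory full of near-duplicate pages):

```text
User-agent: *
Disallow: /printer-friendly/
```

Note the difference: the meta tag lets spiders crawl the page and follow its links, while robots.txt prevents crawling altogether.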
If you have dynamically loading pages that insert tracking cookies, session IDs, or other unique strings into your URLs, then using the canonical tag is the best way to ensure your pages are not creating duplication across your entire content management system.
A canonical tag is a tag that the three major search engines (Google, Yahoo and Bing) recently started to support. It eliminates clutter in their indexes by preventing multiple slightly different variations of a URL from appearing as more than one page.
By adding the canonical tag, you ensure that, much like Highlander, there can be only one: the main page gets the credit, rankings and authority. For more information on canonical tags and how to implement them, follow the link.
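As a sketch, the tag goes in the head of every URL variation and points at the one preferred version (the product URL below is a hypothetical example):

```html
<!-- On /blue-widget.html?sessionid=XYZ and any other variation,
     point back to the single preferred URL -->
<link rel="canonical" href="http://www.yoursite.com/widgets/blue-widget.html">
```

Session-ID and tracking-parameter variations then consolidate their credit onto that one canonical URL.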
Other Methods to Eliminate Duplicate Content
Add unique content to pages that require more emphasis. This includes rewriting the descriptions of your top-selling products if you are using a shopping cart, to avoid duplicating shingles from other sites selling the same products.
At first you think: who has the time to rewrite hundreds or thousands of product descriptions? The answer: the companies that know the value of unique content and authority, and what it means to rank head and shoulders above the competition with a fraction of the backlinks or SEO effort required to get there.
Every word you add to a document changes the landscape of that page. The more unique the content, and the more frequently it references the primary keywords the page targets, the easier it is to separate that page from the other pages built on the same template.
Navigation is another area of concern for nested pages: it can cripple your near-duplicate pages by cluttering them with the same numerous "me too" links as every other page, completely diffusing the UNIQUENESS of each page.
You can tuck navigation behind an IFRAME, use nofollow, or route navigation links through a /CGI-BIN/ script where all of the redirects happen on a page not followed by robots or blocked by robots.txt.
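The nofollow option, for example, is just an attribute on each navigation link (the category URL here is hypothetical):

```html
<!-- Navigation link hinting that no link value should flow through it -->
<a href="/category/widgets/" rel="nofollow">Widgets</a>
```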
However, whether they are followed or not, the presence of all of those links creates a page impression that, for lack of a better term, skews your SKUs (stock keeping units) by slightly obscuring their focus.
The advantage of the CGI-BIN method is that you can then insert includes or use contextual links (links in the body copy of the document) to connect relevant pages with keyword-rich anchor text, increasing relevance and ranking position.
Footer links or breadcrumbs would then carry more weight in the algorithm, since not every link is WIDE OPEN on your pages. Although this falls into the category of page sculpting, it is still important to understand the premise behind the topic.
Another Advanced SEO Tactic
1) Create a custom text field somewhere on the page through the content management system.
2) Populate that content through an XML feed or pull data from a blog. The blog could be set to noindex, follow and link to the new destination landing page, and the landing page showcases the feed data as its own.
3) Or for a low tech solution, simply write 250-400 words on the topic and insert that into the custom text field.
As you can see, where there is a will, there is a way. The blog method above lets you easily manage content through a CMS, keep that content from showing up in search engines on the blog itself (by setting it to archive or noindex), and then use the aggregated data you created to rank the page of your choice.
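A minimal sketch of step 2 above, assuming a simple RSS 2.0 feed (in practice you would fetch the feed from your blog over HTTP and write the result into your CMS's custom field; the function and field names here are illustrative, not from any particular CMS):

```python
# Sketch: turn blog feed items into an HTML snippet for a custom text field.
import xml.etree.ElementTree as ET

def feed_to_custom_field(rss_xml: str, max_items: int = 3) -> str:
    """Extract item titles and descriptions from an RSS 2.0 feed string
    and format them as an HTML snippet for a custom text field."""
    root = ET.fromstring(rss_xml)
    blocks = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        desc = item.findtext("description", default="").strip()
        blocks.append(f"<h3>{title}</h3>\n<p>{desc}</p>")
        if len(blocks) >= max_items:
            break
    return "\n".join(blocks)

# Hypothetical feed content standing in for your blog's RSS output
sample_feed = """<rss version="2.0"><channel>
  <item><title>Widget care tips</title>
        <description>How to keep your widget running.</description></item>
</channel></rss>"""

print(feed_to_custom_field(sample_feed))
```

The landing page then showcases this generated snippet as its own unique content, while the blog posts themselves stay out of the index.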
You could also tag the content, and aggregate it through additional means like Google Base and other RSS based feeds for additional leverage.
The real takeaway here is: if you want to distinguish your website in search engines, be ready to take an extra step and implement solutions that support granular changes to URL naming conventions, on-page content, link structure and CMS architecture. All of these changes are secondary to thoughtful planning and can be avoided entirely by building the site properly in the first place.
However, since that is not always an option, having the ability to implement distinct modifications that produce optimal ranking factors means creating custom programs, hacks and workarounds to adapt, modify and optimize your CMS: adding content on the fly, changing or modifying header information, rewriting titles and more.
In the event that a redesign is not applicable, there are plugins that can help distinguish your money pages using some of the tactics discussed above.
Also, did I mention it is free? If you use the WordPress platform, we invite you to download our SEO Ultimate plugin, which can rewrite titles and meta data, manage tag and archive pages, adjust headers, track 404 errors, edit the robots.txt or .htaccess file, toggle which modules are active, and provide numerous other SEO features.
As stated above, it’s better to build it right the first time, but just in case you can’t, there are always solutions for common SEO problems.