The process of SEO frequently requires that you conduct periodic sweeps of the various platforms that make a website operational and either contribute to or impede search engine rankings.
Sweeps like this often include everything from assessing spider traps (where PHP can loop and promote over-indexation) and server load times to debugging blocked site segments, battling duplicate content issues, or patching up inconsistencies in dynamic code caused by performance-based anomalies.
After sifting through Google Webmaster Tools a few days back (during an investigative sweep), we noticed a strange occurrence of internal duplicate content creeping into our CMS (content management system) and wreaking havoc.
Fortunately, after delving in with our development team and virtually decompiling WordPress hooks and the core code looking for the source (obvious overkill for the solution), we found an interesting plugin that fixes the problem (the case of the wildcard category): Permalink Validator by Rolf Kristensen.
The symptom was that we could change the name of the category in the URL (such as SEO to SEO123) and the code would not generate a 404 error. Generally, this would not have created concern; however, if you happen to have multiple plugins installed, they often do not play nicely with each other.
Ideally, if you do find a critical conflict, you can 1) back up your database and 2) start disabling plugins one by one to determine whether you are still experiencing the problem.
After going through this preliminary step, we were still able to change the categories (using wildcard/anything-goes numbers, letters, etc.) and had to escalate the issue, looking deeper into the core code.
Another consideration: if you are still using plugins that have not been updated for some time, updates to the WordPress core can change the code around them, which can create bugs or multiple conflicts if left unchecked.
In this instance, the rewrite functionality was not triggering a 404 error, which left our site architecture open to interpretation and, even worse, to duplicate content. It is not uncommon for content management systems to duplicate posts or pages across multiple categories. Having the proper HTTP status codes returned in the server header is critical for effective search engine optimization.
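A quick way to test for this symptom is to request a made-up variant of a real category URL and look at the status code that comes back. The following is a minimal sketch using only the Python standard library; the helper names (`fetch_status`, `bogus_variant`, `leaks_wildcard`) and the example URL are our own illustrative assumptions, not part of WordPress or any plugin.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_status(url, timeout=10):
    """Return the final HTTP status code for url (redirects are followed)."""
    request = Request(url, method="HEAD", headers={"User-Agent": "status-check"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is what we want.
        return error.code


def bogus_variant(url, suffix="123"):
    """Append suffix to the last path segment: /category/seo/ -> /category/seo123/."""
    trailing = "/" if url.endswith("/") else ""
    return url.rstrip("/") + suffix + trailing


def leaks_wildcard(url, fetch=fetch_status):
    """True if a made-up variant of url answers with anything other than 404.

    Assumption: a redirect back to the real URL also counts as a leak here,
    because fetch_status follows redirects and reports the final status.
    """
    return fetch(bogus_variant(url)) != 404


if __name__ == "__main__":
    # Hypothetical URL; replace it with a real category page on your own site.
    print(leaks_wildcard("http://www.example.com/category/seo/"))
```

The `fetch` parameter is there so the check can be exercised without network access; in normal use the default is enough.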
If 404 errors are not triggered, the site can end up serving spoofed URLs, or categories, subfolders and pages that share similar shingles (groups of words) across multiple pages. Search engines in turn have a tendency to deindex duplicate pages and discard them into a secondary / supplemental index.
If pages are supplemental, they are removed from the main search index, and any significant rankings those pages garnered disappear with them. For a 30-page website, this is of little concern; however, for a website with hundreds or thousands of pages, tags, feeds and aggregators, this type of disruption can be a costly setback.
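To make the idea of shingles concrete, here is a small sketch that splits two blocks of text into overlapping word groups and measures how many they share. The shingle size of three words and the use of Jaccard overlap are assumptions chosen for illustration; this is not a description of how any search engine actually scores duplicates.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word groups (shingles) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single group.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def shingle_overlap(text_a, text_b, size=3):
    """Jaccard overlap of two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Two category pages that list the same posts under different URLs would score at or near 1.0, which is the kind of overlap described above.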
Also, since WordPress by default places /category/ before the category name in the URL, you can remedy that with a plugin such as Top Level Categories, which is ideal if you are using the /%category%/%postname%/ custom option in your permalink settings to theme and silo your content with a logical site structure.
After we updated the plugins (and cleaned house on disabled plugins, something you should do if they are not in use), the Permalink Validator WordPress plugin worked just fine in preventing the wildcard categories.
A great place to sniff out errors like these is the HTML Suggestions tab in Google Webmaster Tools. If you see duplicate meta descriptions and title tags, then you know the typical rewrite functions are malfunctioning.
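If you have a list of your own URLs and their titles (from a crawl or an export), the same check can be done by hand. The sketch below groups URLs that share a title; the function name and the sample data are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def duplicate_titles(titles_by_url):
    """Group URLs whose titles match after lowercasing and collapsing whitespace."""
    groups = defaultdict(list)
    for url, title in titles_by_url.items():
        normalized = " ".join(title.lower().split())
        groups[normalized].append(url)
    # Keep only titles that appear on more than one URL.
    return {title: sorted(urls) for title, urls in groups.items() if len(urls) > 1}


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = {
        "/category/seo/": "SEO Tips",
        "/category/seo123/": "SEO  Tips",
        "/about/": "About Us",
    }
    print(duplicate_titles(sample))
```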
You also want to make sure that all canonical issues are resolved so that your website resolves to one preferred URL structure: either http:// or http://www. followed by your domain name. A quick way to determine whether your website is suffering from duplicate content in the index is to search for site:domain.com, then site:www.domain.com and site:domain.com/. If all three return the same number of indexed pages, your canonical settings are ideal.
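As a small convenience, the three queries can be generated from a domain name and the counts you read off the results pages compared. This is only a sketch: the function names are our own, and the counts are assumed to be typed in by hand, since no search engine API is used here.

```python
def site_queries(domain):
    """Build the three site: queries described above for a bare or www domain."""
    bare = domain.strip().lower().rstrip("/")
    if bare.startswith("www."):
        bare = bare[4:]
    return [f"site:{bare}", f"site:www.{bare}", f"site:{bare}/"]


def counts_match(counts):
    """True if every query returned the same number of indexed pages."""
    return len(set(counts)) == 1


if __name__ == "__main__":
    for query in site_queries("www.domain.com"):
        print(query)
    # Hypothetical counts copied by hand from the results pages.
    print(counts_match([120, 120, 120]))
```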
Another must-have SEO plugin for WordPress is SEO Ultimate (created by John Lamansky & SEO Design Solutions), which currently offers over a dozen unique features in one stable plugin, allowing you to do everything from rewriting titles, tags and categories to tracking 404 errors and reducing pagination issues and duplicate content with canonical tags, and much more.
The moral of the story is that the more moving parts you have, the more things can break under the wrong circumstances. Make sure you take the time to update your plugins and keep abreast of critical security patches for WordPress as they arise. As the saying goes, an ounce of prevention is worth a pound of cure.
Help!!! lol…
As I mentioned before, I had a very old and defunct blog on wordpress.com that was still getting 3k uniques per month for some odd reason despite a PR drop from 4 to 2.
I used that content to start a new blog on a self-hosted wordpress site. Worried about the duplicate content penalty, I decided to redirect that blog from wordpress.com. They allow you to do this for $10 per year by using the domain mapping feature.
After doing this, all of my overlapping posts that are redirected have now been flagged by Google Webmaster Tools as duplicates.
Oddly enough, I thought that I would get another 100 hits per day like the old site was doing, but my traffic has only gone up about 30 hits.
Any ideas on how I can solve the duplicate content flag on my redirected posts from wordpress.com?
Jarret:
301 redirects do take time to kick in (and resolve). Also:
1. Did you redirect page by page, folder by folder or wild card style all to the root?
2. Did you have significant link flow going to deep pages via deep links?
3. Were you coasting on the authority of the WordPress domain or did you have other quality inbound links?
4. Is the real domain (legacy domain) gone or does it resolve?
Things do take time, but Google is usually good at figuring this type of thing out. You may lose the legacy links (to a degree), but as long as you are active and publishing fresh content, you do not have to rely on past-performance to deliver relevant traffic now.
thanks for your sharing
Thanks for the useful post. Many people jump into WordPress thinking that they do not need SEO if they use WordPress as a blog platform. The sheer fact that there are lots of plugins that aid SEO has fueled this myth… I used SEO Ultimate for one of my client blogs and I must say I am impressed.
Jefferey,
Thank you for your prompt reply.
1. I don’t quite know how the WordPress domain mapping feature specifically works since it’s a one-click option, but it does redirect to categories, pages, and posts.
2. I had linked many of the internal pages and posts with links on the old blog.
3. Yes, the old blog did have quality inbound links from authority sites. Lots of sidebar/blogroll links from relevant high PR blogs too. I know lately the value of sidebar/blogroll links has supposedly diminished. It also had quality backlinks to specific posts.
4. It’s a yearly renewal, so I do think that the legacy domain will resolve.
I guess it will likely take some time for the 301 redirects to sort out in Google.
Jarret:
I would sit it out and see if the PageRank and deep links resolve. The main thing is whether you take a hit in the SERPs long term.
Short term typically kicks up a few question marks, but if the old domain is still live, that could be a problem.
Let me know how things progress…
duplicate content is my problems :D, thanks for useful posting, keep writing. btw your blog theme very cool… I like the color ;)
thank you…
Thanks for sharing this. This really helps. Duplicate Content is really a problem though.
great great post
Nice info, thanks for the good post about WordPress duplicate content.
Sure thing… Thank you for visiting.
I agree, I’d try waiting too. Try Ambid Update; they seem to have an easy way to edit your website, and they include some simple DIY tools for SEO, offering more than the simple stuff Google Webmaster Tools offers.