Magento is an open source ecommerce platform used by an ever-increasing number of our clients. However, one particular weakness we tend to find with Magento sites is that the platform creates duplicate pages by generating multiple URLs that display the same content.

Now, despite popular belief, you aren’t going to be actively penalised by Google for instances of duplicate content on your otherwise functional site. However, duplicate content does affect the way that search engines crawl and index websites, and can therefore hinder a site’s performance in organic search.

SEO Implications of Duplicate Content

Put simply, websites with duplicate pages will not perform as well as they could in the search results for searches related to the content of those pages. Duplicate pages can confuse search engines as to which version should be displayed in their search results, leading to one (or sometimes both) of the pages being filtered out of Google’s index.

Here’s Google’s take on duplicate content, from their former head of web spam, Matt Cutts:

“Duplicate content issues are rarely related to a penalty. It is more about Google knowing which page they should rank and which page they should not. Google doesn’t want to show the same content to searchers for the same query; they do like to diversify the results to their searchers”.

While there are exceptions to the rule, your best chance of ranking well in Google’s results is to ensure that you don’t duplicate pages on your site. Instead, you should aim to have a single page for each topic, featuring unique content written specifically around that topic.

Common Causes of Duplicate Content in Magento

Multiple product page URLs

By default, Magento configures product page URLs depending on which categories the product has been added to. Thus, if a product has been added to multiple categories, the same product can be accessed from multiple URLs, one for each category.

For example, a single product could be accessed from the following URLs:

  • www.example.com/category1/product-1
  • www.example.com/category2/product-1
  • www.example.com/product-1 (this is referred to as the ‘top-level’ URL in Magento)

The issue here is that Google has access to three URLs that contain exactly the same content. If search engines are able to crawl all three, the pages will be treated as duplicates, leading to confusion as to which version should be displayed in the search results.

The simplest resolution to this problem within Magento is to ensure the site only uses top-level product URLs. To do this, you will need to navigate to System > Configuration > Catalog > Search Engine Optimisation.

Once in that section, you will need to select ‘No’ in the ‘Use Categories Path for Product URLs’ drop-down.

That’s it: your products will no longer create new URLs depending on which categories they’ve been added to within Magento. However, it’s important to bear in mind that some of the old URLs may already have been picked up by Google. To resolve this, you will need to implement 301 redirects from the URLs that include categories to the ‘top-level’ product URL.

You can read more about implementing 301 redirects here.
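
As a rough illustration of the redirect mapping described above, the following Python sketch builds a list of 301 redirects from category-path product URLs to their top-level equivalents. It assumes that the product's URL key is always the last path segment and that the input list contains product URLs only (not category pages); the function names are hypothetical and not part of Magento.

```python
from urllib.parse import urlsplit, urlunsplit


def top_level_url(url: str) -> str:
    """Return the top-level product URL by keeping only the last path segment."""
    parts = urlsplit(url)
    # Assumption: the final path segment is the product's URL key.
    url_key = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return urlunsplit((parts.scheme, parts.netloc, "/" + url_key, "", ""))


def build_redirect_map(product_urls):
    """Map each category-path URL to its top-level target.

    URLs that are already top-level are skipped, since they need no redirect.
    """
    redirects = {}
    for url in product_urls:
        target = top_level_url(url)
        if target != url:
            redirects[url] = target
    return redirects


# Example URLs taken from the article; a real list would come from a site crawl.
crawled = [
    "http://www.example.com/category1/product-1",
    "http://www.example.com/category2/product-1",
    "http://www.example.com/product-1",
]

redirect_map = build_redirect_map(crawled)
for source, target in redirect_map.items():
    print(f"301 {source} -> {target}")
```

The resulting map could then be used to create the redirects in whichever way your site manages them; how they are actually implemented is outside the scope of this sketch.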

Query Strings

Like many other ecommerce platforms, Magento uses a faceted navigation system to allow users to drill down to specific product groups. As a result of this, query strings are added to the end of URLs, which can cause duplicate content if not handled correctly.

For example, the following URLs would, in theory, all point to the same content. The only difference here is that the products on these pages are filtered slightly differently:

  • http://www.example.com/category1?search=bikes&color=green
  • http://www.example.com/category1?sortby=reviews
  • http://www.example.com/category1
  • http://www.example.com/category1?gclid=ABCD

Again, if Google is allowed to crawl these URLs, the original (unfiltered) category page’s performance will be hindered due to there being multiple versions of the page.

To resolve this issue in Magento, we need to specify a rel=canonical tag in the <head> section of each page.

What Are rel=canonical Tags?

Canonical tags are a method of indicating the ‘preferred’ version of a piece of content to Google. Using the query string example we outlined above, we would need to notify Google that we only want them to index the URL without added parameters (http://www.example.com/category1). This way the filtered content can still be accessed by users browsing the site, but Google will understand that you only want them to index one version of that page: the unfiltered version.
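
To make the idea concrete, here is a minimal Python sketch that derives the canonical URL for the query string examples above by dropping the parameters, and prints the link element a page would carry. It assumes that every query parameter is a filter, sort or tracking parameter that can safely be discarded; the function names are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Strip the query string and fragment, leaving the unfiltered page URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the rel=canonical link element for the given page URL."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


# The four example URLs from the article.
urls = [
    "http://www.example.com/category1?search=bikes&color=green",
    "http://www.example.com/category1?sortby=reviews",
    "http://www.example.com/category1",
    "http://www.example.com/category1?gclid=ABCD",
]

for url in urls:
    print(url, "=>", canonical_link_tag(url))
```

All four URLs resolve to the same canonical address, which is the signal we want Google to receive.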

How to Implement rel=canonical Tags in Magento

Many ecommerce sites have hundreds or thousands of different products, which would make the addition of canonical tags to each potential instance of duplicate content an extremely time-consuming task. Luckily, Magento has a built-in option for tackling canonical tags across a website.

To activate canonicals in Magento, simply navigate to System > Configuration > Catalog > Search Engine Optimisation.

In this menu, make sure both ‘Use Canonical Link Meta Tag For Categories’ and ‘Use Canonical Link Meta Tag For Products’ are set to ‘Yes’.

Once these options are enabled and you have saved the configuration, you should notice that categories with applied filters now display canonical link elements referencing the original category page, minus URL parameters.
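
One way to check the result is to read the canonical link element out of a page's HTML. The Python sketch below does this with the standard library's HTML parser; the sample markup is a made-up stand-in for the source of a filtered category page, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html_text: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Hypothetical source of http://www.example.com/category1?sortby=reviews
sample_html = """
<html>
  <head>
    <title>Category 1</title>
    <link rel="canonical" href="http://www.example.com/category1" />
  </head>
  <body></body>
</html>
"""

print(find_canonical(sample_html))
```

If the function returns None, or a URL that still contains parameters, the canonical settings are probably not taking effect on that page.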

Avoiding Manual Cases of Duplicate Content in Magento

We have discussed the most common causes of duplicate content based on default Magento settings, but there are also numerous manual cases of duplicate content on the platform that will need to be avoided if a website is to rank well in Google’s results.

Product Variations

As is the case on many ecommerce sites, there will be products that have multiple variations based on colour, size, fabric, etc. If your site is structured like this, you have two options to avoid duplicate content.

Option 1 – Create a Single Page and List All Its Variations on That Page

This way you would end up with one unique page instead of several duplicate pages. This is Google’s recommended method of displaying product variations, as quoted by Google rep John Mueller:

Variations of product “colors…for product page, but you wouldn’t create separate pages for that.” With these type of pages you are “always balancing is having really, really strong pages for these products, versus having, kind of, medium strength pages for a lot of different products.”

Figure: single product page example taken from John Lewis’ website

Option 2 – Make Each Variation Page Unique

The harder way to solve duplicate issues with product variants is to make each page unique. This involves adding different copy, metadata and imagery, and is by far the more time-consuming of the two options. Furthermore, there is little evidence that this method will outperform a single page listing all variants, so we would usually recommend option 1 to our clients.

Creating Additional Navigation Options

On numerous occasions we have found website owners have unknowingly created multiple URLs for what is essentially the same category of products within Magento. For example, a website we audited recently had three different URLs for their ‘beds’ category, which had been created due to the multiple top-level categories that ‘beds’ had been added to.

In the first instance, ‘beds’ had been added to the ‘bedroom furniture’ category in the navigation.

The same page was also linked to from the ‘bedrooms’ section of the website, and again under the ‘beds’ section of the very same ‘bedrooms’ section.

Each of these URLs displays the same content, with the same title tag, headings and meta description, which means that the website will not be ranking as well as it could be for that topic due to there being three URLs accessible to Google.

The remedy here would be to have one version on a topic, and link solely to that page from various areas of the navigation. This makes Google’s job much easier, as they know that the page about beds is the only page on that topic on the site, and thus the one they should index on that topic.
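
When auditing a site for this kind of duplication, it can help to group URLs that share the same title tag, main heading and meta description. The Python sketch below shows one simple way to do that. It assumes the page data has already been collected by a crawler of your choice; the URLs and field values are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict


def find_duplicate_groups(pages):
    """Group URLs whose title, H1 and meta description are all identical.

    `pages` maps each URL to a dict with 'title', 'h1' and 'meta_description'.
    Returns a list of URL groups containing more than one URL.
    """
    groups = defaultdict(list)
    for url, fields in pages.items():
        # Normalise case and whitespace so trivial differences are ignored.
        fingerprint = (
            fields["title"].strip().lower(),
            fields["h1"].strip().lower(),
            fields["meta_description"].strip().lower(),
        )
        groups[fingerprint].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl data modelled on the 'beds' example above.
beds = {"title": "Beds | Example Store", "h1": "Beds", "meta_description": "Shop our range of beds."}
pages = {
    "http://www.example.com/bedroom-furniture/beds": beds,
    "http://www.example.com/bedrooms/beds": beds,
    "http://www.example.com/bedrooms/beds/beds": beds,
    "http://www.example.com/sofas": {
        "title": "Sofas | Example Store",
        "h1": "Sofas",
        "meta_description": "Shop our range of sofas.",
    },
}

for group in find_duplicate_groups(pages):
    print("Possible duplicates:", ", ".join(group))
```

Matching on these three fields is only a heuristic: it will surface likely duplicates for manual review rather than prove that two pages are identical.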

Conclusion

If you’re running Magento, I would strongly advise auditing your site against the issues listed above. Of course, there are many other duplicate content issues common on ecommerce sites, and I have only scratched the surface here with Magento-specific examples.

If you’re operating an online store on another platform, I would advise checking out this guide that outlines common causes of duplicate content on ecommerce sites.