A canonical tag is a hidden message on a web-page which simply says: “I’m just a copy of this other web-page found here…”
How Search Engines Deal With Duplicate or Similar Pages
When most search engines come across two or more pages with very similar content, they often choose just one of the pages to index, effectively ignoring the others. Which duplicate URL they choose could be based on a number of factors, including which one was crawled first, which one has the most internal links or which one has the most external links.
Example of Product Page Duplication
Let’s look at an example of e-commerce product page duplication below:
All these product pages for the same ‘Red Toy Truck’ have exactly the same content with only insignificant variations such as the top breadcrumb links. The way this website has been structured means that there are three pages for one product, so canonical tags are required to eliminate two of the copies:
Using canonical tags, we have signified to search engines that the original product page is located at http://www.example.com/toys/trucks/red and that the other two URLs are mere copies. In this case it’s wise to choose the permanent product page which isn’t in the “sale” category (the sale is likely to end one day) and which isn’t in the “items under £10” category (the price may one day rise above £10).
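As a rough sketch, each of the three pages could carry the same tag in its <head>. Only the preferred URL comes from the example above; the two duplicate URLs in the comment are hypothetical and invented purely for illustration.

```html
<!-- Placed in the <head> of all three 'Red Toy Truck' pages, for example:
       http://www.example.com/toys/trucks/red      (preferred URL, self-referencing)
       http://www.example.com/sale/red-truck       (hypothetical duplicate URL)
       http://www.example.com/under-10/red-truck   (hypothetical duplicate URL) -->
<link rel="canonical" href="http://www.example.com/toys/trucks/red" />
```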
How URL Parameters Can Result in Page Duplication
Page URLs can have extra information added to the end of them in the form of parameters. These are always shown after a question mark in the URL:
- Non-parameter URL – http://www.example.com/blog
- Parameter URL resulting in completely different page content – http://www.example.com/blog?page=2
Sometimes URL parameters show completely different page content, other times they filter out certain bits of content, and other times they have absolutely no effect on a page’s content whatsoever:
- Completely different content than if no parameter was used:
- Slightly different content than if no parameter was used:
- Insignificant change to the content with or without the parameter:
- No change at all to the content with or without the parameter:
Search engines can treat URLs with different parameters as different, unique pages. It’s then important to promote the best URLs by using canonical tags if the content isn’t significantly changed when parameters are added.
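A minimal sketch of this, assuming a hypothetical tracking parameter "ref=newsletter" that does not change what the blog page shows:

```html
<!-- Served at http://www.example.com/blog?ref=newsletter
     (the "ref" parameter is an assumed example, not taken from a real site) -->
<head>
  <title>Blog</title>
  <!-- Tells search engines that the parameter-free URL is the one to index -->
  <link rel="canonical" href="http://www.example.com/blog" />
</head>
```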
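If you don’t want search engines to crawl certain patterns of URLs, you can block them in the robots.txt file. This helps keep tracking-parameter URLs, for example, out of the index, where they may otherwise skew overall statistics. Bear in mind that robots.txt controls crawling rather than indexing, and search engines cannot read a canonical tag on a page they are blocked from crawling, so use one approach or the other for a given URL pattern.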
Example of Many URLs for the Same Webpage Content
There are many ways for a website to have multiple URLs for exactly the same web page. Luckily canonical tags can be used to self-reference a page to eliminate these issues. For example, a webpage could safely say that it’s a copy of itself and eliminate any confusion with parameters, sub-domains, etc. (see below).
Here are several examples of URL variations for the exact same page on a website. It’s well worth checking all of these on a website and, if needed, using canonical tags to reference the preferred URL:
- http://www.example.com/example-page (preferred URL)
These issues will be resolved once every non-preferred URL serves a canonical tag pointing to the preferred URL (http://www.example.com/example-page in this case).
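One way to picture this is that the same tag appears in the <head> no matter which variation was requested. The two variations named in the comment are hypothetical and only there for illustration.

```html
<!-- The same tag is output whichever URL variation was requested, e.g.
       http://www.example.com/example-page            (preferred URL)
       http://www.example.com/example-page?ref=home   (hypothetical variation)
       http://www.example.com/example-page/           (hypothetical variation) -->
<link rel="canonical" href="http://www.example.com/example-page" />
```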
The “www.” versus non-“www.” issue can be resolved with a simple 301 redirect, so that every time a URL is entered without the “www.” it is automatically added.
URL Parameter Filters
Things can get complicated when URL parameters filter out results on a web-page. You typically see this in action on e-commerce websites, local listings or other highly filterable results such as on property or holiday websites for example.
In these cases you should ask yourself whether directly visiting the parameter URLs enhances or limits the results shown, and whether search engines will think they are unique enough to index separately. In most cases you will not want search engines to index a parameter URL which filters, as it could be shown in place of the non-parameter URL which shows all filter options and all products/items.
E-commerce website filters can reduce items down to a certain colour, sub-category, price range or review score. There would be little advantage in showing a limited selection of your products within a certain category rather than all the products at once, unless you have a significant number of products and there are active searches for such niche keywords.
The major search engines such as Google and Bing do a great job of understanding which parameters filter out content and which parameters completely change content. If you are in any doubt, it’s best not to create canonical tags for filter URLs and to let the search engines try to determine your URL structure automatically.
HTML Code for the Rel Canonical Tag
Below is the HTML code for implementing the canonical tag. It should be placed within the <head> section of the page, where it has no effect on the visible content:
With an absolute URL:
<link rel="canonical" href="http://www.example.com/page-url" />
…or with a relative URL:
<link rel="canonical" href="/page-url" />
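To show where the tag sits in context, here is a minimal sketch of a complete page. The title and body text are placeholders, not taken from a real site.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>Page title</title>
  <!-- The canonical tag sits inside <head>, so nothing visible changes -->
  <link rel="canonical" href="http://www.example.com/page-url" />
</head>
<body>
  <p>Visible page content goes here.</p>
</body>
</html>
```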
Common Mistakes With Canonical Tags
The misuse of canonical tags can have dire consequences. Imagine if every page on a website claimed to be a copy of the homepage: search engines could de-index every other page on the website and show only the homepage in search results!
Here are some common mistakes we have come across with canonical tags:
- Having a non-dynamic canonical tag on every page of the website pointing to one URL (an SEO killer!)
- Having two different canonical tags within the HTML code (search engines may honour only one of them or ignore both)
- Writing the URL without the “http://” part when using an absolute URL (it is then treated as a relative path, which forms incorrect URLs)
- Pointing product pages towards the category pages they reside in (product pages need to be indexed separately)
- Using canonical tags on paginated URLs (see below for more information)
Paginated URLs are a sequence of URLs which present information in order. Examples include stories, lists of products, lists of blog/news posts, lists of information, etc.
Let’s say you wrote a great story online which spanned four chapters on four different web pages. You might want search engines to only index and send people towards the first page of the story. To do this you would use a canonical tag to point every chapter page towards the first chapter:
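For illustration only, this is roughly what that would look like on the second chapter, assuming the story uses “?page=” URLs and that the first chapter lives at “/story?page=1” (the first-chapter URL is an assumption):

```html
<!-- In the <head> of http://www.example.com/story?page=2
     (pages 3 and 4 would carry the same tag) -->
<link rel="canonical" href="http://www.example.com/story?page=1" />
```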
This would effectively lose all the content shown in chapter 2 onwards, a huge amount of unique content which some search engine users may wish to jump straight to or find quickly. Instead of using canonical tags you can use other “rel” values called “next” and “prev”, which show the relationship between all the pages:
Chapter three, for example, would need to reference chapter two as the previous page and chapter four as the next page using the tags shown below:
<link rel="next" href="http://www.example.com/story?page=4" />
<link rel="prev" href="http://www.example.com/story?page=2" />
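Search engines that support the “next” and “prev” tags can use them to understand the sequence and are more likely to show the first page within search results, removing the need for canonical tags that would hide rich content. Support varies, however: Google announced in 2019 that it no longer uses these tags for indexing, so treat them as a hint rather than a guarantee.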
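Extending the chapter three example to the whole story, a sketch of the tags on each of the four pages might look like this. The first page has no “prev” and the last page has no “next”; the “?page=1” URL for the first chapter is an assumption.

```html
<!-- Chapter 1: http://www.example.com/story?page=1 (assumed URL) -->
<link rel="next" href="http://www.example.com/story?page=2" />

<!-- Chapter 2: http://www.example.com/story?page=2 -->
<link rel="prev" href="http://www.example.com/story?page=1" />
<link rel="next" href="http://www.example.com/story?page=3" />

<!-- Chapter 3: http://www.example.com/story?page=3 -->
<link rel="prev" href="http://www.example.com/story?page=2" />
<link rel="next" href="http://www.example.com/story?page=4" />

<!-- Chapter 4: http://www.example.com/story?page=4 -->
<link rel="prev" href="http://www.example.com/story?page=3" />
```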
Canonical tags can help improve SEO and conversion rates, but you need to be very careful when implementing them.
Most common CMS and e-commerce platforms now handle canonical tags automatically or have well-built plugins to do so. View a web page’s HTML source code and search for “canonical” to see if the tags are present (hint: use CTRL + F).
If you are unsure how a website is represented in Google, try the “site:” search operator (for example, site:example.com) and look at which pages have been indexed; you may spot an opportunity. Good luck!