What are canonical tags?
A canonical tag is a signal in a web page which simply says: “I’m a copy of this other web page which can be found over there…”
This HTML element is added into the code of a page and helps prevent duplicate content issues by telling Google and other search engines the preferred version of a page.
How search engines deal with duplicate or similar pages
When most search engines come across two or more pages with very similar content they often choose just one of the pages to index, effectively ignoring the others. Which duplicate page URL they decide to choose could be based on a number of factors including whichever one was first crawled, which one has the most internal links or which one has the most external links.
Duplicate content is not only bad for SEO, but it can also harm conversion rates by not showing content in its best possible state.
There will often be situations where you have a number of pages which are vital to the infrastructure of your website but contain either similar or identical content. Rather than risking a penalty for duplicate content, you can add a rel=canonical tag which points to the page you feel is the preferred source of information, as shown in the example below.
Example of product page duplication
Let’s look at an example of e-commerce product page duplication below:
All these product pages for the same ‘Red Toy Truck’ have exactly the same content with only insignificant variations such as the top breadcrumb links. The way this website has been structured means that there are three pages for one product, so canonical tags are required to eliminate two of the copies:
Using canonical tags, we have signaled to search engines that the original product page is located at http://www.example.com/toys/trucks/red and that the other two URLs are merely copies. In this case it’s wise to choose the permanent product page which isn’t in the “sale” category (the sale is likely to end one day) and which isn’t in the “items under £10” category (the price may rise one day to over £10).
Other examples of page duplication which require canonical tags
Page URLs can have added information at the end of them in the form of parameters. These are always shown after a question mark in the URL:
- Non-parameter URL – http://www.example.com/blog
- Parameter URL resulting in completely different page content – http://www.example.com/blog?page=2
Sometimes URL parameters show completely different page content, other times they may filter out certain bits of content and other times they have absolutely no effect on a page’s content:
- Completely different content than if no parameter was used:
- Slightly different content than if no parameter was used:
- Insignificant change on the content with or without the parameter:
- No change at all on the content with or without the parameter:
Search engines can treat URLs with different parameters as different, unique pages. It’s then important to promote the best URLs by using canonical tags if the content isn’t significantly changed when parameters are added.
If you don’t want search engines to crawl and index certain patterns of URLs, use the robots.txt file to block them. This can, for example, keep URLs with tracking parameters out of search results, where they may otherwise mess up overall statistics.
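As a rough illustration of promoting the best URL, here is a minimal Python sketch that works out which URL a canonical tag could point to by dropping parameters that don’t change the content. The `CONTENT_PARAMS` allow-list, the `canonical_url` helper and the `utm_source` tracking parameter are assumptions made for this example; which parameters matter will depend on your own website.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption for this sketch: only "page" changes the content significantly.
CONTENT_PARAMS = {"page"}


def canonical_url(url):
    """Return the URL with every parameter outside CONTENT_PARAMS removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in CONTENT_PARAMS
    ]
    # The fragment is dropped as well, since it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_url("http://www.example.com/blog?utm_source=newsletter"))
# http://www.example.com/blog
print(canonical_url("http://www.example.com/blog?page=2&utm_source=newsletter"))
# http://www.example.com/blog?page=2
```

An allow-list is used here rather than a block-list so that any new, unknown parameter is treated as insignificant by default; the opposite choice may suit websites where most parameters change the content.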
Many URLs for the same webpage content
There are many ways for a website to have multiple URLs for exactly the same web page. Luckily canonical tags can be used to self-reference a page to eliminate these issues. For example, a webpage could safely say that it’s a copy of itself and eliminate any confusion with parameters, sub-domains, etc. (see below).
Here are several examples of URL variations for the exact same page on a website. It’s well worth checking for all of these on a website and, if needed, using canonical tags to reference the preferred URL:
- http://www.example.com/example-page (preferred URL)
These issues will be resolved once every one of the wrong URLs has a canonical tag linking to the preferred URL (http://www.example.com/example-page in this case).
The “www.” or non-“www.” issue can be resolved with a simple 301 redirect, so that every time a URL is entered without the “www.” the user will be automatically redirected to the correct “www.” version.
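The redirect rule itself is usually written in your web server or CMS configuration, but the logic behind it can be sketched in a few lines of Python. The `PREFERRED_HOST` constant and the `redirect_for` helper are hypothetical names used only for this illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the "www." host is the preferred version.
PREFERRED_HOST = "www.example.com"


def redirect_for(url):
    """Return (301, target) for a URL on the bare host, otherwise None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if "www." + host != PREFERRED_HOST:
        # Already on the preferred host, or a host this rule does not cover.
        return None
    target = urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )
    return 301, target


print(redirect_for("http://example.com/example-page"))
# (301, 'http://www.example.com/example-page')
print(redirect_for("http://www.example.com/example-page"))
# None
```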
URL parameter filters
Things can get complicated when URL parameters filter out results on a web page. You typically see this in action on e-commerce websites, local listings or other highly filterable results such as on property or holiday websites.
In these cases, you should ask yourself whether directly visiting the parameter URLs enhances or limits the results shown, and whether search engines will think they are unique enough to index separately. In most cases you will not want search engines to index a parameter URL which filters, as this could be shown in place of a non-parameter URL which shows all filter options and all products/items:
E-commerce website filters can reduce items down to a certain colour, sub-category, price range or review score. There would be little advantage in allowing Google to index a limited selection of your products within a certain category rather than all the products at once unless you have a significant number of products and there are active searches for such niche keywords.
The major search engines such as Google and Bing do a great job of understanding what parameters filter out content and what parameters completely change the content. If you are very unsure then it’s best not to create canonical tags for filter URLs and let the search engines automatically try and determine your URL structure.
How to set up a canonical tag
- Decide which page you want to be your preferred URL. This should be the version you think is the most important. If you don’t care, pick the one with the most links or visitors, and if all else is equal, just pick one!
- There are many plugins available to apply canonical tags if you are using a CMS such as WordPress or Magento. However, if you are going straight into the code, you will need to add the following <link> to the <head> section of the additional pages, not your preferred page:
<link rel="canonical" href="https://www.example.com/hats" />
This indicates that this is the preferred URL for users who want to access your hats page, and tells search engines that you would like them to show this page over your other similar hats pages. As with anything search engine related, there are no guarantees; Google has specifically stated that “We attempt to respect this (canonical tags), but cannot guarantee this in all cases”.
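If you are generating pages from your own templates rather than using a plugin, the tag can be built with a small helper. This is only a sketch: `canonical_link` is a hypothetical function name, and the check for an absolute URL is a simple prefix test rather than full URL validation.

```python
from html import escape


def canonical_link(preferred_url):
    """Build the <link> element that points at the preferred URL."""
    if not preferred_url.startswith(("http://", "https://")):
        raise ValueError("canonical URLs should be absolute")
    # Escape the URL so characters such as & and " are safe inside the attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(preferred_url, quote=True))


print(canonical_link("https://www.example.com/hats"))
# <link rel="canonical" href="https://www.example.com/hats" />
```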
As Joost de Valk at Yoast states in his handy guide:
What this does is “merge” the two pages into one from a search engine’s perspective. It’s a “soft redirect”, without redirecting the user. Links to both URLs now count as the single, canonical version of the URL.
Common mistakes with canonical tags
The misuse of canonical tags can result in dire consequences. Imagine if every page on a website claimed to be a copy of the homepage; search engines would de-index every page on the website and only show the homepage within search engine results!
Here are some common mistakes we have come across with canonical tags:
- Having a non-dynamic canonical tag on every page of the website pointing to one URL (an SEO killer!)
- Having two different canonical tags within the HTML code (search engines may count only one of them, or ignore them all)
- Using the URL without the “http://” part (you should use absolute URLs)
- Pointing product pages towards the category pages they reside in (product pages need to be indexed separately)
- Using canonical tags on paginated URLs (see below for more information)
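Two of the mistakes above, conflicting canonical tags and non-absolute URLs, can be spotted automatically. The following Python sketch uses the standard library’s HTML parser; the `CanonicalFinder` class and `audit_canonicals` function are names invented for this example, and the checks are deliberately simple rather than a complete audit.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.hrefs.append(attributes.get("href") or "")


def audit_canonicals(html):
    """Return a list of problems found with the page's canonical tags."""
    finder = CanonicalFinder()
    finder.feed(html)
    problems = []
    if len(set(finder.hrefs)) > 1:
        problems.append("multiple different canonical tags")
    for href in finder.hrefs:
        if not href.startswith(("http://", "https://")):
            problems.append("non-absolute canonical URL: " + href)
    return problems


page = (
    '<head><link rel="canonical" href="https://www.example.com/a" />'
    '<link rel="canonical" href="www.example.com/b"></head>'
)
print(audit_canonicals(page))
# ['multiple different canonical tags', 'non-absolute canonical URL: www.example.com/b']
```

The other mistakes in the list, such as every page pointing to one URL, need the whole website to be compared rather than a single page, so they are left out of this sketch.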
Paginated URLs are a sequence of URLs which display information in order. Examples include stories, lists of products, lists of blog/news posts, lists of information, etc.
Let’s say you wrote a great story online which spanned four chapters on four different web pages. You might want search engines to only index and send people towards the first page of the story. To do this you would use a canonical tag to point every chapter page towards the first chapter:
This would effectively lose all the content shown in chapter 2 onwards, a huge amount of unique content which some search engine users may wish to jump straight towards or find quickly.
Previously, you could use the “rel” values called “next” and “prev”, which show the relationship between all the pages instead:
But this is no longer supported by Google:
This now means that paginated pages are treated just like any other page on your website in Google’s index. Rather than a series of pages consolidated into one piece of content, they are now treated as individual unique pages. Read this guide to find out what you need to do instead!
Should a page have a self-referencing canonical URL?
This question is a debated topic in SEO. For example at Yoast, they strongly recommend having a canonical link element on every page.
Also, John Mueller at Google has suggested previously that this is best practice.
A lot of CMSs will allow URL parameters without changing the content. So, for example, all of these URLs would show the same content:
Therefore, by implementing a self-referencing canonical, you can avoid any potential SEO/duplicate content risks.
Can you use cross-domain canonical URLs?
The simple answer is, yes, it is OK to use canonical URLs that point to another domain. For example, it may be that a piece of your content is published on another website as the webmaster feels that it would be relevant to their users. However, you must ensure that they implement a rel=canonical link back to the original article on your domain.
When to use 301 redirects instead of rel=canonical
301 redirects should be used whenever a page/domain permanently moves to a new destination. Say you head into Google Search Console and find a number of 404 pages – 301s will fix this. Simply find a new location, put the redirect together and upload it to the server.
Best practice is to find a perfect match for the new URL when implementing 301 redirects. This way the user will be served the same or better content, and the page will be relevant for search engines to serve.
Unlike a redirect, a canonical tag does not tell the server to send a user to another page – this is a signal to search engines to show them the preferred page you would like your user to see. Often there are situations where multiple pages are required despite the content being very similar. The most basic example would be on an ecommerce site:
Page One – example.com/hats/alphabetical
Page Two – example.com/hats/price
Both pages serve the same content, are very useful to the user and have to be on the site to ensure your products can be listed by price and in alphabetical order. If the site owner decides it is better to serve users the price page in search engines, a rel=canonical tag will be added to the alphabetical page to say “hey search engine, I want you to know that these pages are extremely similar but the one I would most like to serve users is the price page please”.
If you are unsure whether to do a 301 redirect or set a canonical, what should you do? The answer is simple: you should always do a redirect unless there are technical reasons not to. If you can’t redirect because that would harm the user experience or be otherwise problematic, then set a canonical URL.
Canonical tags can help improve SEO by ensuring that Google knows which of your pages it should consider most important and which should be indexed, but you need to be very careful when implementing them.
Most common CMS and ecommerce platforms now handle canonical tags automatically or have well-built plugins to do so. To check, view the web page’s HTML source code and see if canonical tags are present (hint: use CTRL + F).