Before you can understand canonical tags, you have to understand what duplicate content is.
Search engine crawlers treat content as duplicate when two or more pages have the same or nearly the same content. When this happens, crawlers and users face four major issues:
- The wrong URL gets indexed – When a search engine crawler comes across duplicate content it might not be able to figure out the original page you want search engines to index. This can lead to the wrong version of the URL being indexed. For instance, a not so user-friendly URL like https://www.example.com/12357?user=money might get indexed over https://www.example.com/monkey-business.html
- Lack of link signal consolidation – Even if Google indexes both the original and the duplicate pages, the link value of each page is not consolidated onto a single page. This means none of the pages has the link equity and ranking power it should have.
- Lowered engagement on pages – Since search engines might index a lower-quality duplicate page, the user might land on a page which is not optimized. This can lead to an increased bounce rate and lower conversion rates.
- Loss of link authority – If you have syndicated content across multiple publications, you would want search engines to know that you are the original creator of this content. If not, you might have to compete for search engine rankings with content which you created.
A canonical tag when applied right can solve all the issues above.
What is a canonical tag?
The canonical tag is one way of telling search engines that the current URL is essentially a duplicate of an original page, and of stating the URL where that original page can be found.
In other words, if there are 2 web pages A and B and page B is a duplicate of page A then there must be a canonical tag on page B which points back to page A.
The canonical tag is placed in the <head> section of a page and doesn’t appear as a link on the page itself.
So in this case page B will have the canonical tag below in its <head> section.
<link rel="canonical" href="https://www.example.com/a">
Here is an example of a canonical tag in action on https://www.amandawakeley.com/uk/coats-and-jackets/vilma-suede-bomber-red
Once you use a canonical tag, Google understands which URL is the original source of the content and consolidates link equity and ranking power on that URL. It might even drop the duplicate pages from its index and keep a single version of the URL.
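To check which canonical URL a page actually declares, you can parse its HTML. The following is a minimal Python sketch using only the standard library, not a definitive implementation. The names CanonicalFinder and find_canonical and the sample page_b markup are made up for illustration, and the sketch assumes the tag sits inside the <head> section as described above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            # rel may hold several space-separated values, in any letter case
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values and attr.get("href"):
                self.canonicals.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html):
    """Return the first canonical URL declared in <head>, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Hypothetical markup for page B, the duplicate of page A
page_b = """<html><head>
<title>Page B</title>
<link rel="canonical" href="https://www.example.com/a">
</head><body><p>Duplicate of page A.</p></body></html>"""

print(find_canonical(page_b))
```

A page that declares more than one canonical tag sends a conflicting signal, so in a real audit it may be worth inspecting the full canonicals list rather than only its first entry.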
Common reasons for duplicate content on your site
As you go through the list below see which of the cases hold true for you. If any of these cases hold true and you haven’t applied a canonical tag then you have some work to do.
- Printer Friendly URL Versions – If your site lets users read your blog posts on a printer-friendly version of the URL, then your site has the potential for duplicate content. Such URLs host all the content of the original post, so they need a canonical tag pointing to the original post.
- Redundant Content – If you have a product with multiple URLs, each representing a different color or style, then most of the content on each of those pages will be the same. In such cases, you have to pick a master variation and point all the remaining URLs to the master URL. If you are using a product or manufacturer description across your entire site, then this content could lead to near duplicates. Ideally, you should pick a single page and point all other duplicate URLs to it using a canonical tag.
- Syndicated content – To distribute your content to a larger audience you might decide to syndicate your content across multiple publications and news sites. Great! But ask the syndicating website to add a canonical tag pointing back to the original source URL. This gives your content more authority in search engines. Google honors cross-domain canonical tags, and they are the best option if you are syndicating content.
- Capitalization, Uppercase, and Lowercase URLs – If your site generates multiple URLs which differ only by letter case, then you will have generated quite a few duplicate URLs. The best way to combat this is to pick a single letter case, usually lowercase, and point all other duplicate variations to that URL using a canonical tag.
- Pages and Products in multiple categories – This issue occurs for more than one reason. Firstly, if your CMS creates URLs with category and subcategory names in them, then you could end up with URLs like https://www.example.com/apparel/tshirts/boyhood-tshirt and https://www.example.com/boyhood-tshirt. Secondly, if your product is listed under multiple categories, then you could create multiple URLs. In this case, your site could have URLs like https://www.example.com/tshirts/boyhood-tshirt and https://www.example.com/dresses/boyhood-tshirt. In both these cases, you are creating duplicate content because of a technical glitch. For e-commerce software like Shopify or Bigcommerce, this issue is there by default. For Magento, you could use this tutorial to implement a canonical tag.
- Filter parameter variations – If your site uses filters, then you might be creating URLs based on the order in which the filters are selected. This can lead to URLs like https://www.example.com/tshirts?colour=red&style=printed and https://www.example.com/tshirts?style=printed&colour=red. Since these 2 pages will have essentially the same content, it is wise to use a canonical tag pointing from one to the other.
- URL Termination – If your site is built on a not so popular CMS, it might not be prepared to unify URL templates. One symptom is the same URL being terminated in multiple ways, with each variant showing the exact same content. Example: /announcements.html, /announcement.php, /announcement/ and /announcement. Pick one termination and point the other variants to it using a canonical tag.
- Mobile or AMP Pages – If you have separate desktop and mobile versions of the site, then you have to tell Google which of those URLs are pairs. For this, add a link rel="canonical" tag on the mobile page/URL pointing to the corresponding desktop URL.
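The letter case, filter order, and URL termination variants in the list above can often be detected mechanically. The following is a minimal Python sketch under stated assumptions, not a definitive implementation. The names normalize_url, group_duplicates, and STRIP_SUFFIXES are hypothetical, and the sketch assumes that your site treats paths as case-insensitive, that the order of query parameters does not change the content, and that the extensionless URL without a trailing slash is the one you want to keep.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical file-style endings to strip; adjust to your own site.
STRIP_SUFFIXES = (".html", ".php")


def normalize_url(url):
    """Map a URL variant to one preferred form (a candidate canonical URL)."""
    parts = urlsplit(url)
    path = parts.path.lower()  # assumption: paths are case-insensitive
    for suffix in STRIP_SUFFIXES:
        if path.endswith(suffix):
            path = path[: -len(suffix)]
            break
    path = path.rstrip("/") or "/"  # drop trailing slash, keep the root path
    # Sort query parameters so that filter order no longer matters
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, query, ""))


def group_duplicates(urls):
    """Group URLs that normalize to the same form; keep groups with 2+ variants."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize_url(url)].append(url)
    return {target: variants for target, variants in groups.items() if len(variants) > 1}


urls = [
    "https://www.example.com/tshirts?colour=red&style=printed",
    "https://www.example.com/tshirts?style=printed&colour=red",
    "https://www.example.com/Announcement.php",
    "https://www.example.com/announcement/",
    "https://www.example.com/monkey-business.html",
]

for target, variants in group_duplicates(urls).items():
    print(target, "<-", variants)
```

Each group is a set of URLs that probably need a canonical tag pointing to one shared URL. Whether the normalized form is really the right target is a decision for you, since the sketch only guesses at it.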
How to implement canonical tag on popular CMS platforms
Things to remember about canonical tags
Although canonical tags can be a savior for your site’s SEO, a canonical tag which is implemented incorrectly can backfire in a huge way. The problem gets more complex as the site grows and the number of URL templates multiplies. Here we look at situations where most webmasters go wrong with canonical tag implementation.
- HTTP header for documents – Canonical URLs can also be declared for non-HTML documents such as PDFs. If your site sells furniture and you have PDF brochures for each type, the content might be redundant. In such cases, you can use an HTTP response header which declares the canonical URL of the original content. To implement this, send Link: <http://www.example.com/downloads/chair-brouchure.pdf>; rel="canonical" as an HTTP response header.
- Pick Secure over Non-Secure Protocol – Google suggests that you treat the HTTPS version of the site as the original. Ideally, you want to 301 redirect your HTTP URLs to HTTPS. If you choose to keep both versions of the URLs, then point a canonical tag from each HTTP URL to the HTTPS version of the same URL.
- Canonical Tag should not point to noindex page – When you do this, you are directing the search engine crawler to a URL which is not supposed to be indexed. This sends a mixed signal to Google and keeps it from handling duplicate content correctly.
- Canonical should not point to non-200 page – If your page’s canonical tag points to a dead page (4XX or 5XX) or to a web page which redirects, then you are throwing away your page’s link equity. In a few cases, Google might stop honoring your canonical tag, leading to inefficient handling of duplicate content on your site. To find all such pages in the DeepCrawl SEO tool, go to Config > Canonical > Canonical to Non 200.
- Do not use it for pagination URLs – Another common mistake among webmasters is to point all pagination URLs to the first page. This is not correct, as you are pointing to a URL which is NOT a duplicate page. Google historically recommended rel="next" and rel="prev" tags for paginated series, but it announced in 2019 that it no longer uses them as an indexing signal, so give each paginated page a canonical tag pointing to itself instead.
- Link to Original URLs – Although you tell Google the original canonical URL, it is up to Google to honor the tag. For best results, you have to make sure your internal links and the links in the sitemap point to the original URL and not the duplicate URL. In doing so you are sending a stronger signal to Google to pick the original URL and ignore or consolidate the duplicate ones.
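Several of the checks above can be automated once you have crawl data for your site. The following is a minimal Python sketch, not a definitive implementation. The function names canonical_link_header and audit_canonical, the shape of the crawl dictionary, and the sample URLs in it are all assumptions made for illustration. In practice the data would come from your own crawler or SEO tool.

```python
def canonical_link_header(url):
    """Build the value of an HTTP Link response header declaring a canonical URL."""
    return f'<{url}>; rel="canonical"'


def audit_canonical(page_url, crawl):
    """Return a list of problems with the canonical declared on page_url.

    crawl is a hypothetical mapping of
    URL -> {"status": int, "noindex": bool, "canonical": str or None}.
    """
    target = crawl[page_url].get("canonical")
    if target is None:
        return ["no canonical declared"]
    info = crawl.get(target)
    if info is None:
        return [f"canonical target was not crawled: {target}"]

    problems = []
    if info["status"] != 200:
        # Covers dead pages (4XX, 5XX) as well as redirects (3XX)
        problems.append(f"canonical target returns {info['status']}, not 200")
    if info.get("noindex"):
        problems.append("canonical target is marked noindex")
    if page_url.startswith("https://") and target.startswith("http://"):
        problems.append("canonical points from HTTPS to HTTP")
    return problems


# Made-up crawl data for illustration only
crawl = {
    "https://www.example.com/a": {
        "status": 200, "noindex": False, "canonical": "https://www.example.com/a"},
    "https://www.example.com/b": {
        "status": 200, "noindex": False, "canonical": "http://www.example.com/old-a"},
    "http://www.example.com/old-a": {
        "status": 301, "noindex": False, "canonical": None},
}

print(canonical_link_header("http://www.example.com/downloads/chair-brouchure.pdf"))
for url in ("https://www.example.com/a", "https://www.example.com/b"):
    print(url, audit_canonical(url, crawl))
```

The audit only inspects the canonical target itself. Checking that internal links and sitemap entries also point to the original URL would need the link graph and the sitemap as extra inputs.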