Canonical links. Canonical URLs in WordPress: When and How to Use Them

The rel="canonical" attribute is one way to deal with duplicate content. It is placed on any HTML page between the <head> ... </head> tags. Search robots then treat the page specified in the rel="canonical" attribute as the priority (canonical) one. The canonical page is shown in search results, and the link weight and other characteristics of pages with the same content are passed to it.

Thus, if your site has identical or very similar content available at different URLs, you can use the rel="canonical" attribute to specify the URL that is preferred for indexing.

When to use canonical links

1. To prevent the appearance of various duplicates. For instance:

  • sort pages: /*sort, asc, desc, list=*;
  • duplicates due to UTM tags: /*utm_source=, /*utm_campaign=, /*utm_content=, /*utm_term=, /*utm_medium=;
  • other pages with GET parameters in the URL;
  • duplicates as a result of the peculiarities of the CMS (engine).

In this case, you need to add the rel="canonical" attribute to all static pages of the site. For example, for the page https://site.ru/category-1/page-2, rel="canonical" will look like this:

<link rel="canonical" href="https://site.ru/category-1/page-2" />
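As a rough illustration of the first case, the sketch below strips duplicate-producing GET parameters from a URL and builds the matching link element. The parameter list and the helper names `canonical_url` and `canonical_link_tag` are assumptions made for this example, not part of any CMS.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the parameters that only create duplicates on this site.
DUPLICATE_PARAMS = {
    "utm_source", "utm_campaign", "utm_content", "utm_term", "utm_medium",
    "sort", "list",
}


def canonical_url(url: str) -> str:
    """Return the URL without duplicate-producing GET parameters or fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in DUPLICATE_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the link element to place inside <head> for the given page URL."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}" />'


if __name__ == "__main__":
    print(canonical_link_tag("https://site.ru/category-1/page-2?utm_source=news&sort=asc"))
```

Parameters that change the content of the page (a product filter, for example) should stay out of the list, otherwise different pages would be merged into one canonical address.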

2. For pages with very similar content available at different URLs.

For example, these can be pages of one product series that differ only in color, or product pages that are located in several categories at once.

In this case, point rel="canonical" from all such pages to the main, priority page.

If a category is split into pagination pages and also has a "Show all" page, specify the "Show all" page as canonical on each of the pagination pages.

For example, for the page https://site.ru/category-1/page-2, you need to write the canonical URL:

<link rel="canonical" href="https://site.ru/category-1/show-all" />

How do I specify the main URL using the rel="canonical" attribute?

Between the <head> tags of any HTML page

This is the main way. To specify a canonical link, place a link element with the full URL of the page to be indexed between the <head> ... </head> tags of the page.

For example, for the page https://site.ru/*utm_content= the canonical will be https://site.ru/.

To get this result, on the page https://site.ru/*utm_content= we specified the tag:

<link rel="canonical" href="https://site.ru/" />

Important!
To reduce the chance of error, use absolute links, not relative ones, in link elements with the rel="canonical" attribute.
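A small sketch of how the absolute-link rule could be checked and enforced. The helper names `is_absolute` and `absolutize` are hypothetical; `urljoin` and `urlsplit` come from the Python standard library.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(href: str) -> bool:
    """True if href has both an http(s) scheme and a host."""
    parts = urlsplit(href)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def absolutize(page_url: str, href: str) -> str:
    """Resolve a possibly relative canonical href against the page's own URL."""
    return urljoin(page_url, href)


if __name__ == "__main__":
    page = "https://site.ru/category-1/page-2"
    href = "/category-1/show-all"
    if not is_absolute(href):
        href = absolutize(page, href)
    print(href)
```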

Sitemap

In an XML sitemap, you can write the canonical (main) URL for any page.

Important!
The rel="canonical" attribute is a recommendation to search engines, not a rule, so search engines can ignore it.

In the HTTP header

Best used for non-HTML documents. For example, for PDF files.

In this case, when a duplicate file is requested, the server must return a link to the original file in the response header:

Link: <URL of the original file>; rel="canonical"

Important!
This method is suitable if you have access to the server settings. Not recommended for HTML documents.
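To make the header format concrete, here is a minimal WSGI sketch that serves a duplicate PDF with a canonical Link header. The URL in `ORIGINAL_PDF_URL`, the function names and the placeholder body are assumptions for illustration only; a real server would normally set this header in its configuration.

```python
from typing import Tuple

# Hypothetical address of the original (canonical) file.
ORIGINAL_PDF_URL = "https://site.ru/files/original.pdf"


def canonical_link_header(original_url: str) -> Tuple[str, str]:
    """Build the HTTP header pair that marks original_url as canonical."""
    return ("Link", f'<{original_url}>; rel="canonical"')


def pdf_app(environ, start_response):
    """Tiny WSGI app: answers every request with a PDF plus the Link header."""
    headers = [
        ("Content-Type", "application/pdf"),
        canonical_link_header(ORIGINAL_PDF_URL),
    ]
    start_response("200 OK", headers)
    return [b"%PDF-1.4\n"]  # placeholder body, not a real document


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    with make_server("127.0.0.1", 8000, pdf_app) as server:
        server.handle_request()  # serve a single request, then exit
```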

Using a plugin

There are various plugins for the CMS that allow you to customize the canonical URL. For instance:
- canonical can be customized for WordPress using Yoast SEO;
- in OpenCart - implemented in the CMS settings (you need to go to the product settings and set the SEO URL parameter);
- to configure the canonical attribute in Joomla (version 3.x and higher), you need to enable the SEF function in the CMS settings. Once enabled, the rel="canonical" attribute will be added to technical pages of the /index.php?option type, pointing to the page with the configured human-readable (SEF) URL.

How to check if rel="canonical" is configured correctly?

You can carry out the analysis with a dedicated program for SEO site analysis.

With such a program, you will see:
- which pages on the site lack the rel="canonical" attribute;
- which pages have the rel="canonical" attribute, and which pages are canonical for them.
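For a quick check without a dedicated program, a page's HTML can be inspected with the standard-library parser. This is only a sketch: the class `CanonicalFinder` and the functions `find_canonicals` and `audit_canonical` are names invented for this example, and the checks cover only a few of the common problems.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects every <link rel="canonical"> and whether it sits inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []  # list of (href, found_inside_head)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attributes = dict(attrs)
            rels = (attributes.get("rel") or "").lower().split()
            if "canonical" in rels:
                self.canonicals.append((attributes.get("href"), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonicals(html):
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.canonicals


def audit_canonical(html):
    """Return a list of human-readable problems; an empty list means none found."""
    found = find_canonicals(html)
    problems = []
    if not found:
        problems.append("no rel=canonical link")
    if len(found) > 1:
        problems.append("more than one rel=canonical link")
    for href, in_head in found:
        if not in_head:
            problems.append("rel=canonical outside <head>")
        if not href or not href.startswith(("http://", "https://")):
            problems.append("canonical href is not an absolute URL")
    return problems
```

Calling `audit_canonical(page_html)` on each downloaded page gives a simple per-page report; fetching the pages themselves is left out of the sketch.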

The main mistakes when using rel="canonical"

- Canonical URL gives 404 error.
- The specified canonical URL is on a different domain or subdomain.
- The canonical link is not indexable.
- Using rel="canonical" from pagination pages to the first page.

It is wrong to specify the first page as canonical for all pagination pages. This prevents the other pagination pages from being indexed.

Each pagination page must specify itself as the canonical page.

For example, the page https://site.ru/category-1/page-2 should contain the canonical link:

<link rel="canonical" href="https://site.ru/category-1/page-2" />

- Multiple rel="canonical" links from one page.

Each page must specify only one canonical page; otherwise only the first URL will be taken into account.

- Different canonical URLs.

Specify the same canonical page in every way you implement the attribute (for example, in the XML sitemap and in rel="canonical" on the page itself).

Conclusion

The rel="canonical" attribute is a convenient and useful tool for search engine promotion. Used correctly, it makes indexing of the site faster and more efficient, which in turn can affect its ranking.


Canonical URL - a helper in the fight against duplicate content

Many modern CMSs (content management systems) can create duplicate pages. As a result, a site page can exist on the web under two or more different addresses. Search engines take a negative view of duplicate content and rank it lower in the SERP. Therefore, one of the webmaster's primary tasks is to get rid of duplicate pages by any means available.


Example of duplicating a web document

The home page of the Internet resource can be accessed at several addresses:

  • primer.ru
  • primer.ru/index.php

The search robot recognizes each of these addresses as a different web document with identical content.

What is a canonical URL

The rel="canonical" attribute lets you tell the search engine which version of the document is the canonical, that is, the main one. You need to add this attribute not only to the main promoted page of the site, but also to its duplicates. If the robot finds copies of the canonical page on the site, it will treat them as secondary. Canonical is the easiest way to deal with duplicate content.

How Canonical Link Works

Suppose we have a main page http://yoursite.ru/statya1, which can also be found at several other addresses.

To indicate the canonical page to the search engine, you need to add the following line to the code of each of these documents:

<link rel="canonical" href="http://yoursite.ru/statya1" />

This piece of code should be placed between the <head> ... </head> tags. This will increase the chances that the search results will show the main document and not the duplicates. Note that the rel="canonical" attribute is taken into account by most modern search engines.

Why CMS create duplicates

Don't assume that your content management system is intentionally generating duplicate pages. Usually such copies are created due to incorrect CMS settings. The most common causes of duplicates include:

  1. creating archives from old articles;
  2. availability of open links to documents in the PDF version (for printing);
  3. incorrect site structure, adding the same pages to different categories;
  4. the presence of dynamic URLs (typical for online stores).

To identify duplicate pages that need the rel="canonical" attribute, you can use Google's Webmaster Tools. Go to the "Search Appearance" tab and click the "HTML Improvements" link. The section that opens lists pages with duplicate meta descriptions. Such documents often have duplicate content as well.

Google advises against using robots.txt directives for canonicalization. This can cause problems with site indexing. You also cannot specify different canonical URLs for one page (for example, one URL in the sitemap and another directly in the head section of the page).

To reduce the likelihood of errors when indexing a site, specify absolute, not relative, paths in the href of the link rel="canonical" element. In other words, instead of /blog/page-1, use the full URL http://yoursite/blog/page-1.

Canonical URLs are a mystery to many people, and as a result many misuse them, for example in place of 301 redirects. People assume this feature is SEO related, but they don't know when or how to use it. In WordPress in particular (compared to a plain HTML site) it can be quite difficult to set canonical URLs manually for each page without resorting to plugins, because of the way theme templates work in the content management system.

In this article, we will help answer some common user questions about canonical URLs. Readers who do not use WordPress may also find this article helpful, as it covers basic canonical URL principles that apply to any content management system or development method.

Please note that this article may seem daunting to you if you don't have technical skills related to WordPress, basic HTML, or SEO. We will introduce you to the basic terms first. If suddenly something in the article seems incomprehensible to you, you can always search the search engine for answers to your questions.

What is a canonical URL?

A canonical URL (often referred to as rel=canonical, the canonical tag, etc.) is the URL that search engines use when referring to content on your site when that content has multiple versions on your site or even across the web. Canonical URLs are used today to solve some tricky duplicate content issues, and sometimes they are used in place of 301 redirects.

Google offers an excellent explanation of the purpose of canonical URLs. I highly recommend looking into it. They made it as clear as possible.

You might think that your site does not have duplicate content. It's great if you've made sure that your content doesn't repeat itself on different pages. Otherwise, it can result in a drop in your search results.

If you decide to duplicate text on your site, think about it seriously: if you were a search engine trying to answer a user query, would you present the user with two identical pages in the search results? No! It is useless for humans. Instead, you would offer as many different search results (SERPs) as you can find, which, accordingly, would fully meet the search needs of people.

Thus, if you duplicate content on your site, you can, and should, expect Google not to rank all of your pages. That is bad news if you care about search engine metrics and SERP presence.

Duplicate URLs You May Not Know About

Okay, let's go back and assume that we have verified that our site pages are unique. However, you may still have some "hidden" duplicate URLs that you simply do not know about (in reality, they are certainly not hidden). This may surprise you, but search engines see the following URLs as completely separate, even though they display the same content:

  • http://www.examplesite.com (noticed www?)
  • http://examplesite.com
  • https://examplesite.com (noticed https?)
  • http://www.examplesite.com/ (notice the slash at the end?)
  • http://examplesite.com/index.php

This is why we need the canonical URLs in the HEAD tag of the HTML of all your pages. You have to tell the search engines which version of all the above URLs (and other versions) they should look at.

Yes, you have to make the final decision about whether you are going to use www or not in all of your links in the web marketing process. You must adhere to the same linking strategy throughout your site, and even beyond. Everyone who uses your URLs needs to know this: the employees, the partners, the directories you are listed in, the people linking to you, everyone.

You will also need to decide if you will use a slash at the end of the URL, and if you will use https (if you accept important information on the site, such as credit card information). Pick one option and stick to it. If I were you, I would choose the one that is used most often to avoid the headache when fixing my URLs.
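Once the choice is made, it can be written down as a small normalization routine. The sketch below assumes a site that prefers https without www and treats /index.php as the directory root; the constants and the function name `preferred_url` are hypothetical, and trailing slashes on non-root paths are deliberately left untouched.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions: the site-wide decisions described above.
PREFERRED_SCHEME = "https"
USE_WWW = False


def preferred_url(url: str) -> str:
    """Map the www/non-www, http/https and index.php variants to one URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    if USE_WWW:
        host = "www." + host
    path = parts.path
    if path.endswith("/index.php"):
        path = path[: -len("index.php")]  # keep the trailing slash
    if not path:
        path = "/"
    return urlunsplit((PREFERRED_SCHEME, host, path, parts.query, ""))


if __name__ == "__main__":
    variants = [
        "http://www.examplesite.com",
        "http://examplesite.com",
        "https://examplesite.com",
        "http://www.examplesite.com/",
        "http://examplesite.com/index.php",
    ]
    for variant in variants:
        print(variant, "->", preferred_url(variant))
```

The same function can feed both the canonical link element and the 301 redirect rules, so the two never disagree.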

Fortunately, if you are using WordPress, then most of these problems can be solved. We'll look at different plugins and other things to help you deal with this.

However, there are other places where canonical URLs are very useful.

Duplicate content generated by taxonomies

Let's say you're writing an article, and you include that article in multiple blog categories with different tags in WordPress (these are called taxonomies). People always do it. Or let's say you are an e-commerce business and your products appear in numerous categories. We have a problem: content can be presented multiple times at different URLs, which makes it easier for users to navigate the site. For instance:

  • http://examplesite.com/store/candy/chocolate-truffles
  • http://examplesite.com/store/foods/chocolate-truffles

You want your users to be able to find chocolate truffles under two headings: "candy" and "foods". That's fine. But which of the two URLs should be indexed by search engines? Remember, they won't rank both URLs, so you have to choose one yourself. This is where canonical URLs come to the fore. Such URLs tell search engines, "hey, this content is exactly the same as on another page; please index that one."

Remember that no search engine is obligated to obey this canonicalization, and they may ignore it if they deem it wrong.

Using cross canonical URLs when duplicating content from other sites

There is one more important reason why you need canonical URLs, and we will talk about it below (there are others, but they are more complex, while the principle is the same). Sometimes you publish content on your site that also appears on other sites. The simplest example of such a situation is syndication (for example, press releases).

Let's say your company issues a press release and posts it on your website. This is completely normal. However, press releases work like this: they can be used free of charge by any content publisher. They are created specifically to be copied and distributed. There are even entire syndication networks like PRWeb. This is a fairly old form of marketing.

However, it creates SEO problems. For a search engine spider, the content of a press release on your site is exactly the same as the content of a press release on other news sites. How do you know where the original is? What URL to display in SERP (search results)? Remember - you have to choose it.

Usually search engines choose it on their own if you don't offer them anything. And such a proposal is made using the canonical URL. In the case of press releases, however, it is unlikely that every small news magazine will have a canonical URL pointing to your site. Remember that many simply do not know about this. I doubt they would rush to specify the original content source and HTML coding accordingly. They publish several different articles a day.

Thus, you have to take care of this on your site. If I were you, I would use the canonical URL on the page containing your press release and link to the copy on the main syndication network where you published the article for later distribution. For example, you can link to a copy of the article on PRweb.com (if you use this service).

If you want to see a live example of a non-press-release situation that also involves canonical URLs, consider an article I wrote for KISSmetrics a year ago.

Soon after, Entrepreneur.com republished this article because they had an agreement with KISSmetrics (remember, they had permission!).

We now have the same content available at two URLs. Technically, this is duplicate content, which is bad! However, don't be afraid. If you look at the source code of the article on Entrepreneur.com, you will find a rel=canonical link there that points to the original article on KISSmetrics.

This tells search engines where the original content was submitted, which is the right decision. It also removes suspicions of content theft in the eyes of search engines (who may not be aware of your legal rights to publish the work).

However, you shouldn't create an entire site of other people's articles. In this case, the canonical URL is unlikely to help you rank. So don't overuse this tactic.

When you can't use canonical URLs for external duplicate content

I want to talk about this because I often face such situations. If you want to display a company description or personal bio on your site, I do not recommend using the same words and phrases that you use on your social profiles or elsewhere on the network.

If you provide the same description that you have on LinkedIn or your Google Plus business page, then you are essentially duplicating content. You shouldn't put a canonical URL on the About page that points to your social network profile. Your About page should rank on its own. In such a case, please use a unique description for external use. I do this for all my clients.

How to use canonical URLs in WordPress

There are several ways to do this, but I'm going to show you the best one I use myself: just use the WordPress SEO plugin from Joost De Valk.

Once you install this plugin on your site, it allows you to take care of numerous SEO metrics, including canonical URLs. However, the plugin offers other settings that you need to pay attention to.

On the edit screen for a single post or page (the plugin works for custom post types as well), the WordPress SEO panel offers a ton of options and fields. To set canonical URLs that you can use for different things, such as press releases or external duplicates of content, go to the Advanced tab.

Click the dropdown to select the type of canonical URL in the head tag of all your pages.

When not to use a canonical URL

First, read the post on the Google Webmaster Central blog about common rel=canonical errors. Make sure you, or your developer, don't make them. Second, do not use canonical URLs in the following situations:

When you want to make a 301 redirect

If you want to redirect one page to another so that users who entered the old URL or clicked on the dead link are redirected to the new URL, you need to use a 301 redirect. Don't use canonical URLs for this. However, in SEO, they are often applied this way.

A redirect means that there is only one place where the content is presented, and you are forcing users to go to that page. This is suitable, for example, if you migrated your site to a new domain or set up a new URL structure due to a site redesign. You can also use a 301 redirect to send people to the www or non-www version of your site (this will make sure no one goes to your site at the wrong address).

Thanks to canonical URLs, you can have the same content on different pages on the web, and have one "original" content source. In other words, different pages containing the same content can exist and be viewed by users.

However, back in 2011, Rand Fishkin conducted an interesting experiment in which he used a canonical URL in the head of all pages on an old domain to improve the ranking of a new domain. And it worked. He described the experiment in a blog post, which also explains why canonical URLs are so important for cross-domain content syndication in the SEO world. I don't think this will work today, but you can try it as an experiment.

When you want search engines to ignore your page

Remember that rel=canonical is not a cure-all for duplicate content problems. Search engine optimization is more complicated than that, and sometimes a better solution is to use a robots file to block pages from indexing. This is why the WordPress SEO plugin includes related options.

I recommend that my clients block from indexing some pages that are not useful to searchers. For example, why would a Terms and Conditions page or a login page need to be in the index? They shouldn't be there. Better to make way for more valuable content, such as sales pages, product descriptions, and informative blog posts.

I also recommend using the noindex rule for pages with very little content (because they make your site look thin) and for archives that duplicate content. In WordPress, this applies to author archives, date archives, and, in my case, tag archives (since they contain the same content as category archives). You can also block custom post types and their archives from indexing if they only repeat content from other pages on your site.

Note: if you close something from indexing, then you will need to remove this content from the sitemap as well, otherwise it will lead to errors in Google Webmaster Tools.

Adjusting the URL to match the canonical index

Remember when we said above about choosing one version of the URL that will be used in all links from now on? Great, once you do that, you will need to "clean up" or fix the URLs on your site and elsewhere so that they link to the version of your choice. Let's say you've made the decision to use the non-www version on your site. Now you need to make sure that all external and internal links use the version you selected. If not, then you should try to change the URLs. Yes, this may seem like a rather complicated action, but it's worth it.

To quickly replace all URLs on your site, you can use a tool like Search Replace DB. However, only use it if you understand what you are doing.

There are also plugins that let you do search and replace from your WordPress dashboard. After the replacement is done, remove any tool that has direct access to your database to avoid unnecessary security risks.

To deal with dead links in posts and pages, use a plugin like Redirection to do the job for you.

When you've done all this, make sure to sign in to your Google Webmaster Tools account and set the preferred URL for your site. Also, submit both www and non-www versions of your site to Google Webmaster Tools to set preferences.

Conclusion: Use Canonical URLs for SEO Benefits

Hopefully we have cleared up the confusion around canonical links and their impact on SEO. If some aspects are still unclear, I recommend following the links provided in this article. Best of all, you now know how to use canonical URLs and that they have the potential to deliver great SEO results.

Duplicate content is a problem that not all business website owners can deal with. Sometimes they just don't have time to deal with this problem. Fortunately, search engines are realizing that sometimes the same content can be accessed at different URLs, and it's perfectly legal. The search engines have offered us a tool that we can use - so let's use it to our advantage!

Today we are going to talk about the rel="canonical" attribute and the cases in which it should be specified.

What does Rel Canonical mean?

This attribute is specified in the <link> tag and is used to designate canonical pages on the site. The canonical page is the main page, the one that will appear in search results.

Canonical history

On February 12, 2009, Google introduced the canonical attribute, which was created to rid a site of duplicate pages by specifying the required URL (canonical page).

Where to prescribe rel canonical

The link tag with the rel="canonical" attribute is written in the <head> section and can appear only once per page. This tag cannot be placed in any other section of the page.

Canonical link - what is it?

Canonical page - a page with higher priority than its duplicates.

Let's look at a specific example:

We have a canonical page that we want to point to from its duplicates.

The link tag with the rel="canonical" attribute will be the canonical link.

We can also solve this problem another way: a 301 redirect from the duplicate pages to the main (canonical) page. I'll talk about 301 redirects in the next article.

For the main page, the canonical attribute is optional, since we specify the Host directive in robots.txt.

What to do with pagination in an online store?

Do I need Canonical with product cards? Yandex has already answered this question:

"If there are a large number of products in any category on your site, pagination pages (sequential pagination) may appear, which contain all products in this category. If there is no traffic from search engines to such pages and their content is largely identical, then I advise you to configure the rel="canonical" attribute on such pages and make the pages of the second, third and further numbering non-canonical, and indicate the first page of the catalog as the canonical (main) address; only it will participate in the search results.
For example, the page site.rf/chamomile/1 is canonical, the catalog begins with it, and pages of the form site.rf/chamomile/2 and site.rf/chamomile/3 are non-canonical, so you can leave them out of the search. This will not only prevent possible duplicate content, but also allow you to tell the robot which page should be in the search results."

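Then, following this advice, the pagination pages of a category point to the first page of that category as canonical:

<link rel="canonical" href="http://site.ru/category-name/" />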



As a rule, problems with duplicates come from the platform (most often these are well-known CMSs such as Joomla, OpenCart and others). OpenCart deserves a category of its own: the problem has not been fully resolved there, or at least I could not figure it out and had to move everything to another engine. (I do not recommend this engine to anyone.)

But there are also mistakes made by the optimizer, which I describe below.

Common mistakes when setting up Canonical

Invalid server response.

The page that the link rel="canonical" tag points to must be working, that is, the server response should be 200.

Check robots.txt.

It is possible that this page is closed from indexing by search robots.

Duplication and location of the attribute.

It is important that the tag appears only once on the page and is located only inside the <head> tag.

No link chains.

All canonical links from all duplicate pages should point directly to one canonical page, so that one link does not lead to a second page, the second to a third, and so on.
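The chain check can be sketched as follows, assuming the canonical links have already been collected into a dictionary that maps each page URL to the URL in its rel="canonical" (the dictionary and the function names are hypothetical).

```python
def resolve_canonical(url, canonical_of, max_hops=10):
    """Follow canonical links from url; return (final_url, number_of_hops)."""
    seen = [url]
    current = url
    while True:
        # A page missing from the mapping is treated as canonical for itself.
        target = canonical_of.get(current, current)
        if target == current:
            return current, len(seen) - 1
        if target in seen or len(seen) > max_hops:
            raise ValueError("canonical loop or overly long chain: " + " -> ".join(seen + [target]))
        seen.append(target)
        current = target


def find_chains(canonical_of):
    """Return pages whose canonical link does not reach the final page in one step."""
    chained = {}
    for page in canonical_of:
        final, hops = resolve_canonical(page, canonical_of)
        if hops > 1:
            chained[page] = final
    return chained
```

For each page reported by `find_chains`, the fix is to point its rel="canonical" straight at the final URL returned for it.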

Rel Canonical in CMS Wordpress

You can set the Canonical tag in CMS WordPress using plugins:



Yoast SEO has limited functionality here: you can only specify the required canonical URL in a field.

All in One SEO plugin



In All in One SEO you can also specify the canonical URL, as well as disable pagination for canonical URLs (that is, the search engine will not index these pages).

Conclusion

Given the problems of many platforms, this attribute should definitely be taken into account, because it affects the indexing of your website pages.

I think I have covered most of what there is to say about the rel="canonical" attribute. Write in the comments how you use canonical links on your sites.

Search engines are very negative about duplicate content and are constantly struggling with this problem. The uniqueness of the content is its main value, and it is easy to get penalized for copies. To avoid this, you can use several methods of dealing with duplicates. In this article, we'll take a look at one of them - canonical URLs.

There are several reasons why duplicates appear. For example, a CMS can create additional copies when a page is available both with and without www. Copies are especially common in online stores, where product pages differ only in their photographs.

Canonical URL - the preferred page address, that is, the one that will be indexed out of a group of similar pages.

Canonical URL against duplicates.

Let's say there are several URLs leading to the same page:

  • mysite.ru/main
  • mysite.ru/blog/2364
  • mysite.ru/blog/page?id=2364

If we want only one of them to be indexed, we need to use the rel=canonical attribute.

For example, if the main page is mysite.ru/main, then the following line will appear in the code of the other two:

<link rel="canonical" href="http://mysite.ru/main" />

Note that search engines do not guarantee that they will follow this hint. However, if you do not specify the canonical page, the search engine may choose one itself. In that case you lose control over indexing, since the search robot will select a page at its own discretion and add it to the index.

The rel=canonical attribute should not be overused either. There have been sites that lost their positions in search results after developers mistakenly wrote the same URL in rel=canonical on all pages of the site.

How to use canonical urls correctly?

  • Select the main page (canonical).
  • Use the rel=canonical attribute to point to it from other duplicate pages. It is important to write absolute paths: http://mysite.ru/blog/page?id=2364, not /blog/page?id=2364.
  • Specify canonical pages in the Sitemap.xml file. This does not guarantee correct indexing, but it will help the crawler determine which pages should be considered main.

What's the difference between a canonical link and a 301 redirect?

The difference is in how they work. The rel=canonical attribute tells the search engine which page to index and display in search. The rest of the pages are not ranked, but remain visible to users on the site. With a 301 redirect, the user is automatically sent to the main page. In terms of weight transfer, both options pass a certain amount of weight to the canonical page.

Using rel=canonical and 301 redirects at the same time can be a bad idea. This applies to cases where you mark a page as canonical and that page, in turn, redirects to another one with a 301 redirect. Most likely, the search robot will consider this a mistake. The transferred weight may be lost within this chain, which will lead to a loss of positions in the search results. It is better not to chain canonical links, but to point to the main page in a single step.

And a few more rules

  • Do not cover canonical URLs in your robots.txt file.
  • Make sure that the main URL in Sitemap.xml and in rel=canonical match.
  • Only one canonical can be specified on a page.
  • You should not specify a canonical page from a different domain.
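The sitemap rule can be checked mechanically. The sketch below assumes the sitemap follows the standard sitemaps.org format and that the canonical links were already collected into a page-to-canonical dictionary; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of URLs listed in <loc> elements of a sitemap."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def canonicals_missing_from_sitemap(canonical_of, xml_text):
    """List canonical targets that the sitemap does not mention."""
    listed = sitemap_urls(xml_text)
    return sorted(set(canonical_of.values()) - listed)


def non_canonical_in_sitemap(canonical_of, xml_text):
    """List sitemap URLs whose own rel=canonical points somewhere else."""
    return sorted(
        url for url in sitemap_urls(xml_text)
        if canonical_of.get(url, url) != url
    )
```

An empty result from both functions means the sitemap and the rel=canonical links agree on which URLs are the main ones.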

The use of canonical URLs is optional. But if you have duplicate content, it's best to fix this problem yourself. Otherwise, the search engine will solve it in its own way.