Have You Implemented The Noindex Tag On Content That Doesn't Need To Be Indexed?

If you have a site of a few hundred pages, you will likely have pages that exist with the sole purpose of serving the user. These pages might have very little content, or even duplicate content.

  • For instance, the login page for your internal team or customer base might be a live page with very little content.
  • A search or filter page might need to be kept out of Google's index, as it can lead to thousands of pages with duplicate content.
  • It might be the checkout page of an e-commerce site.
  • For a blog, it might be WordPress tag pages.
  • For a news site, it might be the comments page for each news item.
  • Or you simply don't want something openly discoverable on search engines.

Ask Yourself These Questions

  • Which pages have the least amount of content?
  • Are there any files on your server which should not be part of Google's index?
  • Which pages are designed purely to serve the user?
  • Which folders or sections within your website look spammy, for example pages stuffed with too many links?
  • Are your file uploads creating separate HTML pages with their own URLs?
  • Do you want to keep your user profiles from showing up in search?
  • Are any pages creating an infinite set of URLs, such as filters, search pages, or session IDs?

Applying The Noindex Tag

In each of these cases, you will want to use the noindex directive, which tells Google not to index that particular page. There are two ways you can apply it.

  1. Meta Noindex Tag
  2. X-Robots-Tag HTTP Header

Meta Noindex Tag

The meta noindex directive is a tag that sits in the <head> section of a page, so it works exclusively for HTML pages. The code typically looks like this:

<meta name="robots" content="noindex">

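If you need to target one crawler specifically, the robots meta tag can address it by name instead of using "robots". As a minimal sketch, a tag that keeps only Google's crawler from indexing the page would look like this:

<meta name="googlebot" content="noindex">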

X-Robots-Tag

But not all pages are HTML. To keep PDFs, images, and other files out of a search engine's index, we use an HTTP header called X-Robots-Tag.

How you write your X-Robots-Tag rule depends on the type of server you run and the type of file you want to block.
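Because X-Robots-Tag is an HTTP header rather than an HTML tag, it is sent along with the server's response for the file. As an illustration, a response serving a blocked PDF might look something like this (the surrounding headers are just an assumed example):

HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex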

If you want to block PDF files on your Apache server, the code below does the job.

<FilesMatch "\.(pdf)$">
Header set X-Robots-Tag "noindex, noarchive, nosnippet"
</FilesMatch>
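You can verify that the header is actually being sent by requesting the file's headers from the command line. The URL below is just a hypothetical example:

curl -I https://example.com/files/guide.pdf

The response should include a line like X-Robots-Tag: noindex, noarchive, nosnippet.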

If you want to block PDF files on an Nginx server, use the code below instead.

location ~* \.(pdf)$ {
add_header X-Robots-Tag "noindex, noarchive, nosnippet";
}

If you want to keep all images on your site private, you can apply the code below to block search engines from indexing them.

<Files ~ "\.(png|jpe?g|gif)$">
Header set X-Robots-Tag "noindex"
</Files>
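If your site runs on Nginx instead of Apache, a sketch of the equivalent rule would be:

location ~* \.(png|jpe?g|gif)$ {
add_header X-Robots-Tag "noindex";
}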

Note that applying the noindex tag doesn't mean search engines will stop crawling those pages; it only keeps them out of the index. To stop bots from crawling specific files and folders altogether, use a robots.txt file to disallow them. Be careful not to combine the two for the same URL, though: if robots.txt blocks a page, the crawler never fetches it and therefore never sees the noindex directive. Follow this tutorial to see how.
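As a minimal sketch, a robots.txt that disallows crawling of internal search results and an uploads folder might look like this (the paths here are assumed examples, not specific recommendations for your site):

User-agent: *
Disallow: /search/
Disallow: /uploads/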

Keep Learning

Block Search Indexing using Noindex Tag

Robots meta tag and X-Robots-Tag HTTP header specifications

Robots.txt Noindex: The Best Kept SEO Secret
