Meta Robots Tag
An HTML meta tag that controls how search engines crawl and index a page.
The Definition
The meta robots tag is an HTML element placed in the <head> section that provides directives to search engine crawlers. Common values include 'index/noindex' (whether to include the page in search results), 'follow/nofollow' (whether to follow links on the page), and 'noarchive' (do not show a cached copy of the page). When no directive is given, crawlers default to 'index, follow'.
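As a rough illustration of how these directives can be read from a page, the sketch below uses Python's standard html.parser module. The class name RobotsMetaParser, the helper extract_robots_directives, and the sample page are hypothetical names chosen for this example, and it only looks at the generic name="robots" tag, not crawler-specific ones such as name="googlebot".

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # Directives are comma-separated and case-insensitive.
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.append(directive)


def extract_robots_directives(html):
    """Return the list of robots directives found in the given HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical sample page used only for this example.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(extract_robots_directives(page))  # ['noindex', 'follow']
```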
Why It Matters
Incorrect meta robots directives can accidentally block important pages from being indexed or prevent search engines from following valuable internal links. A single misplaced 'noindex' tag can remove a high-traffic page from search results entirely.
Best Practices
Use noindex only on pages you genuinely want excluded from search results — tag pages, internal search results, admin pages
Check for X-Robots-Tag HTTP headers that may conflict with or override your HTML meta robots directives
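Verify robots directives align with your canonical strategy and sitemap: a noindex page that is listed in the sitemap or named as a canonical target sends search engines mixed signals
Use nofollow sparingly on internal links: PageRank assigned to a nofollowed link is discarded rather than redirected to the page's other links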
Test robots directives in staging environments before deploying to production to avoid accidentally deindexing pages
Regularly audit robots directives after CMS or theme updates, which frequently add unwanted noindex tags
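The X-Robots-Tag check mentioned above can be sketched as a small helper that merges the header value with the meta tag content. This sketch assumes the most restrictive directive wins when the two disagree, which is how Google documents its handling of conflicting rules; other crawlers may differ. The function names are hypothetical, and user-agent prefixes in the header (for example "googlebot: noindex") are not handled.

```python
def parse_directives(value):
    """Split a comma-separated directive string into a lowercase set."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


def effective_directives(meta_content, x_robots_tag):
    """Combine meta robots content with an X-Robots-Tag header value.

    Assumption: the most restrictive directive from either source applies.
    """
    combined = parse_directives(meta_content) | parse_directives(x_robots_tag)
    # "none" is shorthand for "noindex, nofollow".
    if "none" in combined:
        combined |= {"noindex", "nofollow"}
    return {
        "index": "noindex" not in combined,
        "follow": "nofollow" not in combined,
    }


# The HTML says "index, follow" but the header says "noindex":
# under the stated assumption the page is not indexed.
print(effective_directives("index, follow", "noindex"))
```

A result with "index" set to False for a page you expect to rank is a signal to inspect the server or CDN configuration as well as the page template.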
Mistakes to Avoid
1. Leaving a staging environment noindex directive in place when migrating to production
2. Using noindex on paginated pages instead of proper canonical and pagination markup
3. Setting noindex, follow and expecting links to be followed; Google may eventually stop crawling noindex pages entirely
4. Conflicting directives between meta tags and HTTP headers causing unpredictable indexing behavior
Audit Checks
How Digispot AI identifies and fixes related issues
No meta robots tag found on the page
Impact: Without a meta robots tag, search engines fall back to their defaults (index, follow) rather than instructions chosen for the page, which can lead to:
- Unintended indexing of private or sensitive pages
- Incorrect handling of duplicate content
- Wasted crawl budget on unimportant pages
Add a meta robots tag with appropriate directives (index/noindex, follow/nofollow) based on the page's purpose.
Invalid or unknown directives in meta robots tag
Impact: Search engines typically ignore directives they do not recognize, so a misspelled directive (for example "no-index") may silently have no effect
Ensure all directives in the meta robots tag are valid (e.g., noindex, nofollow)
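A check like this one could be approximated as follows. The allow-list below is an assumption based on the directives Google documents and is not exhaustive; other search engines recognise additional values. The function name unknown_directives is hypothetical.

```python
# Assumed allow-list (not exhaustive); adjust for the crawlers you target.
KNOWN_DIRECTIVES = {
    "all", "index", "follow", "noindex", "nofollow", "none",
    "noarchive", "nosnippet", "notranslate", "noimageindex",
    "indexifembedded",
}
# Directives that carry a value after a colon, e.g. "max-snippet:50".
KNOWN_PREFIXES = (
    "max-snippet:", "max-image-preview:",
    "max-video-preview:", "unavailable_after:",
)


def unknown_directives(content):
    """Return directives in a robots content string that are not recognised."""
    unknown = []
    for part in content.split(","):
        directive = part.strip().lower()
        if not directive:
            continue
        if directive in KNOWN_DIRECTIVES or directive.startswith(KNOWN_PREFIXES):
            continue
        unknown.append(directive)
    return unknown


print(unknown_directives("noindex, no-follow, max-snippet:50"))  # ['no-follow']
```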
Conflicting directives found in meta robots tag
Impact: Search engines may interpret the directives incorrectly, leading to unintended crawling/indexing behavior
Remove conflicting directives. Use either index or noindex, and either follow or nofollow.
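Detecting this kind of conflict within a single tag is straightforward. The sketch below is one possible approach, with the hypothetical helper find_conflicts reporting each opposing pair that appears together in the same content value.

```python
# Pairs of directives that contradict each other.
CONFLICTING_PAIRS = [("index", "noindex"), ("follow", "nofollow")]


def find_conflicts(content):
    """Return the opposing directive pairs present in one robots content string."""
    directives = {part.strip().lower() for part in content.split(",") if part.strip()}
    return [
        pair for pair in CONFLICTING_PAIRS
        if pair[0] in directives and pair[1] in directives
    ]


print(find_conflicts("index, noindex, follow"))  # [('index', 'noindex')]
```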
Use of the "noarchive" directive in the meta robots tag
Impact: The "noarchive" directive may prevent search engines from storing cached copies of the page
Consider removing the "noarchive" directive unless caching needs to be restricted
Excessive use of directives in the meta robots tag
Impact: Using too many directives may complicate the page’s crawl and index behavior
Limit the meta robots tag to essential directives for optimal SEO control
Redundant "none" directive in meta robots tag
Impact: The "none" directive is equivalent to "noindex, nofollow", so it is redundant alongside those directives and contradictory alongside "index" or "follow"
Remove the "none" directive from the meta robots tag
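Because "none" is shorthand for "noindex, nofollow", a check for it can distinguish directives that merely repeat it from directives that contradict it. The following is a minimal sketch with a hypothetical helper name, check_none_directive, and it returns None when "none" is absent.

```python
def check_none_directive(content):
    """Classify the directives that accompany "none" in a robots content string."""
    directives = [part.strip().lower() for part in content.split(",") if part.strip()]
    if "none" not in directives:
        return None
    others = [d for d in directives if d != "none"]
    return {
        # Already implied by "none".
        "redundant": [d for d in others if d in ("noindex", "nofollow")],
        # Opposite of what "none" requests.
        "contradictory": [d for d in others if d in ("index", "follow", "all")],
    }


print(check_none_directive("none, noindex, follow"))
# {'redundant': ['noindex'], 'contradictory': ['follow']}
```

When the contradictory list is not empty, decide which behavior is intended before removing anything, since dropping "none" changes what the tag asks crawlers to do.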