Robots.txt

A text file that tells search engine crawlers which pages they can and cannot access.

The Definition

Robots.txt is a plain text file placed at the root of your website (e.g., example.com/robots.txt) that tells web crawlers which URLs they may and may not crawl. It follows the Robots Exclusion Protocol and can specify rules per user agent. Some crawlers also honor a non-standard Crawl-delay directive, although Google ignores it.
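As a rough illustration of per-user-agent rules, the sketch below parses a made-up robots.txt with Python's standard urllib.robotparser module. The file content, the ExampleBot name, and the is_allowed helper are assumptions for this example only. The standard-library parser applies the first matching rule and does not implement wildcard patterns, so its answers may differ from those of a particular search engine.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
Crawl-delay: 10

User-agent: ExampleBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
# parse() accepts an iterable of lines, so no network request is needed.
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the parsed rules let `user_agent` crawl `url`."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("Googlebot", "https://example.com/blog/post"),
        ("Googlebot", "https://example.com/admin/users"),
        ("ExampleBot", "https://example.com/blog/post"),
    ]
    for agent, url in checks:
        print(agent, url, is_allowed(agent, url))
    # crawl_delay() reports the value from the matching group, if any.
    print("Crawl-delay for Googlebot:", parser.crawl_delay("Googlebot"))
    # site_maps() requires Python 3.8 or newer.
    print("Sitemaps:", parser.site_maps())
```

Here Googlebot falls back to the "*" group because no group names it, while ExampleBot matches its own group and is blocked everywhere.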

Why It Matters

A misconfigured robots.txt can block search engines from crawling critical pages, or waste crawl budget by allowing access to low-value URLs. Crawlers typically request it before fetching any other URL on your site, making it a foundational element of technical SEO.

Best Practices

  • Test robots.txt changes before deploying, then confirm the live file in the Google Search Console robots.txt report (which replaced the older robots.txt tester)

  • Include a Sitemap directive pointing to your XML sitemap for efficient crawl discovery

  • Be specific with Disallow rules: an overly broad pattern like Disallow: / blocks your entire site

  • Do not use robots.txt to hide sensitive content — use proper authentication instead, as URLs can still appear in search results

  • Keep your robots.txt file at the root of each host (every subdomain needs its own file) and ensure it loads within 5 seconds

  • Review crawl stats in GSC to identify if your robots.txt is blocking important crawl paths
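One way to test a change before deploying it is to run the proposed file against a list of URLs that must stay crawlable and a list that must stay blocked. The sketch below does this with Python's standard urllib.robotparser. The check_proposed_rules function, the URL lists, and the default "Googlebot" user agent are assumptions for illustration. The standard-library parser uses first-match semantics without wildcard support, so treat the result as a sanity check rather than a prediction of any search engine's behavior.

```python
from urllib.robotparser import RobotFileParser


def check_proposed_rules(robots_text, must_allow, must_block,
                         user_agent="Googlebot"):
    """Return a list of human-readable problems found in `robots_text`.

    `must_allow` and `must_block` are hypothetical URL lists that you
    would maintain for your own site.
    """
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())

    problems = []
    for url in must_allow:
        if not parser.can_fetch(user_agent, url):
            problems.append(f"blocked but should be crawlable: {url}")
    for url in must_block:
        if parser.can_fetch(user_agent, url):
            problems.append(f"crawlable but should be blocked: {url}")
    # site_maps() (Python 3.8+) returns None when no Sitemap line exists.
    if not parser.site_maps():
        problems.append("no Sitemap directive found")
    return problems


if __name__ == "__main__":
    proposed = """\
User-agent: *
Disallow: /search
Disallow: /assets/
Sitemap: https://example.com/sitemap.xml
"""
    report = check_proposed_rules(
        proposed,
        must_allow=["https://example.com/",
                    "https://example.com/assets/app.css"],
        must_block=["https://example.com/search?q=shoes"],
    )
    for problem in report:
        print(problem)
```

In this example the Disallow: /assets/ rule is flagged because it would also block a stylesheet that the page needs for rendering.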

Mistakes to Avoid

  1. Blocking CSS and JavaScript files that Googlebot needs to render your pages properly

  2. Using robots.txt to prevent indexing: it only blocks crawling, not indexing. Use noindex for that, and leave the page crawlable so the crawler can actually see the noindex

  3. Having syntax errors like missing colons or misspelled directive names that cause rules to be ignored

  4. Forgetting to update robots.txt after site restructuring, so that stale rules block new URL patterns
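Syntax mistakes such as missing colons can be caught with a small lint pass before the file goes live. The following is a minimal sketch, not a full validator: the lint_robots_txt function and the KNOWN_DIRECTIVES set are assumptions for this example, and the set covers only the directives mentioned on this page.

```python
# Directive names this sketch recognizes (an assumption, not a full list).
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def lint_robots_txt(text):
    """Return a list of (line_number, message) tuples for suspicious lines."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop comments and surrounding whitespace; skip empty lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if ":" not in line:
            problems.append((number, "missing colon"))
            continue
        name, value = line.split(":", 1)
        name = name.strip().lower()
        if name not in KNOWN_DIRECTIVES:
            problems.append((number, f"unknown directive '{name}'"))
        elif name == "user-agent":
            seen_user_agent = True
            if not value.strip():
                problems.append((number, "User-agent has no value"))
        elif name in {"allow", "disallow"} and not seen_user_agent:
            problems.append((number, "rule appears before any User-agent line"))
    return problems


if __name__ == "__main__":
    broken = """\
User-agent: *
Disallow /admin/
Disalow: /cart/
Allow: /
"""
    for line_number, message in lint_robots_txt(broken):
        print(f"line {line_number}: {message}")
```

The sample input contains one line without a colon and one misspelled directive, so both are reported with their line numbers.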

Audit Checks

How Digispot AI identifies and fixes related issues

critical

No robots.txt file found at the expected location

Impact: Without a robots.txt file, crawlers treat the whole site as crawlable, including low-priority pages, which can waste crawl budget and affect SEO

Add a robots.txt file with appropriate rules for your website

critical

No major crawler found in robots.txt

Impact: Crawlers may not crawl the website

Ensure that major crawlers are allowed in robots.txt

critical

Critical bot blocked by robots.txt

Impact: Critical bot may not crawl the website

Ensure that the critical bot is allowed in robots.txt

critical

No user agents defined in robots.txt

Impact: Crawlers may not be able to crawl the website

Add user agents to robots.txt

high

Syntax errors found in robots.txt

Impact: Crawlers may ignore invalid rules, leading to unintended content exposure

Correct syntax errors in robots.txt to ensure proper rule application

high

Sitemap is not referenced in robots.txt

Impact: Search engines may not be able to efficiently crawl the website

Add a sitemap reference in robots.txt
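A few of the checks above can be approximated in a short script. This is only a sketch of the general idea and is not Digispot AI's implementation: the audit_robots_txt function, the finding messages, and the default list of critical bots are assumptions. It relies on Python's standard urllib.robotparser, whose first-match behavior may differ from a given search engine's.

```python
from urllib.robotparser import RobotFileParser


def audit_robots_txt(text, critical_bots=("Googlebot", "Bingbot")):
    """Return a list of findings for the robots.txt content in `text`."""
    findings = []

    # Check: no user agents defined.
    has_user_agent = any(
        line.split("#", 1)[0].strip().lower().startswith("user-agent:")
        for line in text.splitlines()
    )
    if not has_user_agent:
        findings.append("no user agents defined")

    parser = RobotFileParser()
    parser.parse(text.splitlines())

    # Check: a critical bot cannot crawl the site root.
    for bot in critical_bots:
        if not parser.can_fetch(bot, "/"):
            findings.append(f"critical bot blocked: {bot}")

    # Check: sitemap not referenced (site_maps() needs Python 3.8+).
    if not parser.site_maps():
        findings.append("sitemap is not referenced")

    return findings


if __name__ == "__main__":
    sample = """\
User-agent: *
Disallow: /
Sitemap: https://example.com/sitemap.xml
"""
    for finding in audit_robots_txt(sample):
        print(finding)
```

The "file not found" check is left out because it needs an HTTP request to the live site rather than an inspection of the file content.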