Particular.ly

Robots.txt Checker

Check and parse a site's robots.txt file

About the Robots.txt Checker

The Robots.txt Checker fetches and parses the robots.txt file at a website's root. This plain-text file tells search engine crawlers and other bots which parts of a site they may or may not access. Following the Robots Exclusion Protocol, the file uses directives such as User-agent to target specific crawlers, Disallow and Allow to control path access, and Sitemap to point bots to the site's XML sitemap. The tool retrieves the file and presents its rules in a readable form so you can verify that they say what you intend.
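Because the file always sits at the root of the host, its address can be derived from any page URL on that host. The following is a minimal Python sketch of that step, not the checker's actual code; the function name robots_url and the example.com URLs are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the host that serves page_url.

    Hypothetical helper: keeps the scheme and host (including any port)
    and replaces the path, query, and fragment with /robots.txt.
    """
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post?id=1#top"))
print(robots_url("http://shop.example.com:8080/cart"))
```

The scheme, host, and port are kept as given, so a subdomain or a non-default port resolves to its own robots.txt address.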

The checker reads the file located at the domain's /robots.txt path and breaks it into its constituent rule groups. Each group begins with one or more User-agent lines followed by the Disallow and Allow paths that apply to those agents. The tool helps you spot common mistakes, such as an accidental Disallow: / that blocks compliant crawlers from the entire site, conflicting rules, or a missing or unreachable file. It also surfaces any declared Sitemap URLs so you can confirm crawlers are being pointed to the right place.
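The grouping described above can be sketched in a few lines of Python. This is an illustrative parser under simplifying assumptions, not the checker's implementation: the names Group, parse_groups, and blanket_disallows are hypothetical, the sample file is invented, and directives other than User-agent, Allow, Disallow, and Sitemap are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    agents: list = field(default_factory=list)  # User-agent values
    rules: list = field(default_factory=list)   # (directive, path) pairs


def parse_groups(text):
    """Split robots.txt text into rule groups plus a list of Sitemap URLs."""
    groups, sitemaps = [], []
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or malformed line
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            # Consecutive User-agent lines share one group; a User-agent
            # line that follows rules starts a new group.
            if current is None or current.rules:
                current = Group()
                groups.append(current)
            current.agents.append(value)
        elif key in ("allow", "disallow") and current is not None:
            current.rules.append((key, value))
        elif key == "sitemap":
            sitemaps.append(value)  # Sitemap lines belong to no group
    return groups, sitemaps


def blanket_disallows(groups):
    """Return the agent lists of groups that contain Disallow: /."""
    return [g.agents for g in groups if ("disallow", "/") in g.rules]


SAMPLE = """\
User-agent: Googlebot
User-agent: Bingbot
Disallow: /search
Allow: /search/help

User-agent: *
Disallow: /   # leftover staging block

Sitemap: https://example.com/sitemap.xml
"""

groups, sitemaps = parse_groups(SAMPLE)
print(blanket_disallows(groups))
print(sitemaps)
```

For this sample, the first two User-agent lines form one group, and the wildcard group is the one flagged for blocking the whole site.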

The primary use case is SEO troubleshooting: confirming that important pages are crawlable and that private or duplicate sections are correctly excluded. Site owners check robots.txt after a migration or redesign to confirm that a staging-environment block was removed before launch, since a leftover Disallow: / is a frequent cause of sudden traffic loss. Developers also use it to verify that admin areas, search-result pages, and faceted-navigation URLs are excluded from crawling so they do not consume crawl budget.
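A post-launch check of this kind can be approximated with Python's standard urllib.robotparser module. The sketch below is one possible approach, not the checker's own logic: the audit function, its parameters, the paths, and the example.com site are all hypothetical. Note that Python's parser applies rules in file order, whereas some crawlers pick the longest matching path, so results can differ for files with overlapping Allow and Disallow rules.

```python
from urllib.robotparser import RobotFileParser


def audit(robots_text, must_allow, must_block,
          agent="Googlebot", site="https://example.com"):
    """List paths whose crawlability differs from what the owner intends."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    problems = []
    for path in must_allow:
        if not parser.can_fetch(agent, site + path):
            problems.append(f"blocked but should be crawlable: {path}")
    for path in must_block:
        if parser.can_fetch(agent, site + path):
            problems.append(f"crawlable but should be blocked: {path}")
    return problems


# A staging block that was never removed.
LEFTOVER = "User-agent: *\nDisallow: /\n"
# The intended production rules.
HEALTHY = "User-agent: *\nDisallow: /admin/\nDisallow: /search\n"

important = ["/", "/products/"]
private = ["/admin/", "/search/results"]

print(audit(LEFTOVER, important, private))
print(audit(HEALTHY, important, private))
```

With the leftover staging file, both important paths are reported as blocked; with the production rules, the list of problems is empty.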

Keep in mind that robots.txt is advisory: well-behaved crawlers respect it, but it does not enforce security, will not stop a determined bot, and will not keep a page out of the index if other pages link to it. For genuinely private content, use authentication; to keep a public page out of search results, use a noindex meta tag. Pair this checker with a Sitemap validator to confirm your declared sitemap is healthy, a Meta Tags inspector to review page-level indexing directives, and an HTTP Headers check to look for X-Robots-Tag directives that complement the file.
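The page-level signals mentioned above live outside robots.txt, in the HTML and in the response headers. As a rough Python sketch, the function below collects directives from a robots meta tag and from an X-Robots-Tag header; the names RobotsMetaParser and page_directives are hypothetical, and the sketch assumes the simple comma-separated form only, so crawler-specific variants (for example a header value prefixed with a bot name) are not interpreted.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            self.directives.update(_split(attr.get("content", "")))


def _split(value):
    """Split a comma-separated directive list into lowercase tokens."""
    return {token.strip().lower() for token in value.split(",") if token.strip()}


def page_directives(html, headers):
    """Return directives found in the HTML and in X-Robots-Tag headers."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = set(parser.directives)
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            found |= _split(value)
    return found


html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print("noindex" in page_directives(html, {}))
print("noindex" in page_directives("<p>hello</p>", {"X-Robots-Tag": "noindex"}))
```

Both calls report a noindex directive, the first from the meta tag and the second from the header.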

Frequently asked questions

Where does robots.txt have to be located?
It must live at the root of the host, at the exact path /robots.txt. Crawlers only look there; a file placed in a subdirectory or given a different name will be ignored, and the site will be treated as having no robots rules. Each subdomain needs its own file, because the rules apply only to the host that serves them.
Does robots.txt prevent a page from being indexed?
Not reliably. It controls crawling, not indexing. A disallowed URL can still appear in search results if other sites link to it, since the crawler can see the link without fetching the page. To keep a page out of the index, use a noindex meta tag or password protection. A noindex tag only works if crawlers can fetch the page, so do not also disallow that URL in robots.txt.
What does Disallow: / do?
It blocks compliant crawlers from accessing the entire site. This is appropriate for staging environments but is a common and damaging mistake if left in place after launch, as it can remove a site from search results and cause a sharp drop in organic traffic.
Can I point search engines to my sitemap from robots.txt?
Yes. Adding a Sitemap directive with the full URL of your XML sitemap helps crawlers discover all your important pages. You can include multiple Sitemap lines, and the directive works independently of the User-agent groups in the file.
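To illustrate the last answer, Python's standard urllib.robotparser module (version 3.8 or later) exposes declared sitemaps through site_maps(). The sample file below is invented, and the snippet is a small sketch rather than the checker's own code; it shows that Sitemap lines are collected whether they appear next to a User-agent group or on their own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: one Sitemap line sits right after a group's rules,
# the other stands alone after a blank line.
SAMPLE = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml

Sitemap: https://example.com/news-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# site_maps() returns a list of URLs, or None when no Sitemap line exists.
sitemaps = parser.site_maps()
print(sitemaps)
```

Both URLs are returned in file order, and the Disallow rule in the wildcard group is unaffected by the Sitemap lines around it.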