Sitemap Checker
About the Sitemap Checker
The Sitemap Checker fetches and parses an XML sitemap or sitemap index file from any URL, then breaks it down into a structured, readable list of the URLs it contains. It understands the standard sitemaps.org protocol, including the optional lastmod, changefreq, and priority elements, as well as sitemap index files that point to multiple child sitemaps. This lets you confirm at a glance that the file search engines rely on for crawling is well-formed and contains the pages you expect.
Under the hood, the tool retrieves the file and validates its structure against the XML sitemap schema, distinguishing between a urlset (a list of page URLs) and a sitemapindex (a list of nested sitemaps). It reports the total URL count, surfaces malformed entries, and flags common problems such as exceeding the per-file limits that Google and Bing enforce: 50,000 URLs or 50MB uncompressed. Because sitemaps are often generated automatically by a CMS or framework, silent breakage is easy to miss without a parser like this.
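The checks described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the tool's actual implementation; the function name inspect_sitemap and the shape of the returned report are assumptions made for the example, and it only checks the root element, the loc entries, and the two per-file limits rather than performing full schema validation.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000                 # per-file entry limit
MAX_BYTES = 50 * 1024 * 1024      # 50MB uncompressed


def inspect_sitemap(xml_bytes: bytes) -> dict:
    """Classify a sitemap document and report basic problems (hypothetical helper)."""
    root = ET.fromstring(xml_bytes)

    # Namespaced tags look like "{namespace}urlset"; keep only the local name.
    kind = root.tag.split("}")[-1]
    if kind not in ("urlset", "sitemapindex"):
        raise ValueError(f"unexpected root element: {root.tag}")

    # A urlset holds <url> entries, a sitemapindex holds <sitemap> entries.
    child = "url" if kind == "urlset" else "sitemap"

    locs = []
    malformed = 0
    for entry in root.findall(f"{{{SITEMAP_NS}}}{child}"):
        loc = entry.findtext(f"{{{SITEMAP_NS}}}loc")
        if loc and loc.strip():
            locs.append(loc.strip())
        else:
            malformed += 1  # entry without a usable <loc>

    problems = []
    if len(locs) + malformed > MAX_URLS:
        problems.append("more than 50,000 entries in one file")
    if len(xml_bytes) > MAX_BYTES:
        problems.append("file larger than 50MB uncompressed")

    return {
        "kind": kind,
        "count": len(locs),
        "malformed": malformed,
        "locs": locs,
        "problems": problems,
    }
```

Passing the raw bytes of a fetched file to inspect_sitemap returns the document type, the number of valid entries, and a list of any limit violations. Fetching the file and decompressing a .gz sitemap are left out to keep the sketch self-contained.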
Common use cases include verifying a freshly deployed sitemap before submitting it in Google Search Console, auditing whether stale or noindex pages are still being advertised to crawlers, and confirming that a sitemap index correctly references all of its child files. SEO teams also use it after a site migration to make sure redirected or removed URLs are no longer listed. Re-run the check after every major content change so the discovery layer stays accurate.
A practical tip: a sitemap should only list canonical, indexable URLs that return a 200 status code, so cross-check entries with the Link Checker and against your robots.txt rules. Keep lastmod values honest, because search engines learn to distrust sitemaps that report fresh dates on unchanged pages. If your site is large, split URLs across multiple sitemaps and reference them from a single index file rather than packing everything into one oversized document.
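Splitting a large URL list across several sitemaps plus one index file might look like the following. This is a rough sketch under stated assumptions: the function name build_sitemaps, the sitemap-N.xml file naming, and the base_url parameter are all hypothetical choices for illustration, and the example only chunks by URL count, not by byte size.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit


def build_sitemaps(urls, base_url, max_urls=MAX_URLS):
    """Return (index_xml, files), where files maps a file name to sitemap XML bytes.

    Hypothetical helper: file names and layout are illustrative only.
    """
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)

    for start in range(0, len(urls), max_urls):
        name = f"sitemap-{start // max_urls + 1}.xml"

        # One <urlset> per chunk of at most max_urls URLs.
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        files[name] = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)

        # Reference the child sitemap from the index file.
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url.rstrip('/')}/{name}"

    index_xml = ET.tostring(index, encoding="utf-8", xml_declaration=True)
    return index_xml, files
```

For example, calling build_sitemaps(urls, "https://example.com") with 120,000 URLs would produce three child sitemaps and an index that references each of them. A production generator would also need to watch the 50MB uncompressed size limit and emit honest lastmod values.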
Frequently asked questions
- What is the difference between a sitemap and a sitemap index?
- A regular sitemap (urlset) lists individual page URLs, while a sitemap index (sitemapindex) lists the locations of multiple child sitemaps. Large sites use an index to stay under the per-file limits.
- How many URLs can a single sitemap contain?
- A single XML sitemap can hold up to 50,000 URLs and must not exceed 50MB uncompressed. Beyond that, split the URLs into several sitemaps referenced by a sitemap index.
- Should I include non-canonical or noindex pages in my sitemap?
- No. A sitemap should only list canonical, indexable URLs that return a 200 status. Listing redirected, blocked, or noindex pages wastes crawl budget and erodes the sitemap's trustworthiness.
- Does having a sitemap guarantee my pages get indexed?
- No. A sitemap helps search engines discover pages, but indexing still depends on content quality, crawlability, and the canonical signals on each page.