Sitemap Errors in Google Search Console: What Each One Means and How to Fix It

When Google Search Console flags your sitemap with "Couldn't fetch," "Sitemap could not be read," or "Has errors," it usually isn't telling you the whole story — it's giving you a one-line status for a problem that could have a dozen causes. The good news: each status maps to a short list of real issues, and most are fixable in a few minutes once you know what you're looking at.

Here's every common Search Console sitemap status, what's actually breaking behind it, and how to fix it.

"Couldn't fetch"

This is the most common — and most misread — status. It does not mean your sitemap is malformed. It means Google tried to request the URL you submitted and didn't get a clean response back. The XML itself might be perfect.

Walk these in order:

  • Wrong URL. You submitted sitemap.xml but the file lives at sitemap_index.xml or /sitemaps/sitemap.xml. Open the exact URL you submitted in an incognito browser. If you don't see XML, that's your answer.
  • It's blocked by robots.txt. A Disallow rule can stop Googlebot from fetching the sitemap even though it loads fine for you. Check that your robots.txt isn't disallowing the sitemap's path, and add a Sitemap: line pointing to the full URL.
  • A redirect. If the sitemap URL 301s or 302s to another location, Google often reports "Couldn't fetch." Submit the final destination URL directly — no redirect hops.
  • It returns the wrong status code. The sitemap must return a 200 OK. A 403, a 5xx, or a page that loads visually but serves a 404 header will fail, and so will a soft 404 (an error page served with a 200 in place of the file). Check the real HTTP status with the HTTP Header Checker.
  • It's freshly submitted. Sometimes "Couldn't fetch" is just timing. If everything above checks out, wait 24-48 hours and recheck before chasing ghosts.

The fastest triage: load the submitted URL yourself, confirm it returns raw XML with a 200 status, and confirm robots.txt isn't blocking it.
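
If you'd rather script that triage, here is a minimal Python sketch using only the standard library. The function names (robots_allows, triage_fetch) are made up for illustration, and the functions work on values you have already collected: the robots.txt text, plus the status code, Content-Type header, and body returned for the sitemap URL. It approximates the checks above and is not a reproduction of what Googlebot does.

```python
from urllib.robotparser import RobotFileParser


def robots_allows(robots_txt: str, sitemap_url: str, agent: str = "Googlebot") -> bool:
    """Return True if this robots.txt text lets `agent` fetch the sitemap URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, sitemap_url)


def triage_fetch(status: int, content_type: str, body: bytes) -> list[str]:
    """List likely reasons a sitemap response would not count as a clean fetch."""
    problems = []
    if 300 <= status < 400:
        problems.append(f"redirect ({status}): submit the final URL instead")
    elif status != 200:
        problems.append(f"status {status}: the sitemap must return 200 OK")
    if "xml" not in content_type.lower():
        problems.append(f"Content-Type {content_type!r} is not application/xml or text/xml")
    # An HTML page here usually means an error page or an HTML sitemap.
    head = body[:512].lower()
    if b"<html" in head or b"<!doctype html" in head:
        problems.append("body looks like HTML, not XML")
    return problems


robots = "User-agent: *\nDisallow: /sitemaps/\n"
print(robots_allows(robots, "https://example.com/sitemaps/sitemap.xml"))
print(triage_fetch(301, "text/html", b"<html><body>Moved</body></html>"))
```

To see a 301 or 302 at all, fetch the sitemap with redirects turned off in whatever HTTP client you use; most clients follow them silently and report only the final status.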

"Sitemap could not be read" / "Has errors"

Now the file was fetched, but Google can't parse it. This is a formatting problem in the XML. The usual culprits:

  • It's not actually XML. A server misconfiguration serving the sitemap as text/html, or an error page returned in place of the file, breaks parsing immediately. The Content-Type should be application/xml or text/xml.
  • Invalid characters. A raw ampersand (&), <, or > inside a URL will break the XML. These must be escaped — & becomes &amp;, for example. One unescaped character anywhere in the file fails the whole sitemap.
  • Missing or wrong namespace. The opening <urlset> tag needs xmlns="http://www.sitemaps.org/schemas/sitemap/0.9". Drop it or typo it and the file won't validate.
  • Malformed dates. <lastmod> values have to be valid W3C datetime format — 2026-06-22 or 2026-06-22T14:30:00+00:00. A date like 06/22/2026 or June 22, 2026 is a parse error.
  • Byte order mark (BOM) or whitespace before the declaration. The <?xml version="1.0" encoding="UTF-8"?> line must be the very first thing in the file. A stray space, blank line, or invisible BOM in front of it can break parsing.

Before resubmitting, run the file through the XML Sitemap Validator — paste your sitemap and it flags unescaped characters, invalid dates, namespace problems, duplicate entries, and size-limit violations so you fix everything in one pass instead of resubmitting blind and waiting.
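
For a quick local check of the same failure modes, the sketch below uses Python's standard library. It is an approximation under stated assumptions, not how Google or the validator parses the file: check_sitemap is a hypothetical helper, and the date pattern only tests the shape of a W3C datetime, not whether the date is real.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Shape of a W3C datetime: YYYY, YYYY-MM, YYYY-MM-DD, or a timestamp with a timezone.
W3C_DATE = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)


def check_sitemap(raw: bytes) -> list[str]:
    """Return a list of formatting problems found in raw sitemap bytes."""
    problems = []
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 byte order mark")
        raw = raw[3:]
    if not raw.startswith(b"<?xml"):
        problems.append("XML declaration is not the first thing in the file")
    try:
        # Strip leading whitespace so the remaining checks can still run.
        root = ET.fromstring(raw.lstrip())
    except ET.ParseError as exc:
        # Unescaped &, <, or > inside a URL ends up here.
        problems.append(f"not well-formed XML: {exc}")
        return problems
    expected = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in expected:
        problems.append(f"unexpected root element or namespace: {root.tag}")
    for lastmod in root.iter(f"{{{SITEMAP_NS}}}lastmod"):
        value = (lastmod.text or "").strip()
        if not W3C_DATE.match(value):
            problems.append(f"invalid <lastmod>: {value!r}")
    return problems


sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://example.com/?a=1&amp;b=2</loc>"
    b"<lastmod>06/22/2026</lastmod></url></urlset>"
)
print(check_sitemap(sample))
```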

"Sitemap is HTML"

A specific flavor of the parse error worth calling out, because the fix is different. Google fetched your URL and got an HTML page instead of XML. This almost always means one of two things: you submitted an HTML sitemap (the human-readable list of links some CMSs generate) instead of the XML one, or your server returned an error page. Find your real XML sitemap — in WordPress it's typically /sitemap_index.xml (Yoast/Rank Math) or /wp-sitemap.xml (core). Submit that, not the HTML version.

URL and Size Limit Errors

Even a valid sitemap fails if it breaks Google's hard limits:

  • 50,000 URLs max per sitemap file.
  • 50 MB uncompressed max file size.

Cross either and Google rejects it. The fix is a sitemap index — a parent file that lists multiple smaller sitemaps, each under the limits. Split by content type (sitemap-posts.xml, sitemap-pages.xml, sitemap-products.xml) and reference them all from one sitemap_index.xml. Submit only the index in Search Console; Google crawls the children automatically. If you're building these by hand, the Sitemap Generator produces correctly formatted files, and the validator confirms each stays under the caps.
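
As a rough illustration of that split, the Python sketch below chunks a URL list into child sitemaps and writes an index that points at them. The file names (sitemap-1.xml and so on), the base_url parameter, and both function names are assumptions made for the example; it chunks by count rather than by content type and emits only <loc>, so treat it as a starting point.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB, uncompressed


def build_sitemap(urls):
    """Serialize one <urlset>; ElementTree escapes & and < in the URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    data = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    if len(data) > MAX_BYTES:
        raise ValueError("sitemap exceeds 50 MB; use a smaller chunk size")
    return data


def split_into_sitemaps(urls, base_url, chunk_size=MAX_URLS):
    """Return {filename: xml_bytes} for the child sitemaps plus sitemap_index.xml."""
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n, start in enumerate(range(0, len(urls), chunk_size), start=1):
        name = f"sitemap-{n}.xml"
        files[name] = build_sitemap(urls[start:start + chunk_size])
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = f"{base_url}/{name}"
    files["sitemap_index.xml"] = ET.tostring(
        index, encoding="utf-8", xml_declaration=True
    )
    return files


urls = [f"https://example.com/p?id={i}&ref=nav" for i in range(5)]
files = split_into_sitemaps(urls, "https://example.com", chunk_size=2)
print(sorted(files))
```

Building the XML with a serializer rather than string concatenation also takes care of the escaping problem described earlier: the raw & in each URL comes out as &amp; in the file.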

"Discovered URLs" Lower Than Expected

The sitemap reads fine, but Search Console shows far fewer URLs discovered than you submitted. This isn't a parse error — it's a content mismatch. Common reasons:

  • Relative URLs. Every <loc> must be an absolute URL — https://example.com/page, not /page. Relative paths get silently dropped.
  • Mixed domains or protocols. URLs in the sitemap must match the property exactly. Listing http:// URLs in an https:// property, or www URLs in a non-www property, causes them to be ignored. Pick one canonical form and stick to it everywhere — see noindex vs disallow for how indexing directives interact with this.
  • Duplicate entries. The same URL listed twice doesn't add coverage. Google counts it once, so the discovered total comes in below the number of entries in the file.
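
To find out which of these is shrinking your count, you can sort the <loc> values yourself. The sketch below is one way to do it in Python, assuming a URL-prefix property such as https://example.com/; audit_locs and its report keys are hypothetical names, and the comparison is deliberately strict (scheme and host must match the property exactly).

```python
from urllib.parse import urlsplit


def audit_locs(locs, property_url):
    """Group <loc> values by problem: relative, wrong origin, duplicate, or ok."""
    prop = urlsplit(property_url)
    prop_origin = (prop.scheme, prop.netloc.lower())
    report = {"relative": [], "wrong_origin": [], "duplicate": [], "ok": []}
    seen = set()
    for loc in locs:
        parts = urlsplit(loc)
        if not parts.scheme or not parts.netloc:
            report["relative"].append(loc)      # e.g. /page
        elif (parts.scheme, parts.netloc.lower()) != prop_origin:
            report["wrong_origin"].append(loc)  # http vs https, www vs non-www
        elif loc in seen:
            report["duplicate"].append(loc)
        else:
            seen.add(loc)
            report["ok"].append(loc)
    return report


locs = [
    "https://example.com/a",
    "/b",
    "http://example.com/c",
    "https://www.example.com/d",
    "https://example.com/a",
]
report = audit_locs(locs, "https://example.com/")
print({key: len(value) for key, value in report.items()})
```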

Submitted vs Indexed: The Status That Isn't an Error

The single most common panic is seeing "Submitted: 240, Indexed: 180" and assuming something broke. It usually didn't.

A sitemap is a suggestion, not a command. It tells Google which URLs exist and, through <lastmod>, when they last changed; it does not force indexing. (Google has said it ignores the <changefreq> and <priority> values.) Google decides what's worth indexing based on quality, crawl budget, and duplication. A gap between submitted and indexed is normal and often healthy: it means Google is filtering thin, duplicate, or low-value pages, which is what you want.

Only investigate if the gap is large and growing, or if important pages are missing. Then use the URL Inspection tool on a specific missing page to see the actual reason: "Crawled - currently not indexed," "Duplicate without user-selected canonical," and "Excluded by 'noindex' tag" each point to a different fix, and none of them is a sitemap problem.

A Clean Sitemap Checklist

Before you submit or resubmit, confirm all of this:

  1. The URL returns raw XML with a 200 OK status.
  2. robots.txt doesn't block the sitemap path, and includes a Sitemap: line.
  3. The <?xml?> declaration is the first line — no BOM, no leading whitespace.
  4. The <urlset> namespace is present and spelled correctly.
  5. Every <loc> is an absolute URL on the exact property domain and protocol.
  6. Special characters in URLs are escaped.
  7. <lastmod> dates use W3C format.
  8. Under 50,000 URLs and 50 MB; if not, use a sitemap index.
  9. No duplicate URLs.

Items 3 through 9 are properties of the file itself, which is what the XML Sitemap Validator is for; items 1 and 2 have to be confirmed against the live server. For the background on how sitemaps work and when you actually need one, the complete sitemaps guide covers the fundamentals.

The Bottom Line

Most Search Console sitemap errors fall into three buckets: Google can't reach the file (fetch/robots/redirect), Google can't parse the file (XML formatting), or Google read it fine and you're misreading a normal submitted-vs-indexed gap. Diagnose which bucket you're in first — that alone solves half of them.

Validate the file before you resubmit, fix everything in one pass, and stop resubmitting blind. Run your sitemap through the XML Sitemap Validator, pair it with a clean robots.txt, and most of these statuses clear on the next crawl.
