XML Sitemaps: What They Are, Why You Need One, and How to Build It Right
Search engines are smart, but they're not psychic. If your site has pages that aren't linked well internally — new product pages, blog posts buried three clicks deep, landing pages with no nav links — Google might never find them. That's where XML sitemaps come in.
An XML sitemap is a file that lists every URL on your site you want search engines to know about. Think of it as a table of contents you hand directly to Google, Bing, and other crawlers.
Do You Actually Need a Sitemap?
Google's official stance: sitemaps are "helpful but not required." But here's when they go from helpful to essential:
- New sites with few backlinks. Googlebot discovers pages by following links. No external links pointing to your site? It has nothing to follow.
- Sites with 500+ pages. The bigger your site, the more likely some pages get missed during crawling.
- Pages that change frequently. Sitemaps include a <lastmod> tag that tells crawlers "hey, this page updated, come back."
- Orphan pages. Any URL not linked from your navigation or other pages is invisible to crawlers without a sitemap.
- Sites using JavaScript rendering. If your content loads via JS frameworks (React, Vue, Angular), crawlers sometimes struggle. A sitemap ensures they at least know the URLs exist.
If your site is under 50 pages and everything is linked from your nav — you can probably skip it. Everyone else should have one.
What Goes Inside an XML Sitemap
Here's what a basic sitemap looks like:
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://example.com/</loc>
        <lastmod>2026-03-09</lastmod>
        <changefreq>weekly</changefreq>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>https://example.com/about</loc>
        <lastmod>2026-02-15</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>
Each <url> entry has four possible fields:
| Field | Required? | What It Does |
|-------|-----------|-------------|
| <loc> | Yes | The full URL of the page |
| <lastmod> | No (but use it) | When the page was last meaningfully updated |
| <changefreq> | No | How often the page typically changes (daily, weekly, monthly) |
| <priority> | No | Relative importance within your site (0.0 to 1.0) |
Real talk on <changefreq> and <priority>: Google's sitemap documentation says it ignores both values. They're hints at best. <lastmod> is the one Google actually uses, and only when it is consistently and verifiably accurate, so keep it honest.
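If you'd rather script the file than write it by hand, here is a minimal Python sketch that emits only the two fields worth maintaining, <loc> and <lastmod>. The function name build_sitemap and the (url, lastmod) input shape are assumptions made for this example, not part of any sitemap tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, lastmod) pairs.

    lastmod is a YYYY-MM-DD string, or None to omit the tag.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # "&" is escaped on output
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    ET.indent(urlset)  # pretty-print; needs Python 3.9+
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


if __name__ == "__main__":
    print(build_sitemap([
        ("https://example.com/", "2026-03-09"),
        ("https://example.com/about", "2026-02-15"),
    ]))
```

Building the tree with ElementTree instead of string concatenation means special characters in URLs (an ampersand in a query string, for example) get escaped for you.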
The Mistakes That Waste Your Sitemap
1. Including noindex pages
If a page has a <meta name="robots" content="noindex"> tag, don't put it in your sitemap. You're telling Google "index this" and "don't index this" at the same time. Google will figure it out eventually, but you're sending mixed signals and wasting crawl budget.
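One way to catch this before it ships is to scan each page's HTML for a robots meta tag and drop the URL if it says noindex. The sketch below uses only Python's standard library; the names RobotsMetaParser and is_indexable are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def is_indexable(html):
    """Return False if the page's robots meta tag says noindex (or none)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow"
    return not ({"noindex", "none"} & parser.directives)
```

This only looks at the meta tag. A page can also be excluded with an X-Robots-Tag HTTP header, which this sketch does not check.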
2. Inaccurate <lastmod> dates
Some CMS setups bump <lastmod> every time the site rebuilds, even if the page content didn't change. Google uses <lastmod> only when it proves consistently accurate, so if your dates keep changing while the content doesn't, Google is likely to stop trusting them, which defeats the purpose.
Only update <lastmod> when the page content actually changes.
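If your build pipeline is the culprit, one possible fix is to fingerprint each page's content and move the date only when the fingerprint changes. This is a sketch under assumptions: update_lastmod and the shape of the state dictionary are hypothetical, and saving state between builds (as a JSON file, say) is left to you.

```python
import hashlib
from datetime import date


def update_lastmod(url, content, state, today=None):
    """Return the lastmod for url, bumping it only when content changed.

    state maps url -> {"hash": ..., "lastmod": ...} and is mutated in place.
    """
    today = today or date.today().isoformat()
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    entry = state.get(url)
    if entry is None or entry["hash"] != digest:
        # New page or changed content: record the new hash and date
        state[url] = {"hash": digest, "lastmod": today}
    return state[url]["lastmod"]


if __name__ == "__main__":
    state = {}
    print(update_lastmod("https://example.com/", "v1", state, today="2026-03-01"))
    print(update_lastmod("https://example.com/", "v1", state, today="2026-03-09"))
    print(update_lastmod("https://example.com/", "v2", state, today="2026-03-09"))
```

Hash the main content rather than the full rendered HTML. If the template embeds a build timestamp or a rotating widget, the hash changes on every build and you are back where you started.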
3. Listing non-canonical URLs
If /products/widget and /products/widget?ref=email both appear in your sitemap, you're splitting signals. Only include the canonical version of each URL. Same goes for www vs non-www and http vs https — pick one and stick with it.
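A rough way to enforce this is to normalize every URL before it goes into the sitemap and then dedupe. The policy below (https, no www, a short list of tracking parameters) is an assumption picked for illustration; swap in whatever your site actually treats as canonical, ideally whatever your rel="canonical" tags say.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed site policy for this sketch: https, no "www.", tracking params dropped.
TRACKING_PARAMS = {
    "ref", "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
}


def canonicalize(url):
    """Normalize scheme, host, and query string according to the policy above."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    query = urlencode([
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ])
    return urlunsplit(("https", host, parts.path or "/", query, ""))


def dedupe_canonical(urls):
    """Return canonical URLs in first-seen order, without duplicates."""
    return list(dict.fromkeys(canonicalize(u) for u in urls))


if __name__ == "__main__":
    print(dedupe_canonical([
        "https://example.com/products/widget",
        "http://www.example.com/products/widget?ref=email",
    ]))
```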
4. Exceeding size limits
A single sitemap file can hold a maximum of 50,000 URLs and must be under 50MB uncompressed. If you have more URLs than that, you need a sitemap index file that points to multiple sitemap files:
    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap>
        <loc>https://example.com/sitemap-pages.xml</loc>
        <lastmod>2026-03-09</lastmod>
      </sitemap>
      <sitemap>
        <loc>https://example.com/sitemap-blog.xml</loc>
        <lastmod>2026-03-08</lastmod>
      </sitemap>
    </sitemapindex>
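For a generated site, splitting the URL list and writing the index can be scripted. The sketch below chunks a list at the 50,000-URL limit and builds the index XML. The child file names (sitemap-1.xml, sitemap-2.xml, ...) and the function names are hypothetical, and the 50MB size limit is not checked here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit


def chunk(urls, size=MAX_URLS):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def plan_sitemaps(urls, base="https://example.com", size=MAX_URLS):
    """Map hypothetical child sitemap URLs to the URL chunk each should list."""
    return {
        f"{base}/sitemap-{i}.xml": part
        for i, part in enumerate(chunk(urls, size), start=1)
    }


def build_index(sitemap_urls, lastmod):
    """Return sitemap index XML pointing at each child sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
        ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    urls = [f"https://example.com/p/{i}" for i in range(120_001)]
    plan = plan_sitemaps(urls)
    print({name: len(part) for name, part in plan.items()})
    print(build_index(list(plan), "2026-03-09"))
```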
5. Forgetting to submit it
Creating the file isn't enough. You need to tell search engines where to find it:
- Google Search Console: Go to Sitemaps → Add your sitemap URL
- Bing Webmaster Tools: Same process under Sitemaps
- robots.txt: Add the line Sitemap: https://example.com/sitemap.xml at the bottom of your robots.txt file
That robots.txt line is the most important one. Every major crawler checks robots.txt first, so they'll discover your sitemap automatically.
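To confirm the robots.txt line is actually there, Python's standard library can parse the file and report the sitemaps it declares. A small sketch, assuming you already have the robots.txt text in hand (fetching it is left out); the function name sitemaps_in_robots is made up for this example.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the Sitemap URLs declared in a robots.txt body (possibly empty)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap line is present
    return parser.site_maps() or []


if __name__ == "__main__":
    robots = (
        "User-agent: *\n"
        "Disallow: /admin/\n"
        "\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    print(sitemaps_in_robots(robots))
```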
How to Build Your Sitemap
You've got a few options depending on your setup:
Static sites or manual control: Use our Sitemap Generator to paste in your URLs, set priorities and change frequencies, and download a ready-to-upload XML file. No signup, no install.
WordPress: Yoast SEO and Rank Math both generate sitemaps automatically. Check yoursite.com/sitemap_index.xml — it's probably already there.
Next.js / Astro / other frameworks: Most modern frameworks have sitemap plugins. Next.js has next-sitemap, Astro has @astrojs/sitemap. These generate the file at build time from your routes.
Large e-commerce sites: You'll likely need a sitemap index with separate files for products, categories, and content pages. Most platforms (Shopify, WooCommerce, Magento) handle this out of the box, but audit the output — auto-generated sitemaps often include filtered/sorted URLs that shouldn't be indexed.
After You Submit: What to Watch
Once your sitemap is live and submitted to Google Search Console, check the Sitemaps report after a few days. You'll see:
- Discovered URLs: How many URLs Google found in your sitemap
- Indexed URLs: How many actually made it into the index
- Errors: Any URLs that returned 404s, redirects, or other issues
A big gap between discovered and indexed usually points to content quality, not the sitemap. Google found the pages; it just didn't think they were worth indexing.
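You can catch the error cases before Search Console does by reading the URLs back out of your sitemap and checking what each one returns. The sketch below keeps the HTTP part out of scope: get_status is a hypothetical function you supply that returns the status code for a URL without following redirects.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locs(xml_text):
    """Return every <loc> value from a sitemap document."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [el.text.strip() for el in root.findall("sm:url/sm:loc", NS) if el.text]


def find_problem_urls(xml_text, get_status):
    """Return {url: status} for sitemap URLs whose status is not 200."""
    problems = {}
    for url in sitemap_locs(xml_text):
        status = get_status(url)
        if status != 200:
            problems[url] = status
    return problems


if __name__ == "__main__":
    sample = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/</loc></url>"
        "<url><loc>https://example.com/old</loc></url>"
        "</urlset>"
    )
    # Stand-in for real HTTP checks: a fixed lookup table
    fake_statuses = {"https://example.com/": 200, "https://example.com/old": 301}
    print(find_problem_urls(sample, fake_statuses.get))
```

Anything this flags (redirects, 404s) should be fixed or removed from the sitemap, since every listed URL is supposed to be a live page.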
Sitemap + Other Technical SEO
A sitemap works best alongside other technical SEO fundamentals:
- Robots.txt controls what crawlers can access. Your sitemap tells them which URLs you want crawled.
- Meta tags on each page tell Google how to display your pages in search results once they're indexed.
- Schema markup adds structured data that earns rich snippets — FAQ dropdowns, star ratings, product prices right in search results.
- Open Graph tags handle the social media side — how your pages look when shared on Facebook, Twitter, and LinkedIn.
The sitemap gets your pages found. Everything else determines how well they perform once they're in the index.
Quick Checklist
- [ ] Generate your sitemap with only canonical, indexable URLs
- [ ] Set accurate <lastmod> dates (not auto-updated timestamps)
- [ ] Keep each sitemap under 50,000 URLs / 50MB
- [ ] Add a Sitemap: directive to your robots.txt
- [ ] Submit to Google Search Console and Bing Webmaster Tools
- [ ] Audit the Sitemaps report monthly for errors
- [ ] Regenerate after adding or removing significant pages
Need to build one right now? The Sitemap Generator creates a valid XML sitemap in under a minute. Paste your URLs, configure your settings, copy the output.