Two Types of Sitemaps
The term "sitemap" refers to two different things, and understanding the distinction matters. One is built for people, the other for machines. Both serve important purposes.
HTML Sitemap: For Visitors
An HTML sitemap is a regular web page that lists all the pages on your website in a structured, easy-to-read format. Think of it like a table of contents for your entire site. Visitors can browse it to find pages they might not discover through your main navigation.
HTML sitemaps are especially useful for larger websites where not every page is accessible from the main menu. If your website has blog posts, service pages, location pages, and resource sections, an HTML sitemap ties everything together in one place.
XML Sitemap: For Search Engines
An XML sitemap is a machine-readable file that lists the pages you want search engines like Google and Bing to crawl. It is not designed for human visitors to read. Instead, it lives at a URL like yoursite.com/sitemap.xml and is submitted to search engines through tools like Google Search Console.
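Each entry in the XML sitemap gives a page's URL and can also state when the page was last updated (lastmod), how frequently it changes (changefreq), and its relative priority compared to other pages on the site. Only the URL is required. Google has said it ignores the changefreq and priority values, so an accurate lastmod is the most useful optional field. This information helps search engines crawl your site more efficiently and discover new or updated content faster.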
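As a rough illustration of the file format, the following Python sketch builds a small XML sitemap using only the standard library. The function name build_sitemap and the example.com URLs are placeholders for this example, not part of any particular tool. It writes only the URL and last-modified date for each page; the optional change-frequency and priority fields are left out to keep it short.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string.

    pages: iterable of (url, last_modified) pairs, where last_modified
    is a datetime.date. Hypothetical helper for illustration only.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the ISO date format, e.g. 2024-01-15
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    return xml_bytes.decode("utf-8")


if __name__ == "__main__":
    print(build_sitemap([
        ("https://example.com/", date(2024, 1, 15)),
        ("https://example.com/services/", date(2024, 2, 1)),
    ]))
```

In practice the list of pages would come from your content management system or site generator rather than being typed by hand.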
Why Your Business Website Needs a Sitemap
Better Search Engine Indexing
Search engines discover pages by following links. If a page on your site is not linked from anywhere, or is several clicks deep, search engines may never find it. An XML sitemap tells search engines that every listed page exists, regardless of your internal linking structure. It does not guarantee that a page will be indexed, but it removes discovery as the obstacle.
Improved User Experience
Visitors who cannot find what they need through your navigation have two choices: use the sitemap or leave. An HTML sitemap provides a clear overview of your entire site and serves as a fallback navigation tool. It is particularly valuable for visitors who think differently about site structure than your navigation menu assumes.
Faster Discovery of New Content
When you publish a new blog post or add a new service page, search engines will not know about it immediately. An updated XML sitemap signals to search engines that new content exists, which can shorten the time it takes for that content to appear in search results.
How to Structure an HTML Sitemap
- Organize by section. Group pages under clear headings that match your site's main sections: Services, About, Blog, Resources, Legal, and so on.
- Use a hierarchical layout. Show parent-child relationships between pages. If your Services section has individual service pages underneath it, nest them visually.
- Keep it current. The sitemap should be updated whenever you add or remove pages. Automated generation is ideal for sites that change frequently.
- Link from the footer. Place a link to your HTML sitemap in your website footer so it is accessible from every page.
- Make it scannable. Use clear page titles, brief descriptions where helpful, and clean formatting. The sitemap should be easy to scan quickly, not a wall of links.
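The hierarchical layout described above can be sketched as nested lists generated from a page tree. This is a minimal Python example under assumed inputs: the render_sitemap function and the (title, url, children) tuple shape are invented for illustration, and a real site would pull this data from its CMS or build system.

```python
from html import escape


def render_sitemap(pages):
    """Render nested <ul> markup for an HTML sitemap.

    pages: list of (title, url, children) tuples, where children is a
    list of the same shape. Hypothetical structure for this sketch.
    """
    if not pages:
        return ""
    items = []
    for title, url, children in pages:
        link = f'<a href="{escape(url, quote=True)}">{escape(title)}</a>'
        # Child pages are nested inside the parent's list item.
        items.append(f"<li>{link}{render_sitemap(children)}</li>")
    return "<ul>" + "".join(items) + "</ul>"


if __name__ == "__main__":
    site = [
        ("Services", "/services/", [
            ("Web Design", "/services/web-design/", []),
        ]),
        ("About", "/about/", []),
    ]
    print(render_sitemap(site))
```

Because the markup is generated from the page tree, regenerating it on every publish keeps the HTML sitemap current without manual edits.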
XML Sitemap Best Practices
- Include all important pages. Every page you want search engines to index should be in the sitemap. Exclude pages you do not want indexed, such as thank-you pages, internal-only pages, or duplicate content.
- Stay within the size limits. The sitemap protocol allows at most 50,000 URLs and 50 MB (uncompressed) per sitemap file. If your site exceeds either limit, use a sitemap index file that references multiple smaller sitemaps.
- Use accurate last-modified dates. The lastmod tag should reflect when the page content was actually changed, not just the date the sitemap was generated. Inflated dates can cause search engines to ignore the tag entirely.
- Submit to search engines. After creating your XML sitemap, submit it through Google Search Console and Bing Webmaster Tools. This tells the search engines where the sitemap lives, and both tools report any errors they find when reading it.
- Reference it in robots.txt. Add a line to your robots.txt file pointing to your sitemap: Sitemap: https://yoursite.com/sitemap.xml. This helps any crawler find it.
- Automate generation. Most modern content management systems and static site generators can create and update XML sitemaps automatically. Manual maintenance is error-prone and unnecessary for most sites.
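For sites that exceed the 50,000-URL limit, the splitting step can be sketched as follows. This Python example is only an illustration: the helper names and the sitemap-1.xml style file names are assumptions, and it does not check the separate file-size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into groups small enough for one sitemap file."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML that points at each child sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    xml_bytes = ET.tostring(index, encoding="utf-8", xml_declaration=True)
    return xml_bytes.decode("utf-8")


if __name__ == "__main__":
    # Placeholder URLs standing in for a large site.
    all_urls = [f"https://example.com/page/{n}" for n in range(120_000)]
    chunks = chunk_urls(all_urls)
    # Hypothetical naming scheme: one numbered file per chunk.
    child_files = [
        f"https://example.com/sitemap-{i}.xml"
        for i in range(1, len(chunks) + 1)
    ]
    print(build_sitemap_index(child_files))
```

The index file is what you would then submit and reference in robots.txt; each child file is an ordinary sitemap containing one chunk of URLs.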
Common Sitemap Mistakes
- Including pages that return errors. If a page is in your sitemap but returns a 404 or 500 error, search engines waste crawl requests on it and may treat the sitemap as a less reliable signal. Regularly audit your sitemap for broken URLs.
- Including non-canonical URLs. If the same content exists at multiple URLs (with and without www, with and without trailing slashes), only include the canonical version in your sitemap.
- Including pages blocked by robots.txt. If your robots.txt file blocks a page from being crawled, do not include it in your sitemap. This sends contradictory signals to search engines.
- Never updating the sitemap. A stale sitemap that does not reflect your current site structure can do more harm than good. It wastes crawl budget and directs search engines to pages that may no longer exist.
- Forgetting images and video. If your site relies heavily on visual content, consider image and video sitemaps that help search engines index your media content for image and video search results.
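Two of the mistakes above can be checked offline with a short script. The Python sketch below flags sitemap URLs that a robots.txt file blocks and groups URLs that differ only by a www prefix or trailing slash. The function names are invented for this example, and the duplicate check is a simple heuristic, not a full canonicalization. Checking for 404 or 500 responses would need live HTTP requests and is left out here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Extract every <loc> value from sitemap XML text."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


def blocked_by_robots(urls, robots_txt, user_agent="*"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


def duplicate_variants(urls):
    """Group URLs that differ only by 'www.' or a trailing slash."""
    groups = {}
    for url in urls:
        parts = urlsplit(url)
        host = parts.netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        path = parts.path.rstrip("/") or "/"
        groups.setdefault((host, path, parts.query), []).append(url)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/about</loc></url>"
        "<url><loc>https://www.example.com/about/</loc></url>"
        "<url><loc>https://example.com/thank-you/</loc></url>"
        "</urlset>"
    )
    robots = "User-agent: *\nDisallow: /thank-you/"
    found = sitemap_urls(sample)
    print("Blocked by robots.txt:", blocked_by_robots(found, robots))
    print("Possible duplicates:", duplicate_variants(found))
```

Any URL reported by either check is a candidate for removal from the sitemap, or for a fix to robots.txt or the canonical URL setup.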
When Sitemaps Matter Most
While every website benefits from sitemaps, they are especially important in certain situations:
- New websites. A brand-new site usually has few or no external links pointing to it. A sitemap helps search engines discover all your pages from day one.
- Large websites. Sites with hundreds or thousands of pages use sitemaps to help search engines find every page, including ones that are poorly linked.
- Sites with deep content. If some pages are many clicks away from the home page, they may not be crawled without a sitemap.
- Sites that change frequently. Blogs, news sites, and e-commerce stores with constantly changing inventory benefit from sitemaps that alert search engines to new and updated content.
Continue Learning
Sitemaps are part of your website's overall structure and SEO strategy. Explore these related guides:
- Home Page -- The top of your site hierarchy that the sitemap maps out.
- Error Pages -- Handle broken links that sitemaps can help you identify.
- Legal Page -- Pages that should appear in your sitemap but not necessarily your main navigation.
- Learning Center -- Browse all educational resources.