What Is a Sitemap?
A sitemap is a blueprint of your website that helps search engines find, crawl and index all of your website’s content. Sitemaps also tell search engines which pages on your site are most important.
There are four main types of sitemaps:
- Normal XML Sitemap: This is by far the most common type of sitemap. It’s usually an XML file that links to different pages on your website.
- Video Sitemap: Used specifically to help Google understand video content on your page.
- News Sitemap: Helps Google find content on sites that are approved for Google News.
- Image Sitemap: Helps Google find all of the images hosted on your site.
Why are Sitemaps Important?
Search engines like Google, Yahoo and Bing use your sitemap to find different pages on your site.
“If your site’s pages are properly linked, our web crawlers can usually discover most of your site.”
In other words: you probably don’t NEED a sitemap. But it definitely won’t hurt your SEO efforts. So it makes sense to use them.
There are also a few special cases where a sitemap really comes in handy.
For example, Google largely finds webpages through links. And if your site is brand new and only has a handful of external backlinks, then a sitemap is HUGE for helping Google find pages on your site.
Or maybe you run an ecommerce site with 5 million pages. Unless your internal linking is PERFECT and you have a ton of external links, Google’s going to have a tough time finding all of those pages. That’s where sitemaps come in.
With that, here’s how to set up a sitemap…and optimize it for SEO.
Create a Sitemap
Your first step is to create a sitemap.
If you use WordPress, you can get a sitemap made for you with the Yoast SEO plugin.
The main benefit of using Yoast to make your XML sitemap is that it updates automatically (dynamic sitemap).
So whenever you add a new page to your site (whether it’s a blog post or ecommerce product page), a link to that page will be added to your sitemap file automatically.
If you don’t use Yoast, there are lots of other plugins available for WordPress (like Google XML Sitemaps) that you can use to create a sitemap.
What if you don’t use WordPress?
No worries. You can use a third-party sitemap generator tool like XML-Sitemaps.com. These will spit out an XML file that you can use as your sitemap.
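If you’re curious what a generator actually produces, here’s a minimal sketch in Python that builds a bare-bones XML sitemap. The `build_sitemap` helper and the example.com URLs are made up for illustration; a real generator would crawl your site or read the page list from your CMS.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as bytes) for a list of (url, lastmod) pairs.

    lastmod is a date string such as "2024-01-15", or None to leave it out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages, for illustration only.
pages = [
    ("https://example.com/", "2024-01-15"),
    ("https://example.com/blog/first-post", None),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml.decode("utf-8"))
```

Each page becomes one `<url>` entry with a `<loc>`, plus an optional `<lastmod>`. That’s the core of what plugins and online generators write out for you.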
Either way, once your sitemap is created, I recommend manually taking a look at it.
(Your sitemap is usually found at site.com/sitemap.xml. But it depends on your CMS and what program you used to create your sitemap)
It should display all of the pages on your site.
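For a long sitemap, scanning raw XML by eye gets tedious. As a rough sketch, a few lines of Python can pull out just the URLs so you can skim them. The `list_sitemap_urls` helper and the sample XML are assumptions for illustration; in practice you’d feed it the contents of your own sitemap file.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(xml_bytes):
    """Return every <loc> value listed in a sitemap XML document."""
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


# Hypothetical sitemap contents; replace with the bytes of your own file.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

for url in list_sitemap_urls(sample):
    print(url)
```

If a page you care about is missing from the output (or a page you don’t want indexed shows up), that’s worth fixing before you submit.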
If everything looks good, it’s time to submit your sitemap to Google.
Submit Your Sitemap To Google
To submit your sitemap, log in to your Google Search Console account.
Then, go to “Index” → “Sitemaps” in the sidebar.
If you already submitted your sitemap, you’ll see a list of “Submitted Sitemaps” on this page.
Either way, to submit your sitemap, enter your sitemap’s URL into the field on this page and hit “Submit”.
And if everything is all set up, you’ll start to see information on your sitemap on this page under the “Submitted Sitemaps” section.
Use the Sitemap Report to Spot Errors
Once Google has crawled your sitemap, click on it under “Submitted Sitemaps”.
If you see “Sitemap index processed successfully”, then Google successfully crawled your sitemap.
You can also click on the little bar chart icon to go to the Coverage Report for your sitemap.
This report shows you how many URLs Google found in your sitemap… and how many of those pages ended up in Google’s index.
For example, you can see that my sitemap contains links to 116 webpages. 109 are “valid” and 6 are “Excluded”.
I can obviously ignore the valid pages.
But I do want to check out any “Excluded” pages to see what’s up.
It turns out that those 6 URLs in my sitemap are getting a “Duplicate, submitted URL not selected as canonical” message.
And when I look at the URLs, I see that these are pages that I don’t even want indexed in the first place.
So I should remove them from my sitemap.
Use Your Sitemap to Find Problems With Indexing
One of the cool things about using a sitemap is that it can give you a ballpark estimate of:
- How many pages you WANT indexed
- How many pages ARE indexed
For example, let’s say that your sitemap links to 5,000 pages.
But when you look at the Google Search Console, your site only has 2,000 pages indexed.
That’s a sign that something’s up. It could be that there’s a lot of duplicate content in those 5,000 pages. So Google isn’t indexing all of them.
Or it could be that the number of pages on your site exceeds your crawl budget.
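To put a number on that gap, here’s a tiny sketch that turns the two counts into a coverage percentage. The `index_coverage` helper is hypothetical; the 5,000 and 2,000 figures are the ones from the example above.

```python
def index_coverage(submitted, indexed):
    """Return the share of submitted sitemap URLs that ended up indexed."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of URLs")
    return indexed / submitted


coverage = index_coverage(submitted=5000, indexed=2000)
print(f"{coverage:.0%} of sitemap URLs are indexed")  # prints "40% of sitemap URLs are indexed"
```

There’s no official threshold here. The point is simply to track the ratio over time and dig in when it’s low or dropping.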
Match Your Sitemaps and Robots.txt
It’s important that your sitemaps and Robots.txt work together.
In other words:
If you block a page in robots.txt or use the “noindex” tag on a page, you DON’T want it to appear in your sitemap.
Otherwise, you’re sending mixed messages to Google.
Your sitemap says: “This page is important enough to make it into our sitemap”. But when Googlebot lands on the page, it gets blocked.
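One way to catch these mixed messages is to run your sitemap URLs through a robots.txt parser. The sketch below uses Python’s built-in `urllib.robotparser`. The `blocked_sitemap_urls` helper, the sample rules and the URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_sitemap_urls(robots_txt, sitemap_urls, user_agent="Googlebot"):
    """Return the sitemap URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt rules and sitemap URLs.
robots_txt = """
User-agent: *
Disallow: /cart/
"""
urls = [
    "https://example.com/",
    "https://example.com/cart/checkout",
]
print(blocked_sitemap_urls(robots_txt, urls))  # ['https://example.com/cart/checkout']
```

Anything this returns should either come out of your sitemap or be unblocked in robots.txt. Two caveats: Python’s parser may not interpret every rule exactly the way Googlebot does, and this check doesn’t look at “noindex” tags, which live on the pages themselves.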
Sitemap Pro Tips
Huge Site? Break Things Up Into Smaller Sitemaps: Sitemaps have a limit of 50k URLs. So if you run a site with a ton of pages, Google recommends breaking up your sitemap into several smaller sitemaps.
Be Careful With Dates: URLs in your sitemap have a “last modified” date associated with them.
I recommend changing these dates ONLY when you make significant changes to your site (or add new content to your site). Otherwise, Google warns that updating dates on pages that haven’t changed can be seen as a spammy tactic.
Don’t Sweat Video Sitemaps: Video Schema has largely replaced the need for video sitemaps. A video sitemap definitely won’t hurt your page’s ability to get a video rich snippet. But it’s usually not worth the hassle.
Stay Under 50MB: Google and Bing both allow sitemaps that are up to 50MB uncompressed. So as long as each sitemap file is under 50MB, you’re good.
HTML Sitemaps: This is basically the equivalent of an XML sitemap… but for users.
You don’t necessarily need these as Google and other search engines now rely on your XML sitemap. But if you think they’re useful for human visitors, an HTML sitemap probably isn’t going to hurt your SEO efforts.
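To make the “Huge Site?” and “Stay Under 50MB” tips concrete, here’s a rough sketch that splits a big URL list into sitemaps of at most 50,000 URLs each, checks each file against the 50MB limit, and builds a sitemap index that points to them. The helper names, file names and example.com URLs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000               # URL limit per sitemap file
MAX_BYTES = 50 * 1024 * 1024    # 50MB limit, uncompressed


def chunk_urls(urls, size=MAX_URLS):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls):
    """Build one sitemap file (bytes) and check it against the size limit."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    data = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    if len(data) > MAX_BYTES:
        raise ValueError("sitemap is over 50MB; use a smaller chunk size")
    return data


def build_index(sitemap_locations):
    """Build a sitemap index (bytes) that points to each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for location in sitemap_locations:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = location
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical site with 120,000 product pages.
urls = [f"https://example.com/product/{i}" for i in range(120_000)]
files = {
    f"sitemap-{n}.xml": build_urlset(chunk)
    for n, chunk in enumerate(chunk_urls(urls), start=1)
}
index_xml = build_index(f"https://example.com/{name}" for name in files)
print(len(files))  # 3
```

With this setup you’d submit the index file to Google, and it leads to the smaller sitemaps from there.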
Tips for Optimizing Sitemaps
- Use XML Files to Structure Internal Links and External URLs
The XML file is a list of URLs that directs crawling bots to the content, and the pathways, on a website. Consequently, the links you list in your sitemap tell web crawlers what’s considered important on the website and help reduce the occurrence of orphan pages. Such clarity boosts overall SEO health, which bodes well for rankings!
XML sitemaps don’t guarantee that your web pages get indexed, but they do improve the chances.
- Keeping the Root Directory Clean and Organized
The root directory stores other folders and files on a domain, i.e., it’s the central location for all files and directories forming a website. All web requests start at the root directory.
Hypothetically, placing your sitemap outside the root directory is harmless, but it goes against the established protocol: a sitemap’s location determines which URLs it can include. That said, I suspect search engines don’t care much when sitemap.xml is not located in the root directory.
Avoid clogging your root directory with multiple files, as this affects the responsiveness of your website.
- Include ALL Web Pages in the Sitemaps Page URL
As mentioned, sitemaps act as a pathway for Google bots, taking them to all web pages on the site, even when the internal linking isn’t great. Including all webpages in the sitemap file enhances communication between the website and the search engines.
Tools to Easily Create a Sitemap
If you need to generate a sitemap faster, here’s a summary of the best and most convenient tools to consider:
- Google Search Console Tools
- Bing Webmaster Tools
- Paid online tools such as Yoast
- Pulling sitemaps from websites you don’t own.
10 Things to Exclude from Your Sitemaps
As a best practice, aim to include only the SEO-relevant pages in the sitemap. It’s a recommended method of effectively utilizing the crawl budget.
With this approach, search engines crawl your website more intelligently, and you reap the rewards of better indexation.
Aim to exclude:
- Duplicate pages
- Paginated pages
- Non-canonical pages
- Archive pages
- Redirected pages (3xx), Missing pages (4xx) and Error pages (5xx)
- Comment URLs
- No-index pages
- Resource pages that are useful to site visitors but don’t serve as landing pages
- Site search result pages
- “Share via email” pages
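Several of these exclusions can be checked mechanically. As a rough sketch, the filter below keeps only pages that return a 200 status, aren’t marked noindex, and are their own canonical URL. The `Page` record and its fields are hypothetical; you’d fill them from a crawl of your own site.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    status: int       # HTTP status code returned for the URL
    canonical: str    # canonical URL the page declares
    noindex: bool     # True if the page carries a noindex tag


def sitemap_worthy(page):
    """Keep live, indexable pages that are their own canonical URL."""
    return page.status == 200 and not page.noindex and page.canonical == page.url


# Hypothetical crawl results, for illustration only.
pages = [
    Page("https://example.com/", 200, "https://example.com/", False),
    Page("https://example.com/old", 301, "https://example.com/", False),
    Page("https://example.com/blog?page=2", 200, "https://example.com/blog", False),
    Page("https://example.com/thanks", 200, "https://example.com/thanks", True),
]
keep = [page.url for page in pages if sitemap_worthy(page)]
print(keep)  # ['https://example.com/']
```

This covers redirected, missing, error, non-canonical and noindex pages. The other items on the list (archives, comment URLs, site search results and so on) usually come down to URL patterns that are specific to your site.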
How do I find the root directory in WordPress?
For WordPress sites, the /html folder typically serves as the root directory for your files. To access the root directory, you can use SSH, SFTP, or the File Manager.
Does a sitemap affect SEO?
Yes. Sitemaps list all the priority pages on a website to guide search engines on crawling and indexability. This can improve a website’s visibility in search, helping it reach more internet users and thus complementing your SEO efforts.
Build and submit a sitemap: A guide from Google on creating sitemaps… and getting them submitted to Google.
Using Sitemaps to help Google find content hosted on your site: Quick video from the Google Webmaster YouTube channel on how sitemaps can help your site appear higher and more often in the search results.