What is a Sitemap?
In short (aka ELI5): It’s an XML-formatted document that lists all the accessible URLs on your website – at least those you care to have crawled – plus some metadata that helps search engines crawl them more effectively.
A Sitemap is an XML file that lists the URLs for a site. It allows webmasters to include additional information about each URL: when it was last updated, how often it changes, and how important it is in relation to other URLs in the site. This allows search engines to crawl the site more intelligently.
Sitemaps are a convenient way for you to tell search engines about all the URLs on your website, including URLs for pages that are difficult to find by following links.
A Sitemap index file is a file that points to other Sitemap files. You can use a Sitemap index file to submit multiple Sitemaps to Search Console (for different language versions of the site, for example).
Simple Sitemap Example:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.yoursite.com/</loc>
    <lastmod>2023-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
If you have a large site, or one with a lot of rich media content, you can break your Sitemap into multiple Sitemaps. You can then list each Sitemap in a Sitemap index file (a file that points to a list of Sitemaps) and submit the Sitemap index file to Google instead of each individual Sitemap. This way, you only need to submit one file to Google.
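As a rough illustration of splitting a large site into several Sitemaps plus one index file, here is a minimal Python sketch. The example.com URLs, the sitemap-N.xml file names, and the helper names `chunk` and `build_sitemap_index` are assumptions made up for this example. The 50,000-URL cap per Sitemap comes from the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit set by the Sitemap protocol


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a Sitemap index document (as a string) listing each Sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical data: 120,000 page URLs on an example domain.
pages = [f"https://www.example.com/page-{n}" for n in range(120_000)]
groups = chunk(pages)

# One hypothetical Sitemap file name per group of URLs.
index_xml = build_sitemap_index(
    f"https://www.example.com/sitemap-{i + 1}.xml" for i in range(len(groups))
)
print(len(groups), "Sitemaps listed in the index")
```

Each group would then be written to its own Sitemap file, and only the index file needs to be submitted.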
The following best practices will help you create and submit a Sitemap that optimizes your site’s crawl coverage:
Create a Sitemap for each subdomain, and submit each Sitemap separately. For example, if your site has the following subdomains:
Then you should create a Sitemap for each subdomain and submit all three Sitemaps. This helps Google to more intelligently crawl your site by understanding the structure of your subdomains.
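If your URL list comes from a single export, one way to prepare a Sitemap per subdomain is to group the URLs by hostname first. The sketch below assumes three made-up hosts under example.com; the function name `group_by_host` is likewise just an illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_host(urls):
    """Group URLs by hostname so each subdomain can get its own Sitemap."""
    groups = defaultdict(list)
    for url in urls:
        groups[urlparse(url).hostname].append(url)
    return dict(groups)


# Hypothetical URLs spread over three subdomains.
urls = [
    "https://www.example.com/",
    "https://blog.example.com/first-post",
    "https://shop.example.com/products",
    "https://blog.example.com/second-post",
]
by_host = group_by_host(urls)
for host, host_urls in by_host.items():
    print(host, len(host_urls))
```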
List only the URLs that you want Google to crawl. It’s a good idea to test your Sitemap with a few different browsers to make sure that it is well-formed and that the URLs listed in it are accessible.
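Besides opening the file in a browser, a short script can check that a Sitemap is well-formed and that every listed URL is absolute. This is a minimal sketch using only the Python standard library; the sample document and the function name `check_sitemap` are assumptions for illustration. It does not request the URLs, so accessibility still has to be checked separately.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def check_sitemap(xml_bytes):
    """Return (urls, problems) for a Sitemap given as bytes."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [], [f"not well-formed: {exc}"]
    urls, problems = [], []
    for loc in root.findall("sm:url/sm:loc", NS):
        url = (loc.text or "").strip()
        parts = urlparse(url)
        if parts.scheme in ("http", "https") and parts.netloc:
            urls.append(url)
        else:
            problems.append(f"not an absolute http(s) URL: {url!r}")
    return urls, problems


# Hypothetical Sitemap with one valid entry and one relative (invalid) entry.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>/relative/path</loc></url>
</urlset>"""

urls, problems = check_sitemap(sample)
print(urls)
print(problems)
```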
If you want Google to ignore certain URLs, such as internal search result pages, leave them out of your Sitemap and block them with a robots.txt file on your site.
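To see how a robots.txt rule affects specific URLs before you publish it, you can evaluate it locally. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt content, the /search path, and the example.com URLs are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block internal search result pages
# and advertise the Sitemap location.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, agent="Googlebot"):
    """Return True if the rules above allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


print(is_crawlable("https://www.example.com/about"))
print(is_crawlable("https://www.example.com/search?q=shoes"))
print(parser.site_maps())  # Sitemap URLs declared in robots.txt (Python 3.8+)
```

Any URL that this check reports as blocked should not appear in your Sitemap, since listing it would send crawlers a mixed signal.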
Include the <lastmod> tag in your Sitemap files. The <lastmod> tag tells Google when a page was last modified, which helps it decide when to recrawl the page. Without a <lastmod> date, Google has to rely on other signals, such as links to the page, to notice changes. If you update your pages frequently, an accurate <lastmod> date saves Google's crawlers time. The value must be in W3C Datetime format (a profile of ISO 8601), for example:
- YYYY-MM-DD
- YYYY-MM-DDThh:mm:ss+hh:mm (or -hh:mm for time zones behind UTC)
You can also include the <changefreq> and <priority> tags for each URL:
- <changefreq>: How frequently the page is likely to change. This value provides general information to search engines and may not correlate exactly to how often they crawl the page; Google has said that it ignores this value. Valid values are: always, hourly, daily, weekly, monthly, yearly, never.
- <priority>: The priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0. This value does not affect how your pages are compared to pages on other sites. It only tells search engines which pages you deem most important for the crawler to visit, and Google has said that it ignores this value as well.
A Sitemap index file must:
- Be an XML file.
- Be UTF-8 encoded.
- Contain one <sitemap> element for each Sitemap.
Each <sitemap> element can contain the following child elements:
- <loc>: The URL of the Sitemap. This element is required.
- <lastmod>: The date of the last modification of the Sitemap. This element is optional. The date must be in W3C Datetime format, which allows you to omit the time portion if desired.
The <changefreq> and <priority> tags belong to <url> entries in a regular Sitemap, not to <sitemap> entries in an index file.
The <sitemapindex> element is the root element of a Sitemap index file, and must contain one or more <sitemap> elements.
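The two <lastmod> formats described above map directly onto Python's `isoformat()` output. Here is a minimal sketch; the helper names `lastmod_date` and `lastmod_datetime` are assumptions for this example. The second helper rejects naive datetimes because the full format needs a time zone offset.

```python
from datetime import date, datetime, timedelta, timezone


def lastmod_date(d):
    """Format a date as YYYY-MM-DD."""
    return d.isoformat()


def lastmod_datetime(dt):
    """Format an aware datetime as YYYY-MM-DDThh:mm:ss+hh:mm."""
    if dt.tzinfo is None:
        raise ValueError("lastmod needs a time zone offset; pass an aware datetime")
    # Drop microseconds to keep the value short and uniform.
    return dt.replace(microsecond=0).isoformat()


print(lastmod_date(date(2023, 1, 1)))
print(lastmod_datetime(datetime(2023, 1, 1, 12, 30, tzinfo=timezone.utc)))
print(lastmod_datetime(
    datetime(2023, 1, 1, 12, 30, tzinfo=timezone(timedelta(hours=-5)))
))
```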
Sitemaps are particularly helpful if:
- Your site has dynamic content that can’t be discovered by Google’s normal crawling process.
- Your site has pages that aren’t easily discovered by Google’s crawler, such as pages behind a login or restricted by country.
- Your site is new and has few external links pointing to it.
Sitemaps supplement the existing crawl-based discovery mechanisms and can be used to provide Google and other search engines with information about specific types of content on your site, such as:
- New content (such as job postings, news articles, or blog posts).
- Changes to content that happen frequently or sporadically (such as events or product launches).
How to Make a Sitemap for Your Website?
If you decide to use a plugin, there are a few options available. One popular option is the XML Sitemaps plugin. This plugin will automatically generate a sitemap for your website and keep it up to date as you add new content.
Another option is the XML Sitemap Generator for Google plugin, which also creates a sitemap for you. Once the sitemap exists, you can submit it to search engines like Google so that they can discover all the pages on your site.
If you don’t want to use a plugin (good choice!), you can also manually create a sitemap.xml file. This file lists the URLs, plus optional metadata, that you want search engines to crawl. For this, you can use this website.
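If you would rather script the file than write it by hand, the standard library is enough. The following is a minimal sketch, not a full generator: the example.com URLs, dates, and the function name `build_sitemap` are assumptions. A side benefit of building the XML with ElementTree is that characters such as & in URLs are escaped for you.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a Sitemap string from (url, lastmod) pairs; lastmod may be None."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(root, "url")
        ET.SubElement(entry, "loc").text = url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    ET.indent(root)  # pretty-print; requires Python 3.9+
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages; replace with your own URLs and modification dates.
pages = [
    ("https://www.example.com/", "2023-01-01"),
    ("https://www.example.com/shop?cat=shoes&page=2", None),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)  # save this output as sitemap.xml in your site root
```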
Once you have created your sitemap, be sure to submit it to all the major search engines so that they can index your site.