What Is An XML Sitemap And Why Do You Need One?
A great XML sitemap acts as a map for your website, leading Google to all of its important pages. Plenty of people have answered sitemap questions in forums, given advice on blogs, and amplified remarks on social media, but separating helpful information from misinformation takes patience. This article explains what sitemaps are and how they can help your website perform better in search results. So, let’s clear the air about sitemap best practices right now.
What exactly is an XML Sitemap?
An XML sitemap is a text file that contains a list of pages on your website that you want Google and other search engines to index. You can consider them digital maps that assist search engines in locating important content on your website. They also tell search engines which pages you think are important, how frequently you change them, and when they were last updated.
All of this makes it easier for search engines to crawl your site. Whenever a new page is added or an old one is removed, the sitemap passes that change directly to the search engines, which speeds up the indexing of your pages.
How to decide if you need a sitemap or not?
As per Google’s documentation, sitemaps are especially useful for really large websites. What exactly counts as “really large”? Google describes a site of about 500 pages or fewer as small, so a website with more than 500 pages can be regarded as large. For a small and simple website, a sitemap is not strictly required, and including one comes down to personal preference.
If you believe you have a really good and clear link structure, and you can set up some custom segments in Google Analytics to track different portions of your site, going without a sitemap can work for you. If any pages are not indexed, you will quickly know about it, can find out why it is happening, and can take action immediately.
You can use a sitemap to show Google exactly where to discover the key and essential material on your site. You can also tell Google which pages have been recently changed so that it can focus on new information on your website. Furthermore, if your website is relatively new and there aren’t many links leading to it, a sitemap will be quite helpful in directing Google to it.
Additionally, this guide to White Hat SEO should give you enough knowledge on the ethical way to go about things.
Which pages should your sitemap include?
Many people make the mistake of adding every indexable page to the sitemap. Being indexable isn’t the only requirement: your pages should also have high-quality content that is useful to users. You can include several media types in your sitemaps, such as videos and photos.
In an ideal world, for example, you would have a separate sitemap for your photos. It’s also a terrific way to tell Google where to find your photographs.
Sitemaps in Google Search Console
If you want Google to find your sitemap faster, submit it in your Google Search Console account. In the ‘Sitemaps’ section you can immediately see whether your XML sitemap has already been added. If not, you can add it at the top of the page. Once it is added, you can check whether Google has indexed all of the pages in your sitemap.
If there is a significant difference between the ‘submitted’ and ‘indexed’ data on a specific sitemap, it is advised that you investigate further. A technical error may be preventing specific pages from being indexed. Another possibility is that the unindexed pages need more links pointing to them.
Although search engines can reach your URLs without it, putting pages in an XML sitemap indicates that you consider them to be high-quality pages. Furthermore, while there is no assurance that a sitemap will get your pages crawled, let alone indexed or ranked, submitting one improves your chances considerably.
What does an XML Sitemap look like?
Since an XML Sitemap is intended for search engines, it is formatted in a computer-friendly language: XML.
Let’s check out the individual parts to take into consideration.
<?xml version="1.0" encoding="UTF-8"?>
This header indicates that the contents are structured in accordance with the XML standard’s version 1.0 and describes the character encoding. It essentially tells search engines what to expect from the file.
Definition of the URL set
The URL set definition (the opening <urlset> tag) wraps all of the URLs in the sitemap and, through its xmlns attribute, specifies which version of the XML Sitemap standard is used. It’s worth noting that the URL set is closed at the bottom of the document with a matching </urlset> tag.
Definition of the individual URLs
Every URL definition must include the loc tag (short for location). Its value should be the page’s full URL, including the protocol (for example, “http://”). The remaining tags are optional:
- lastmod: the date on which the URL’s content was last modified, formatted as a W3C datetime (for example, 2024-01-15).
- priority: the priority of the URL relative to the other URLs on your website, on a scale of 0.0 to 1.0.
- changefreq: how frequently the URL’s content is likely to change. The allowed values are always, hourly, daily, weekly, monthly, yearly, and never.
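To make the parts described above concrete, here is a minimal sketch in Python that assembles a sitemap with the standard library. The function name build_sitemap, the page dictionaries, and the example.com URLs are illustrative assumptions, not part of any official tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as bytes for a list of page dicts.

    Each dict must contain "loc"; "lastmod", "changefreq" and
    "priority" are optional and are only written when present.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # xml_declaration=True adds the <?xml ...?> header line.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical pages, used only for illustration.
    example_pages = [
        {"loc": "https://www.example.com/", "lastmod": "2024-01-15",
         "changefreq": "weekly", "priority": 0.8},
        {"loc": "https://www.example.com/about"},
    ]
    print(build_sitemap(example_pages).decode("utf-8"))
```

The output starts with the XML header, followed by one urlset element that contains a url entry per page.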
Types of Sitemaps
Sitemaps come in a variety of types. Let’s check out the ones you need.
XML Sitemap Index
The first thing you should realize is that XML-based sitemaps have a few limitations:
- A maximum of 50,000 URLs are permitted.
- An uncompressed file can only be 50 MB in size.
You can compress sitemaps with gzip (the file name will become something like sitemap.xml.gz) to save bandwidth on your server, but the file still cannot exceed the 50 MB or 50,000 URL limit once unzipped. If you go above either limit, you’ll have to split your URLs among several sitemaps.
The resulting sitemaps can then be combined in a single XML sitemap index file (something like sitemap-index.xml), which works like a sitemap for sitemaps. For exceptionally large websites, you can create multiple XML sitemap index files.
Another point to remember is that you can’t nest sitemap index files.
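The splitting and indexing described above could look roughly like this in Python. It is a sketch under stated assumptions: split_urls and build_sitemap_index are hypothetical helper names, the example.com file names are made up, and only the 50,000 URL limit is enforced (the 50 MB size limit would need a separate check).

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit mentioned above


def split_urls(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks of at most `limit` items."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML (bytes) that lists the given sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # 120,001 hypothetical URLs need three sitemap files.
    all_urls = [f"https://www.example.com/page-{n}" for n in range(120_001)]
    chunks = split_urls(all_urls)
    files = [f"https://www.example.com/sitemap-{i + 1}.xml"
             for i in range(len(chunks))]
    print(len(chunks), "sitemap files")
    print(build_sitemap_index(files).decode("utf-8"))
```

Each chunk would then be written out as its own sitemap file, and only the index needs to be submitted.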
To make it easier for search engines to find all of your sitemap files at once, you should:
- Upload your sitemap index(es) to Google Search Console and Bing Webmaster Tools.
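- Include your sitemap index URL(s) in your robots.txt file, which directs search engines to your sitemap as you invite them to crawl.
Google previously also accepted sitemap submissions through a ping URL, but it has since retired that endpoint, so rely on Search Console and robots.txt instead.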
XML Image Sitemap
XML image sitemaps were created to help with the indexing of image content. However, images are now usually embedded within page content and are crawled along with the page URL. In addition, JSON-LD schema.org/ImageObject markup is regarded as the best technique for calling out image properties to search engines because it offers more attributes than an XML image sitemap.
For these reasons, most websites do not require an XML image sitemap; adding one would only squander your crawl budget. The exception is when images drive your business, as they do for a stock photo website. Images in a sitemap do not have to be on the same domain as your website: you can use a CDN as long as it is verified in Search Console.
XML Video Sitemap
As with image sitemaps, submitting an XML video sitemap is pointless unless videos are central to your business. Save your crawl budget for the page where the video is embedded, and make sure that every video is marked up with JSON-LD as a schema.org/VideoObject.
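As an illustration of the JSON-LD markup mentioned above, the following Python sketch renders a schema.org/VideoObject block for a page template. The helper name video_jsonld, the chosen properties, and the example values are assumptions; check the search engine’s structured data documentation for the properties it actually expects.

```python
import json


def video_jsonld(name, description, thumbnail_url, upload_date, content_url):
    """Return a <script> tag holding VideoObject JSON-LD for one video."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "thumbnailUrl": thumbnail_url,
        "uploadDate": upload_date,
        "contentUrl": content_url,
    }
    # Escape "</" so a value can never close the script tag early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


if __name__ == "__main__":
    # Hypothetical video details, used only for illustration.
    print(video_jsonld(
        name="How to build a sitemap",
        description="A short walkthrough of XML sitemaps.",
        thumbnail_url="https://www.example.com/thumbs/sitemap.jpg",
        upload_date="2024-01-15",
        content_url="https://www.example.com/videos/sitemap.mp4",
    ))
```

The returned string can be placed in the HTML of the page that embeds the video.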
Dynamic XML Sitemap
Static sitemaps are edited manually: if a page is added to or removed from the site, the sitemap must be edited by hand as well. This is a good option if your site has static content, that is, content that does not change frequently. However, with static sitemaps, things can quickly spiral out of control.
For example, if you forget to update your sitemap when you update your website, you may end up with a slew of URLs that are useful but aren’t listed in your sitemap. Static sitemaps are easy to create with a tool like Screaming Frog. The issue is that your sitemap becomes out of date as soon as you add or remove a page.
Likewise, if you change a page’s content, a static sitemap will not update its lastmod tag automatically. Avoid static sitemaps if you don’t want to create and upload a new sitemap by hand after every change. In contrast, a dynamic XML sitemap updates automatically when you add or remove a page from your site.
This is the better solution if your site has dynamic content, such as a news website or an e-commerce site that constantly adds new products. Many plugins are available to help you create a dynamic sitemap; they keep it up to date, and you don’t need to worry about the technical details.
To create a dynamic XML sitemap, choose one of these options:
- Ask your developer to write a custom script, making sure to provide detailed instructions.
- Make use of a dynamic XML sitemap generator.
- Install a CMS plugin, such as the Yoast SEO plugin for WordPress.
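For the custom-script option, a dynamic sitemap can be as small as a handler that rebuilds the XML on every request. The sketch below is one possible shape, written as a plain WSGI app in Python. The PAGES dictionary is a hypothetical stand-in for a CMS or database, and the names render_sitemap and app are illustrative.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical stand-in for a CMS or database: URL -> last modified date.
PAGES = {
    "https://www.example.com/": "2024-03-01",
    "https://www.example.com/products/blue-widget": "2024-03-04",
}


def render_sitemap(pages):
    """Build sitemap XML (bytes) from the current contents of `pages`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in sorted(pages.items()):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def app(environ, start_response):
    """WSGI app that regenerates /sitemap.xml from PAGES on each request."""
    if environ.get("PATH_INFO") != "/sitemap.xml":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    body = render_sitemap(PAGES)
    start_response("200 OK", [
        ("Content-Type", "application/xml"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


if __name__ == "__main__":
    # Local try-out only; a real site would sit behind a production server.
    from wsgiref.simple_server import make_server
    make_server("", 8000, app).serve_forever()
```

Because the XML is generated from the page store at request time, adding or removing an entry in PAGES changes the sitemap immediately, with no manual upload. A real site would probably cache the result rather than rebuild it on every hit.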
HTML Sitemap
An HTML sitemap is a page that links to all of your website’s significant pages. Whereas XML-based sitemaps serve the needs of search engines, HTML sitemaps were created to help human users discover content.
The question therefore becomes: do you need an HTML sitemap if you have a decent user experience and well-crafted internal links? Look at the page visits for your HTML sitemap in Google Analytics. Chances are, it’s quite a small number. If not, it’s a solid sign that your website navigation needs to be improved.
HTML sitemaps are commonly linked from website footers, where they receive link equity from every page of your website. So if only a few people use it, and search engines do not need it because you have strong internal linking and an XML sitemap, is there a purpose for that HTML sitemap? Most experts would say not.
Google News Sitemap
A Google News sitemap sends Google the metadata associated with specific news articles on a website. The owner of the website will be able to control which specific content is sent to Google News by using this sitemap. This sitemap should only be used by sites that are registered with Google News.
Include articles published in the last two days, up to a limit of 1,000 URLs per sitemap, and update with new articles as they become available. Additionally, contrary to popular belief, Google News sitemaps do not accept image URLs.
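The two-day window and the 1,000 URL cap described above can be applied with a small filter before the news sitemap is generated. This Python sketch assumes each article is a (url, published_at) pair with a timezone-aware timestamp; select_news_articles is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

MAX_NEWS_URLS = 1000          # per-sitemap cap mentioned above
MAX_AGE = timedelta(days=2)   # only articles from the last two days


def select_news_articles(articles, now=None):
    """Return the newest eligible (url, published_at) pairs.

    `articles` is a list of (url, timezone-aware datetime) tuples.
    Articles older than two days or dated in the future are dropped,
    and the result is capped at MAX_NEWS_URLS, newest first.
    """
    now = now or datetime.now(timezone.utc)
    recent = [
        (url, published_at)
        for url, published_at in articles
        if timedelta(0) <= now - published_at <= MAX_AGE
    ]
    recent.sort(key=lambda item: item[1], reverse=True)
    return recent[:MAX_NEWS_URLS]


if __name__ == "__main__":
    # Hypothetical articles, used only for illustration.
    now = datetime.now(timezone.utc)
    sample = [
        ("https://www.example.com/news/fresh", now - timedelta(hours=5)),
        ("https://www.example.com/news/old", now - timedelta(days=3)),
    ]
    for url, published_at in select_news_articles(sample, now=now):
        print(url, published_at.isoformat())
```

Running the filter each time the sitemap is regenerated keeps older articles from lingering in the file.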
Mobile Sitemap
A mobile sitemap is not required for responsive websites that display or load content based on browser capabilities. Why? Because mobile sitemaps are only for feature phone pages, not for smartphone compatibility. So, unless you have unique URLs designed specifically for feature phones, a mobile sitemap will be useless.
To sum up:
- Using a sitemap index and a dynamic XML sitemap is the best way to go about it today, whereas HTML and mobile sitemaps are largely ineffective.
- Google News, image, and video sitemaps are only useful if these content types drive your KPIs; only then will improved indexation of them be beneficial.
Where to place your Sitemap
An XML Sitemap, like your website’s pages, has its own URL. The URL for an XML Sitemap is normally /sitemap.xml, and it is advisable to follow this pattern so that search engines can easily find it. Google recommends placing your sitemap in the root directory so that it can cover all files on the site. Don’t forget to include a reference to it in your robots.txt file.
How to create a sitemap and let Google know about it?
There are a few ways to do this. If you use a CMS, it may well generate sitemaps for you. You can also use one of the many third-party tools available to create a sitemap, such as Screaming Frog or xml-sitemaps.com.
You can also create your sitemap manually, which allows you to customize it. However, I’d recommend this only for very small websites, because it would be very time-consuming and difficult to manage on a large website with thousands of URLs.
Here is an SEO checklist to help you through your SEO strategy.
Sitemap Best Practices
- Keep your sitemap up to date.
Confirm that your sitemap gives an accurate representation of your website. When a page is removed from your website, it should also be removed from your sitemap. If you use the optional lastmod tag, remember to update the timestamp whenever the page changes.
- A sitemap should only contain indexable pages.
Only indexable pages should be described in your sitemaps. This means you should exclude any URLs pointing to redirects (e.g., 301 status code) and missing pages (e.g. 404 status code).
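One way to apply this rule is to filter crawl results before they go into the sitemap. The Python sketch below assumes each result is a dict with a "url", an HTTP "status", and an optional "noindex" flag; that record shape and the noindex check are assumptions added for illustration.

```python
def indexable_urls(crawl_results):
    """Keep only URLs that returned 200 and are not marked noindex.

    Redirects (such as 301) and missing pages (such as 404) are dropped.
    """
    return [
        result["url"]
        for result in crawl_results
        if result["status"] == 200 and not result.get("noindex", False)
    ]


if __name__ == "__main__":
    # Hypothetical crawl output, used only for illustration.
    results = [
        {"url": "https://www.example.com/", "status": 200},
        {"url": "https://www.example.com/old-page", "status": 301},
        {"url": "https://www.example.com/missing", "status": 404},
        {"url": "https://www.example.com/thank-you", "status": 200, "noindex": True},
    ]
    print(indexable_urls(results))
```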
- Reference your XML Sitemaps in your robots.txt file.
Reference your sitemap or sitemap index in your robots.txt file if its URL deviates from the convention. Even if you use the standard URL, it’s a good idea to include the reference so that search engines can reliably locate it.
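To check that a robots.txt file really exposes the sitemap reference, Python’s standard urllib.robotparser can read the Sitemap lines (its site_maps() method needs Python 3.8 or later). The robots.txt content and the function name sitemaps_in_robots below are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content with a Sitemap directive.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml
"""


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemaps_in_robots(ROBOTS_TXT))
```

An empty list here would mean crawlers have to discover the sitemap some other way, such as a Search Console submission.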
Q1. What is the .gz extension?
When XML-based sitemaps are compressed with gzip, the .gz extension is appended to the filename. Sitemaps with a large number of URLs typically have large file sizes, and compression reduces their impact on disk storage and network transfer time.
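As a rough sketch of the compression step, the Python snippet below gzips sitemap bytes and refuses input that is already over the 50 MB uncompressed limit. The function name gzip_sitemap is hypothetical, and treating 50 MB as 50 × 1024 × 1024 bytes is an assumption about how the limit is counted.

```python
import gzip

# 50 MB uncompressed limit, counted here as 50 * 1024 * 1024 bytes.
MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024


def gzip_sitemap(xml_bytes):
    """Return gzip-compressed sitemap bytes, suitable for sitemap.xml.gz."""
    if len(xml_bytes) > MAX_UNCOMPRESSED_BYTES:
        raise ValueError("sitemap exceeds 50 MB uncompressed; split it first")
    return gzip.compress(xml_bytes)


if __name__ == "__main__":
    # Tiny hypothetical sitemap, used only for illustration.
    xml = (
        b'<?xml version="1.0" encoding="UTF-8"?>\n'
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>https://www.example.com/</loc></url>"
        b"</urlset>"
    )
    packed = gzip_sitemap(xml)
    print(len(xml), "bytes uncompressed,", len(packed), "bytes compressed")
```

The size check runs on the uncompressed bytes because that is the size the limit applies to.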
Q2. Where should you place an XML sitemap index?
An XML Sitemap Index follows the same conventions as a basic XML sitemap in terms of placement and filename, for example /sitemap_index.xml. However, you are welcome to deviate from this as long as you reference it in your robots.txt file through the Sitemap directive.
Q3. How important are lastmod, priority, and changefreq?
There’s no need to get too caught up with lastmod, priority, or changefreq. Although these properties can be specified for each URL, they are completely optional. It won’t hurt to define them, and search engines may use this information, but they generally pay little attention to it. Google, for example, states that it ignores priority and changefreq and uses lastmod only when it is consistently accurate.
An XML Sitemap helps search engines discover your website’s content and serves as a way of notifying them about new or updated content. It is therefore advisable to implement one whenever possible. For larger websites (500+ pages) especially, a sitemap can become an absolute necessity.