Author: Sudhir Sharma
Posted 9 years ago
4 min read

You’ve likely heard of XML Sitemap generation, but do you know what it is, and why it is so critical that you have one? Simply put, an XML Sitemap is a file that lists your website’s pages using XML tags. (XML stands for “Extensible Markup Language,” a stricter, more structured format than HTML, or “HyperText Markup Language.”) Once submitted to Google (and other search engines), this list of your site’s individual page URLs informs them about the organization of your site’s content. This, in turn, helps “Googlebot” (and other search engine “spiders”) crawl your website more accurately, so that your content can be indexed and appear in the search engine results pages (SERPs) more quickly.

Google introduced Sitemaps in 2005 and was joined a year later by Yahoo and Microsoft (whose search engine is now known as Bing) in a rare collaborative initiative known as the “Sitemaps Protocol.” The good news for webmasters is that, because this is a uniform standard supported by the largest (U.S.) search engines, you only need to generate one XML Sitemap for all of them. In this introductory guide, we’ll discuss best practices and share resources for optimizing your website’s search performance with XML Sitemap generation.

Creating your sitemap

Because Web crawlers work by following links, Google Webmaster Tools (GWT) guidelines recommend creating a Sitemap if your site:

  • Is really large or relatively new
  • Contains pages that are not linked well within your site or are otherwise “isolated”
  • Uses rich content (video and/or images)
  • Contains content featured in Google News

Even if your website doesn’t meet any of these criteria, it is still a website optimization best practice to build and submit a Sitemap, as it helps all of your site’s pages get crawled and indexed as quickly and accurately as possible. Sending searchers to the right pages may also help your bounce rate. Both GWT and the Sitemaps Protocol (referenced above) offer specific examples of how to apply XML tags to your website’s pages.
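To make the XML tags a little less abstract, here is a minimal sketch of how a basic Sitemap could be generated with Python’s standard library. The function name, the input format and the sample dates are assumptions made up for this example; only the tag names (urlset, url, loc, lastmod) and the namespace come from the Sitemaps Protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps Protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a Sitemap as an XML string.

    `pages` is a list of (url, lastmod) tuples, where lastmod is a
    YYYY-MM-DD string or None. The function name and input shape are
    hypothetical, chosen only for this illustration.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # required: the page URL
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod  # optional
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages and dates, for illustration only.
sitemap_xml = build_sitemap([
    ("http://www.mysite.com/", "2015-06-01"),
    ("http://www.mysite.com/about", None),
])
print(sitemap_xml)
```

The resulting string can be saved as a file such as sitemap.xml and submitted through your webmaster tools account. Larger sites would typically generate the page list from a CMS or a crawl rather than typing it by hand.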

Rich media XML sitemaps

As with written content, building Sitemaps for your site’s videos and images helps ensure they are found and indexed by Web crawlers, which can improve visibility for your site’s rich media in Google’s image and video search results. For both types of rich media content, you have the option of creating a separate Sitemap or adding the information to your existing website Sitemap using the appropriate XML tag “extensions,” according to GWT. GWT’s documentation details how to use the XML tag extensions for both images and videos.
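As a rough illustration of the “extension” approach, the sketch below adds image entries to a regular Sitemap using Google’s image namespace. The function name and the input format are hypothetical; the image:image and image:loc tags and the namespace URL follow Google’s image Sitemap extension. Video entries work along the same lines with their own namespace and tags.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Tell ElementTree which prefix to emit for each namespace
# (an empty prefix makes the Sitemap namespace the default one).
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """`pages` maps a page URL to a list of image URLs on that page.

    The name and input shape are assumptions made for this example.
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            # One <image:image> block per image found on the page.
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs, for illustration only.
image_sitemap_xml = build_image_sitemap({
    "http://www.mysite.com/gallery": [
        "http://www.mysite.com/images/red-running-shoes.jpg",
    ],
})
print(image_sitemap_xml)
```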

As a side note, another way to ensure your images are visible in search is to follow Google’s image publishing guidelines, which urge webmasters to use highly descriptive file names for images to help Google identify their subject matter. You can also check out Google’s video best practices, which include details on how to use schema.org to mark up video content within Web pages, as well as how to create quality “thumbnail images” to be displayed alongside your video pages in its search results. Google’s documentation also addresses how to create XML tags for mobile pages.

XML sitemap troubleshooting

There are several issues that commonly occur with XML Sitemaps. GWT (and others) note that one of the most typical snags involves contradictory website URLs. This occurs when a webmaster registers a website as “http://www.mysite.com” in a search engine’s webmaster tools account, but submits an XML file with page URLs that do not include the “www” prefix (i.e., “http://mysite.com”). Besides this obvious oversight, other typical issues involve broken links that haven’t been fixed and “301” redirected pages whose Sitemap entries haven’t been updated to the new page URL, as Ben Goodsell explains at Search Engine Watch. Fortunately, there are also several tools for troubleshooting XML Sitemap issues.
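Some of these checks can also be scripted before a Sitemap is ever submitted. The sketch below is one possible approach, not an official tool: it flags URLs whose scheme or host differs from the registered site, and sorts URLs into “redirected” and “broken” groups by HTTP status. All function names are hypothetical, and the sample run uses made-up status codes so that it works without touching the network.

```python
import http.client
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_urls(sitemap_xml):
    """Extract every <loc> value from a Sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [(loc.text or "").strip()
            for loc in root.iter(f"{{{SITEMAP_NS}}}loc")]


def find_mismatched_urls(urls, registered_site):
    """Return URLs whose scheme or host differs from the registered site."""
    expected = urlsplit(registered_site)
    want = (expected.scheme, expected.netloc.lower())
    return [u for u in urls
            if (urlsplit(u).scheme, urlsplit(u).netloc.lower()) != want]


def head_status(url, timeout=10):
    """Send a HEAD request and return the status without following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def audit_statuses(urls, get_status=head_status):
    """Group URLs that redirect or fail; `get_status` can be swapped for testing."""
    report = {"redirected": [], "broken": []}
    for url in urls:
        status = get_status(url)
        if status in (301, 302, 307, 308):
            report["redirected"].append(url)
        elif status >= 400:
            report["broken"].append(url)
    return report


# Offline demo: a tiny Sitemap and made-up status codes.
sample_xml = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.mysite.com/</loc></url>"
    "<url><loc>http://mysite.com/about</loc></url>"
    "<url><loc>http://www.mysite.com/old-page</loc></url>"
    "</urlset>"
)
fake_statuses = {
    "http://www.mysite.com/": 200,
    "http://mysite.com/about": 301,
    "http://www.mysite.com/old-page": 404,
}
urls = sitemap_urls(sample_xml)
print(find_mismatched_urls(urls, "http://www.mysite.com"))
print(audit_statuses(urls, get_status=fake_statuses.get))
```

To check a live site, call audit_statuses(urls) without the get_status argument so that real HEAD requests are sent. Keep in mind that some servers answer HEAD requests differently from regular page requests, so treat the results as a first pass rather than a final verdict.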

GWT’s “Crawl Errors” page offers information on how to detect both site-level and page-level errors. Other resources include “Fetch as Googlebot,” which shows you how Googlebot sees a page and helps you spot content and links that it can’t crawl. If you have multiple sites to manage, GWT also offers detailed tips for simplifying Sitemap management.

Well, there you have it: a basic introduction to XML Sitemaps, which are a best practice for every brand that owns a website and wants to ensure its content is found in the search results and even in the Google answer box!