Understanding and Creating an XML Sitemap
Sitemaps are a big part of the SEO game. The main reason for generating a Sitemap (with a capital “S”, the term denotes an XML Sitemap) is to help search engines discover and index all of your web pages. A page has to be in a search engine's index before it can show up when a keyword is searched. Major search engines like Google and Yahoo! can drive a lot of traffic to your site, and a valid Sitemap helps them find pages they might otherwise miss. A Sitemap file is simply an XML file that contains a list of the pages on your site. Here are the rules to follow when creating a Sitemap.
- The file must use the UTF-8 encoding.
- The data values must be entity-escaped.
- The file location must be the root of the URLs being submitted.
- A minimum of 3 tags are required: “urlset”, “url”, and “loc”.
- The file location determines which URLs can be included in the Sitemap. If you place the Sitemap in the directory “http://www.yoursite.com/blog/”, it can list only URLs that begin with “http://www.yoursite.com/blog/”. URLs elsewhere under “http://www.yoursite.com/” can’t be submitted in that Sitemap. The URLs must also all come from a single host. If you place the Sitemap at “http://www.yoursite.com”, you can’t submit URLs from “http://dev.yoursite.com”.
- The “urlset” tag is the root tag of the Sitemap. All other tags sit between the opening “urlset” tag (<urlset>) and the closing “urlset” tag (</urlset>). The opening “urlset” tag must declare the namespace of the protocol version being used; in most cases it will be “http://www.sitemaps.org/schemas/sitemap/0.9”.
- The “url” tag is the parent tag for each URL you want to be included in the Sitemap. All tags available, with the exception of the aforementioned “urlset” tag, are child tags of the “url” tag. The available child tags are “loc”, “lastmod”, “changefreq”, and “priority”.
- The “loc” tag is the last required tag. This tag gives the exact URL of the page that is being referenced. The URL entered here needs to be the full URL including “http://”.
- The “lastmod” tag can be included to show when the page was last updated. The date needs to be in W3C Datetime format; the simplest form is “YYYY-MM-DD”, and a time of day can be added if needed.
- The “changefreq” tag allows you to show how often the page is likely to change. There are 7 possible values for “changefreq”: “always”, “hourly”, “daily”, “weekly”, “monthly”, “yearly”, and “never”.
- The “priority” tag indicates how important each page is relative to the other pages on your site. It has no effect on how your pages compare with pages on other websites. The values can range from 0.0 to 1.0, with 0.5 being the default value.
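Several of the rules above are easy to check in code before a Sitemap is published. The following is a minimal Python sketch, not an official validator: the helper names (`in_sitemap_scope`, `escape_loc`, `is_valid_changefreq`, `is_valid_priority`) are hypothetical, and the scope check assumes the strict reading that a URL must begin with the directory that holds the Sitemap file.

```python
from urllib.parse import urlsplit
from xml.sax.saxutils import escape

# The seven "changefreq" values listed above.
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def in_sitemap_scope(sitemap_url: str, page_url: str) -> bool:
    """Return True if page_url may be listed in the Sitemap at sitemap_url.

    Hypothetical helper: assumes a URL is in scope only when it shares the
    Sitemap's scheme and host and starts with the Sitemap's directory.
    """
    sitemap, page = urlsplit(sitemap_url), urlsplit(page_url)
    # A different host (for example dev.yoursite.com) is always out of scope.
    if (sitemap.scheme, sitemap.netloc.lower()) != (page.scheme, page.netloc.lower()):
        return False
    # The Sitemap's directory is its path up to and including the last "/".
    directory = sitemap.path.rsplit("/", 1)[0] + "/"
    # Treat a bare host such as "http://www.yoursite.com" as the path "/".
    page_path = page.path or "/"
    return page_path.startswith(directory)


def escape_loc(url: str) -> str:
    """Entity-escape a URL for use as the text of a "loc" tag."""
    return escape(url, {"'": "&apos;", '"': "&quot;"})


def is_valid_changefreq(value: str) -> bool:
    return value in ALLOWED_CHANGEFREQ


def is_valid_priority(value: float) -> bool:
    return 0.0 <= value <= 1.0
```

For example, `in_sitemap_scope("http://www.yoursite.com/blog/sitemap.xml", "http://www.yoursite.com/about")` is false because the page sits outside the “/blog/” directory, and `escape_loc` turns a raw “&” in a query string into “&amp;amp;”-style entity text (“&amp;”) as the escaping rule requires.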
Here is an example Sitemap that lists two pages:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.yoursite.com/</loc>
    <lastmod>2008-07-07</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.yoursite.com/about</loc>
    <lastmod>2008-06-09</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.2</priority>
  </url>
</urlset>
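If your site has more than a handful of pages, you will probably want to generate the file rather than type it by hand. The sketch below is one possible way to do that with Python's standard `xml.etree.ElementTree` module (Python 3.8 or later for the `xml_declaration` argument). The function name `build_sitemap` and the list-of-dicts input shape are assumptions made for this illustration, not part of the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return Sitemap XML as UTF-8 bytes.

    `pages` is a hypothetical input format: a list of dicts, each with a
    required "loc" key and optional "lastmod", "changefreq", "priority" keys.
    """
    # The root tag carries the namespace declaration.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # "loc" is required, so a missing key raises KeyError here.
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # ElementTree entity-escapes text such as "&" when it serializes.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        {"loc": "http://www.yoursite.com/", "lastmod": "2008-07-07",
         "changefreq": "daily", "priority": 1.0},
        {"loc": "http://www.yoursite.com/about", "lastmod": "2008-06-09",
         "changefreq": "monthly", "priority": 0.2},
    ])
    print(xml_bytes.decode("utf-8"))
```

The result is returned as bytes so it can be written straight to a file opened in binary mode, which keeps the UTF-8 encoding intact. The output is not pretty-printed, but search engines do not need the indentation.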