Finding your business’s website in a top spot on Google is a bit like hitting
the jackpot.
According to Nielsen//NetRatings, 46.2% of all online searches were made using
Google in July 2005. Compare this with Yahoo, the second most popular search
engine at 22.5%, and the profit potential of a high Google listing becomes that
much more apparent.
In this article I’ll explain the basics of Google Sitemaps - a relatively new
Google feature that promises to greatly increase the odds of getting Google to
crawl your website.
While simply being crawled by Google is no guarantee of a high page rank (many
other search optimization factors apply, such as optimizing your pages and
creating inbound links), not being crawled at all is a 100% guarantee of never
being ranked. Google Sitemaps goes a long way toward clearing this first hurdle.
Google Sitemaps Overview
Traditionally, after submitting your website to Google, Google's crawler -
Googlebot - eventually seeks out and crawls all linked pages on your site. Once
a page has been crawled, Googlebot periodically returns to check for updates.
Shortly after each of Googlebot's visits, Google updates its index and
recalculates the ranking of your pages.
This procedure has proven less than optimal for both Google and site owners:
Googlebot uses an enormous amount of computing resources crawling the Internet
looking for pages that *may* have changed. The resources and time Googlebot
spends looking for potential page changes mean delays in crawling new websites
and in recalculating the ranking of pages that *have* changed.
Google Sitemaps resolves this problem by offering a method to exchange
information on new and modified content in a timely manner. Both sides benefit:
Google saves computing resources and bandwidth costs while site owners get their
updated content listed sooner.
The Basic Process
The Google Sitemap submission process is remarkably easy and requires no
programming knowledge or additional software. In addition to Google’s own
Sitemap Generator, there are many other free online tools available to build
your Sitemap. I list a few of these a little later. Here’s how the Sitemap
process works:
- Using Google’s or another Sitemap tool, the owner compiles a list of site
URLs and adds a few optional attributes (date of last modification, priority and
change frequency) to each URL entry. To receive Google’s highest priority, this
list must be an XML file and must be named 'sitemap.xml'. This file is uploaded
to the site’s root directory (the same directory as the homepage).
- The site owner then registers with Google at
https://www.google.com/webmasters/Sitemaps
and submits the URL of the Sitemap, e.g., http://www.yoursite.com/sitemap.xml.
- Google now verifies that the individual submitting the Sitemap is indeed the
site owner. This is done by requiring the creation of a new verification file
with a unique name such as GOOGLE13e5849324c7364e.html. This new file is also
uploaded to the root directory - the same directory as sitemap.xml.
- Google then checks for the existence of the verification file and verifies
the validity of the sitemap.xml file. Once accepted, Google schedules a crawl
using the information provided. Shortly after each download of a Sitemap,
Googlebot revisits the web site and fetches new and modified content.
- If site content changes significantly, the site owner generates a new
Sitemap and resubmits it to Google (the second step above).
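For readers curious about what a Sitemap tool does in the first step, here is a minimal Python sketch (Python 3.8 or later). It is not Google’s Sitemap Generator; the function name build_sitemap and the sample pages are hypothetical, and the namespace is copied from the sitemap.xml example in this article.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sitemap.xml example in this article (schema 0.84).
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(pages):
    """Return sitemap XML (as bytes) for a list of page dictionaries.

    Each dictionary needs a "loc" key; "lastmod", "changefreq" and
    "priority" are optional, like the optional attributes described above.
    """
    ET.register_namespace("", SITEMAP_NS)  # emit xmlns="..." without a prefix
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                child = ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}")
                child.text = str(page[tag])
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical pages on the placeholder domain used in the second step.
    pages = [
        {"loc": "http://www.yoursite.com/", "lastmod": "2005-10-18",
         "changefreq": "daily", "priority": "1.0"},
        {"loc": "http://www.yoursite.com/contact.html", "priority": "0.1"},
    ]
    print(build_sitemap(pages).decode("utf-8"))
```

Saving the returned bytes as sitemap.xml and uploading that file to the site’s root directory would complete the first step.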
Sitemap Tools
Google’s own Sitemap Generator requires Python 2.2 or higher to be installed on
the web server. Many other online Sitemap generators are also available.
More Sitemap tools are listed at
http://code.google.com/sm_thirdparty.html
Sitemap Tags
When generating a Sitemap using any tool, it’s important to enter:
- When a page was last changed
- The anticipated periodicity of future page changes
- What priority Googlebot should give that page when crawling your site
This information is communicated to Googlebot by means of XML tags contained
in your sitemap.xml file.
To illustrate this, consider the following sitemap.xml example. Don’t let this
intimidate you; it’s automatically generated by whatever Sitemap tool you use.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84
                            http://www.google.com/schemas/sitemap/0.84/sitemap.xsd">
  <url>
    <loc>http://www.alaskainternettoday.com/index.html</loc>
    <lastmod>2005-10-18</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.alaskainternettoday.com/experts/alaskan_experts.html</loc>
    <lastmod>2005-10-18</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.9</priority>
  </url>
</urlset>
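Before uploading a file like this, it can be worth checking locally that it is well-formed. The following Python sketch is only a rough local check, not the validation Google itself performs; the function name list_sitemap_urls is hypothetical, and the namespace is the one used in the example above.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap.xml example in this article.
NS = {"sm": "http://www.google.com/schemas/sitemap/0.84"}


def list_sitemap_urls(xml_text):
    """Parse sitemap XML and return the <loc> value of every <url> entry.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed
    and ValueError if a <url> entry has no <loc>.
    """
    root = ET.fromstring(xml_text)
    locations = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        if not loc:
            raise ValueError("found a <url> entry without a <loc>")
        locations.append(loc)
    return locations
```

Reading the file in binary mode and passing the bytes in, for example list_sitemap_urls(open("sitemap.xml", "rb").read()), lets the parser honor the encoding named in the XML declaration. A mistake such as a missing ">" after an opening <loc tag surfaces as a ParseError.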
Last Modified or <lastmod>: <lastmod> tells Googlebot when a page was
last modified or created and enables Googlebot to only consider fresh content
and not waste time crawling old content. Changes in this tag should trigger a
Sitemap resubmission.
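As an illustration of where a <lastmod> value can come from, the Python sketch below derives it from a file’s modification time. It assumes your pages are static files on disk; the function name lastmod_for is hypothetical, and dynamically generated pages would need a different source for this date.

```python
import os
from datetime import datetime, timezone


def lastmod_for(path):
    """Return a file's modification date as YYYY-MM-DD (in UTC)."""
    mtime = os.path.getmtime(path)  # seconds since the epoch
    return datetime.fromtimestamp(mtime, tz=timezone.utc).strftime("%Y-%m-%d")
```

The result is in the same YYYY-MM-DD form as the 2005-10-18 dates in the example above.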
Change Frequency or <changefreq>:
<changefreq> denotes how often you expect the page in question to change. In
many cases this is an educated guess but do give it your best. Valid entries
are: "always", "hourly", "daily", "weekly", "monthly", "yearly" and "never".
Priority or <priority>: Assign a reasonable
<priority> from 0.0 (the lowest) to 1.0 (the highest) to all your pages.
Googlebot will crawl high-priority pages before low-priority ones. Give a high
priority to pages that change frequently and are of greater interest to your
site visitors. Assign a low priority to static pages such as contact forms,
archives, etc.
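If you edit these tags by hand, a small check can catch typos before you resubmit. The Python sketch below tests the two rules stated above (the seven valid <changefreq> entries and the 0.0 to 1.0 <priority> range); the function name check_entry is hypothetical, and this is not an official validator.

```python
# The seven valid <changefreq> entries listed above.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly",
                    "monthly", "yearly", "never"}


def check_entry(changefreq, priority):
    """Return a list of problems with one entry's tag values (empty if none)."""
    problems = []
    if changefreq not in VALID_CHANGEFREQ:
        problems.append(f"invalid changefreq: {changefreq!r}")
    try:
        value = float(priority)
    except (TypeError, ValueError):
        problems.append(f"priority is not a number: {priority!r}")
    else:
        if not 0.0 <= value <= 1.0:
            problems.append(f"priority out of range 0.0-1.0: {priority!r}")
    return problems
```

For instance, check_entry("daily", "1.0") returns an empty list, while an entry with a misspelled frequency or a priority of 1.5 comes back with a message describing each problem.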
More Information
For more information on Google Sitemaps, see the official Google Sitemap guide
at
http://www.google.com/webmasters/sitemaps/
The Bottom Line
While obtaining a high Google ranking for your website depends on many factors,
none of these matter if Google doesn’t know your site exists. Google Sitemaps is
designed to help Google discover your site among the estimated 8 billion-plus
other pages out there.
This is one search engine optimization tool that is long overdue and one that
every online business owner can profit from.