by Cory Rauch 2007-02-06 Category: Web-Blogging
Google has introduced a new feature called "Sitemap" that lets a webmaster not only submit a site for indexing but also include some extra bits of information that help Google index the site better. These extra bits are the frequency of page change, the priority of a page in relation to the rest of the website (nothing to do with page ranking), and the last modification date. With these extra bits Google can schedule the indexing of your site to keep pace with how your website changes. For example, some pages never change or change rarely (your contact page, say). Other pages, like your front page or a company news section, may change often, sometimes even daily. With the Sitemap feature you can now tell Google how your site changes, which helps Google stay current with your site and keeps your site more relevant in searches (a win-win situation for everyone).
Scripts..Scripts..And more Scripts
The Sitemap feature requires that you upload a sitemap.xml file that describes your site's layout (literally a list of URLs in XML) along with those extra bits of information. You then submit the URL of that file to the Google Sitemap site:
www.google.com/webmasters/sitemaps
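To make the file format concrete, here is a minimal sketch in Python of what a generator has to produce. It is not the PHP script from this article. The function name build_sitemap and the default values are made up for illustration, and the namespace is the one published by the sitemaps.org protocol, which I am assuming is the format you want.

```python
import xml.etree.ElementTree as ET

# Namespace from the sitemaps.org protocol (assumed here; not taken from the article).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, frequency="weekly", priority="0.5", lastmod=None):
    """Return sitemap XML (as a string) with one <url> entry per item in `urls`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if lastmod is not None:
            # Last modification date, written as YYYY-MM-DD.
            ET.SubElement(entry, "lastmod").text = lastmod
        ET.SubElement(entry, "changefreq").text = frequency
        ET.SubElement(entry, "priority").text = priority
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical page list; a real generator would discover these by crawling.
    pages = ["http://www.improvedsource.com/"]
    print(build_sitemap(pages, frequency="daily", lastmod="2007-02-06"))
```

Each page becomes one url element holding the address plus the three extra bits described above.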
There are already a bunch of scripts for generating this sitemap.xml file, but I had nothing but problems with all of them*, mainly because of how they gather the URLs. So I have written my own script to give better URL coverage of a site.
* This was true at the time of writing; some may improve in time. This statement covers only free scripts and free online tools.
Note: This script comes with no warranties or guarantees. Use at your own risk.
How To Use
To use this script you need a local installation of PHP. You can get that here: php.net
Note: The installation of PHP is a little beyond the scope of this article. If you are using a Windows machine you could also check out easyphp.org for an install wizard version of PHP.
Next, you need the script, which can be found here: sitemap.php
After you have the script, you need to edit a few parameters in the sitemap.php file. The parameters you need to adjust are:
- $siteurl - This should point to the starting URL of the part of your site you want to include in your sitemap.xml file. An example would be http://www.improvedsource.com/ (remember to include the trailing slash [ / ])
- $frequency - How often you change your website. Valid options are: always, hourly, daily, weekly, monthly, yearly, never
- $priority - Priority of page in relation to other parts of your site. A number from 0.1 to 1.0 is acceptable.
- $lastmodification - Set to true or false. Tells the script whether to include the last modification date.
- $extensions - You may not need to touch this. These are the file extensions that are OK to include in the sitemap. Some possible extensions are left out, such as "aspx" for ASP.NET sites.
- $index_dynamic_pages_params - You may not need to touch this setting either. When enabled, it includes URLs with parameters in them (for example http://improvedsource.com/test.php?param1=test) in your sitemap. This feature will break session variables, so you probably want to leave it set to false.
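To show how settings like these might interact, here is a rough Python sketch. It is not taken from sitemap.php, and the real script may behave differently. The helper names check_settings and should_include are hypothetical, and treating directory-style URLs (no file extension) as acceptable is my own assumption.

```python
from urllib.parse import urlparse

# The change frequencies listed above.
VALID_FREQUENCIES = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def check_settings(siteurl, frequency, priority):
    """Return a list of problems found in the settings (empty if none)."""
    problems = []
    if not siteurl.endswith("/"):
        problems.append("siteurl should end with a slash")
    if frequency not in VALID_FREQUENCIES:
        problems.append("unknown frequency: " + frequency)
    if not 0.1 <= priority <= 1.0:  # range as given in this article
        problems.append("priority should be between 0.1 and 1.0")
    return problems


def should_include(url, siteurl, extensions, index_dynamic_pages_params=False):
    """Decide whether `url` belongs in the sitemap under the given settings."""
    if not url.startswith(siteurl):
        return False  # outside the part of the site being mapped
    parsed = urlparse(url)
    if parsed.query and not index_dynamic_pages_params:
        return False  # URL carries parameters and dynamic pages are switched off
    last_segment = parsed.path.rsplit("/", 1)[-1]
    if "." not in last_segment:
        return True  # directory-style URL such as the front page (assumed OK)
    return last_segment.rsplit(".", 1)[-1].lower() in extensions
```

For example, with the extensions set to php and html, an image link such as logo.png is skipped, and test.php?param1=test is only kept when dynamic pages are switched on.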
Next, save the sitemap.php file and run it at the console or command line like so:
On Windows:
C:\> C:\some_path_to_php\php.exe sitemap.php
Or Unix:
# /some/path/php sitemap.php
Finally, upload the generated sitemap.xml file to your website and submit it to the Sitemap feature on the Google website.
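Before uploading, it can be worth confirming that the generated file is well-formed XML and lists the pages you expect. The following Python sketch is one way to do that. The function name list_sitemap_urls is hypothetical, and this check is not part of sitemap.php.

```python
import xml.etree.ElementTree as ET


def list_sitemap_urls(xml_bytes):
    """Parse sitemap XML and return the text of every <loc> element."""
    # ET.fromstring raises xml.etree.ElementTree.ParseError if the XML is malformed.
    root = ET.fromstring(xml_bytes)
    # Strip any namespace prefix so this works whichever namespace the file uses.
    return [el.text for el in root.iter() if el.tag.split("}")[-1] == "loc"]
```

To use it, read sitemap.xml in binary mode, pass the contents to list_sitemap_urls, and compare the returned list against the pages you know your site has.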
Conclusion
I hope you find this script of use. If you do, feel free to link to us at "http://www.improvedsource.com". Have fun.