I thought I would touch on what exactly a website map or sitemap is, how to make one, and what to do with it. I will also talk about the robots.txt file.
Robots.txt
What is this?
The robots.txt file is a file you place on your website to tell the web spiders where they can and cannot go on your website.
One word of caution: although the good web spiders will keep out of the areas this file restricts, others will ignore it. So if you have content you really don't want to be seen, either remove it from the site or password protect it.
What does it look like?
The robots.txt file is simply a text file called robots.txt.
The three main parts of your robots.txt file are
- User-agent - this names the spider that the rules which follow apply to, e.g. Googlebot (use * for all spiders)
- Disallow - this specifies which section of your site is restricted
- Allow - this specifies which section of your website is not restricted
The basic structure is
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
This example simply says that only Google can visit your website, and all other spiders are not allowed to visit. To restrict specific folders or files, use:
Disallow: /afoldername (this restricts access to http://www.yoursite.com/afoldername)
Disallow: /getoff.htm (this restricts access to http://www.yoursite.com/getoff.htm)
So an example which says that you want everything except your images folder spidered looks like:
User-agent: *
Disallow: /images
Because Disallow matches the start of the path, this blocks the images folder and everything inside it, while the rest of your website will be spidered.
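If you want to check how a well-behaved spider would read the two examples above, here is a minimal sketch using Python's standard urllib.robotparser module. The rules are loaded from strings rather than fetched from a live site, and "SomeOtherBot" is a made-up spider name used only for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# First example from the article: only Googlebot may visit.
GOOGLE_ONLY = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

# Second example: everything except the images folder.
NO_IMAGES = """\
User-agent: *
Disallow: /images
"""

google_only = build_parser(GOOGLE_ONLY)
no_images = build_parser(NO_IMAGES)

# "SomeOtherBot" is a hypothetical spider name, not a real crawler.
print(google_only.can_fetch("Googlebot", "http://www.yoursite.com/index.htm"))
print(google_only.can_fetch("SomeOtherBot", "http://www.yoursite.com/index.htm"))
print(no_images.can_fetch("SomeOtherBot", "http://www.yoursite.com/images/logo.png"))
print(no_images.can_fetch("SomeOtherBot", "http://www.yoursite.com/index.htm"))
```

This only shows what the rules ask for. As noted earlier, a badly behaved spider is free to ignore them.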
Why use it?
The purpose behind the robots.txt file is to steer the spiders through your website. If you have a large website you should restrict as much as possible, as the spiders will only spend so much time on your website. So the best thing to do is hide content that is of little use to them, such as an images folder (unless you want your images to show up in image search).
Where to put it.
Once you have created a robots.txt file, simply upload it to the root directory of your website, that is the http://www.yourwebsite.com location on your server. You should then be able to view it on your website as http://www.yourwebsite.com/robots.txt
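Spiders only look for the file at the root, so a copy inside a subfolder is ignored. As a small sketch of that rule, the helper below (robots_url is a name I have made up, and the page address is a placeholder) works out where the robots.txt for any page on a site should live.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt always sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Placeholder page address, for illustration only.
print(robots_url("http://www.yourwebsite.com/afoldername/page.htm"))
# prints http://www.yourwebsite.com/robots.txt
```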
Website Map or Sitemap
What is a website map or sitemap
The purpose of a sitemap is to give the search engine spiders a useful map to follow when looking at your website, and to do the same for your users.
The search engines look at your sitemap and it gives them an updated view of your website, which helps keep the indexing of your website up to date in the search engine listings.
What does it look like?
The website map or sitemap comes in 3 forms
- HTML Form - this is the one for your end users
An HTML sitemap can take any form you wish that gives a nice visual overview of your website. It is normally a cut-down page with a simple list of the links on your website.
- XML Sitemap - this is an XML list of the links on your website, and it is the one you submit to the search engines.
The XML sitemap normally contains the following XML nodes
- url - the wrapper for one page, which contains:
  - loc - the url of the page
  - lastmod - the date the page was last modified
  - changefreq - how often the page is updated
  - priority - how important this page is within your website (0.0 - 1.0)
The url node needs to be repeated for each link on your website, and all of the url nodes sit inside a single urlset root node.
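To show how those nodes fit together, here is a rough sketch that builds a small XML sitemap with Python's standard xml.etree.ElementTree module. The build_sitemap function, the page addresses and the dates are all placeholders of my own; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from a list of dicts keyed by node name."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")  # one url node per link
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:  # only loc is required; the rest are optional
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages and dates, purely for illustration.
pages = [
    {"loc": "http://www.yoursite.com/", "lastmod": "2010-01-31",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "http://www.yoursite.com/contact.htm", "priority": 0.5},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Saving the returned string as sitemap.xml gives you a file you can upload. For a site with more than a handful of pages, one of the generators listed below is still the easier route.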
Don't worry if it looks scary; there are lots of good resources online to create your own XML sitemap.
- http://www.xml-sitemaps.com/
- http://tools.webmasters.sk/sitemap-creator.php
- http://www.thesitemapper.com/
A simple search will also help you find lots of other great sitemap creators.
- Text Based - this is a text based version of your sitemap
The text sitemap is just a simple list of the links on your website, one full URL per line
For Example
http://www.buzzproperties.co.uk/
http://buzzproperties.co.uk/
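A text sitemap is simple enough to produce by hand, but as a minimal sketch, the helper below (build_text_sitemap is a made-up name) turns a list of links into that one-URL-per-line format and drops blanks and exact duplicates along the way.

```python
def build_text_sitemap(urls):
    """Return a text sitemap: one full URL per line, exact duplicates dropped."""
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:  # skip blank entries and repeats
            seen.add(url)
            lines.append(url)
    return "\n".join(lines) + "\n"


text_sitemap = build_text_sitemap([
    "http://www.buzzproperties.co.uk/",
    "http://buzzproperties.co.uk/",
    "http://www.buzzproperties.co.uk/",  # repeated on purpose
])
print(text_sitemap, end="")
```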
Now I have my sitemap, what do I do with it?
- Simply save each file with any name you like
- Upload the file to your website root, e.g. http://buzzproperties.co.uk/sitemap.xml
- Register for Google Webmaster Tools at https://www.google.com/webmasters/tools
- Log in to your webmaster tools account
- Add your website URL into the add site box
- Click on your website in the list and click on Sitemaps on the left menu
- Click add a sitemap and enter the URL of the sitemap on your website (you can add both the text and XML sitemaps to your listing)
Google will validate your sitemaps and if everything is OK it will let you know in the status.
Google now knows about your sitemap. Every time you make changes, just update your sitemap, upload it and resubmit it. You can skip this if none of the links have changed.
Thanks
Sean J Connolly

2 comments:
Hi. I spent about a day creating a sitemap for my mom’s website. I finally finished it and was very satisfied with the result. But when a friend asked me to do the same for him, I said sorry.
I don’t want to waste my time on that anymore; it would be smarter to use a professional sitemap generator program instead. No more manual creation of sitemaps! Only reliable XML sitemap generator programs!
Great point, using automatic sitemap creators is the best way to go. Or at least use one to validate your sitemap.
Thanks for giving people the pointers.
Sean