
thesitemapper - Create HTML and XML site maps

Create an XML site map for Google and other search engines






find out more :

The full registered version costs $30 U.S. Dollars

To go to the purchase page : click here.

A trial version is available by clicking here.

Enquiries : If you have any questions about the product, go to the contacts page by clicking here.

Documentation

thesitemapper creates two kinds of site map: an HTML site map, which displays page titles, descriptions and URLs and can serve as an index page for a web site, and an XML site map, which helps search engines identify new and updated pages.

The application can be set up to automatically crawl a number of web sites, create an XML site map and an HTML site map for each one, and then FTP them to the required location on the appropriate web server. A built-in scheduler lets the crawler start at predefined times for each site.

General principles

Each site is set up independently, and you can set up as many sites as you wish. Each site can be marked as active, so that crawling it automatically generates an HTML site map and an XML site map when the crawler has completed. You may also generate the site maps manually at any time without re-crawling.

Clicking on the 'Crawl All sites ...' button will crawl all sites that are marked as active. A newly created site is always marked as active.

Clicking on the 'Start crawl' button associated with a particular site will only crawl that site.

Once crawling has completed, the results can be automatically ftp'd to the destination server and search engines pinged to indicate a new XML map.

There are a number of formats to choose from for the HTML page displays. These include single-column, two-column, multi-page A to Z and various combinations of those. You may also create your own template web page to match the look of your site; the results can then be automatically inserted into the template web page.

Web settings page



Google Analytics report

When you tick the box 'Check Google Analytics' on the Web Settings page, the page report listing shows whether Google Analytics is installed on each page.

This is designed to help you configure Google Analytics, whether you are using the older urchin.js code or have recently upgraded to the new ga.js tracking code. The diagnostic identifies the pages on your web site that have GA tracking code properly installed, which makes it easy to isolate the pages with tracking problems, fix them, and manage your Google Analytics installation.
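The check described above can be pictured with a short Python sketch. This is not the application's own code: it simply assumes that a page counts as "tracked" when its HTML references one of the two script names mentioned in the text, and the function names `detect_ga_tracker` and `report` are made up for illustration.

```python
import re
from typing import Dict, Optional

# Script names taken from the text above: the older urchin.js tracker
# and the newer ga.js tracker. Matching on the file name alone is an
# assumption made for this sketch.
TRACKER_PATTERN = re.compile(r"\b(urchin|ga)\.js\b", re.IGNORECASE)


def detect_ga_tracker(html: str) -> Optional[str]:
    """Return 'urchin.js' or 'ga.js' if the page references one, else None."""
    match = TRACKER_PATTERN.search(html)
    return f"{match.group(1).lower()}.js" if match else None


def report(pages: Dict[str, str]) -> Dict[str, str]:
    """Map each URL to the tracker found, or 'missing' when none is found."""
    return {url: detect_ga_tracker(html) or "missing" for url, html in pages.items()}
```

A real check would also need to confirm that the account ID is present and that the script actually loads; the sketch only shows the per-page listing idea.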

Crawl settings page



This page sets up various crawling parameters.

XML site map page



For a complete description of the meaning of these settings, refer to http://www.sitemaps.org/protocol.php

You may set the Change Frequency, File last modified and Priority.
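To show where these three settings end up, here is a minimal Python sketch that writes a site map in the sitemaps.org format. The function name `build_sitemap` and the example URL are illustrative assumptions; the tag names (`loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a <urlset> document from a list of dicts with a 'loc' key."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        # Optional tags: File last modified, Change Frequency, Priority.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


example = build_sitemap([
    {"loc": "http://www.example.com/", "lastmod": "2008-01-01",
     "changefreq": "weekly", "priority": 0.8},
])
```

Only `loc` is required by the protocol; the other three tags are hints that search engines may or may not act on.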

HTML Site map page



Layout :

Fonts, Colors, etc button :

You may select the formatting of each page element by clicking on the Fonts, Colors etc button. You may either enter a css style name – which will need to exist in a style sheet for it to render correctly – or you may select fixed fonts, size and colors for the elements.

Other formatting button :

This button displays a set of options which may be used to alter other formatting definitions such as table cell padding, table cell spacing and so on.

When you create a new site, the format settings for fonts, colors etc are pre-defined to give a standard looking display.

Folder Alias button :

When you crawl the web site, the folder names are extracted and stored with the url. The folder names may then be displayed on the html site map to categorize the displays. However, the folder names are not always appropriate and the Folder Alias button allows you to enter a different folder name which will appear on the html site map.

HTML Templates :

If you wish to use your own web page layout in the form of an HTML page, enter the following at the point in the template file where you want the html site map to be displayed :

<!-- THESITEMAPPER -->

Then enter the file name into the "Template file" text box. When the html site map is created, it will place the site map at that point in the template file.
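The template mechanism can be sketched in a few lines of Python. The sketch assumes that the marker comment is replaced by the generated site map (the text only says the map is placed "at that point"), and `fill_template` is a hypothetical name, not part of the application.

```python
# Marker comment exactly as given in the text above.
MARKER = "<!-- THESITEMAPPER -->"


def fill_template(template_html: str, site_map_html: str) -> str:
    """Insert the generated site map at the marker in the template page."""
    if MARKER not in template_html:
        raise ValueError("template does not contain the THESITEMAPPER marker")
    # Assumption: only the first marker is used and it is replaced outright.
    return template_html.replace(MARKER, site_map_html, 1)


page = fill_template(
    "<html><body><h1>Site map</h1><!-- THESITEMAPPER --></body></html>",
    "<ul><li><a href='/'>Home</a></li></ul>",
)
```

Because the rest of the template is left untouched, your own header, navigation and style sheet links surround the site map unchanged.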

FTP Settings page



When a site is crawled, you can set the application to automatically ftp the results to your web server on completion.

First enter your FTP settings: FTPHost, Username and Password. The XML Site Map will be ftp’d to the root of the web site.

Automatically FTP site maps when created - Tick this box to automatically ftp all the site maps when they are created.

Notify (ping) search URLs on completion - Tick this box so that the search sites are automatically pinged when the site maps are created and after they have been ftp’d to your site.

Set up Ping URLs page

More and more search engines are using the XML site map method. This form lets you add new ping URLs yourself: just add the search engine's root URL to the list.

Enter the remote path for HTML site map on the server - This will be a folder name where you want the site map to be ftp’d to.

FTP XML Site Map and FTP HTML Site Map - These allow you to manually ftp the generated site maps to your web server – useful when you want to test the ftp system.

Ping Search URLs - This allows you to manually ping the search engines.
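The upload-then-ping sequence described in this section can be sketched in Python with the standard `ftplib` module. Several things here are assumptions rather than facts about the application: the file names `sitemap.xml` and `sitemap.html`, the idea that the FTP login folder is the root of the web site, and the `?sitemap=` query form, which follows the ping convention described on sitemaps.org. The names `ping_url` and `upload_and_ping` are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen


def ping_url(ping_root: str, sitemap_url: str) -> str:
    """Build <ping root>?sitemap=<URL-encoded site map address>."""
    return f"{ping_root}?sitemap={quote(sitemap_url, safe='')}"


def upload_and_ping(ftp, xml_file, html_file, remote_html_dir,
                    ping_roots, sitemap_url, opener=urlopen):
    """Upload both site maps over a connected ftplib.FTP, then ping."""
    # The XML site map goes to the root of the web site (assumed to be
    # the folder the FTP session starts in).
    ftp.storbinary("STOR sitemap.xml", xml_file)
    # The HTML site map goes to the configured remote path.
    ftp.cwd(remote_html_dir)
    ftp.storbinary("STOR sitemap.html", html_file)
    # Search engines are pinged only after the uploads have finished.
    for root in ping_roots:
        opener(ping_url(root, sitemap_url)).close()


# Typical use (host, user and password are placeholders):
#   from ftplib import FTP
#   ftp = FTP("ftp.example.com"); ftp.login("user", "password")
#   with open("sitemap.xml", "rb") as x, open("sitemap.html", "rb") as h:
#       upload_and_ping(ftp, x, h, "sitemap",
#                       ["http://search.example.com/ping"],
#                       "http://www.example.com/sitemap.xml")
```

Passing the connection and the opener in as arguments keeps the function easy to try out against a test server before pointing it at a live site.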

Scheduler page



Ticking the ‘Enable for this site’ box enables the scheduler. Choose the days on which it should run and the time at which it should start.

You may also use the Windows Scheduler to schedule the crawl. This is done from the command line, as described under 'Command Line use'.
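As an illustration of how a day-and-time schedule like the built-in one can be evaluated, the following Python sketch works out the next start time from a set of chosen weekdays and a start time. It is only a model of the idea; how the application itself stores or evaluates its schedule is not documented here, and `next_run` is a hypothetical name.

```python
from datetime import datetime, time, timedelta


def next_run(now: datetime, days: set, start: time) -> datetime:
    """Return the next start at or after today on one of the chosen days.

    `days` holds weekday numbers as used by datetime.weekday():
    0 = Monday ... 6 = Sunday.
    """
    # Look at today and the following seven days, so that a schedule with
    # a single day whose start time has already passed rolls over a week.
    for offset in range(8):
        candidate = datetime.combine(now.date() + timedelta(days=offset), start)
        if candidate.weekday() in days and candidate > now:
            return candidate
    raise ValueError("no days selected")
```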

Results page

The results page is a simple display of the XML Site Map and also provides a validation of the XML Site Map.

Excluding text from the crawler

If you wish to exclude text from the crawler, such as a menu, footer or other irrelevant information, wrap it in the following comments :

<!-- exclude_start -->
   text to be excluded
<!-- exclude_end -->
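The effect of these comments can be pictured with a small Python sketch that removes everything between the two markers before the page text is read. It assumes the markers are not nested and tolerates extra whitespace inside the comments; `strip_excluded` is a made-up name and not the application's own routine.

```python
import re

# Matches one exclude_start ... exclude_end region, including the markers.
# The non-greedy ".*?" stops at the first exclude_end, so several separate
# regions on one page are each removed on their own.
EXCLUDE_BLOCK = re.compile(
    r"<!--\s*exclude_start\s*-->.*?<!--\s*exclude_end\s*-->",
    re.DOTALL,
)


def strip_excluded(html: str) -> str:
    """Return the page with every excluded region removed."""
    return EXCLUDE_BLOCK.sub("", html)
```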

Command Line use

You may run the application from the Windows Command line using :

thesitemapper.exe

To start the crawl use

thesitemapper.exe crawl

which will cause all sites to be crawled and all indexes to be created.

Putting this command into the Windows Scheduler will allow you to run the application at defined times without using the built-in scheduler.
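One way to register the crawl with the Windows Scheduler is the standard `schtasks` command. The Python sketch below only builds the argument list for a daily task; the install path, task name and start time are placeholders chosen for illustration, and `schtasks_args` is a hypothetical helper.

```python
def schtasks_args(exe_path: str, start_time: str = "02:00",
                  task_name: str = "thesitemapper crawl") -> list:
    """Build a 'schtasks /Create' command that runs the crawl once a day."""
    return [
        "schtasks", "/Create",
        "/SC", "DAILY",              # run every day
        "/TN", task_name,            # name shown in the Windows Scheduler
        # The program path is quoted so that folders with spaces work;
        # 'crawl' is the argument described in the text above.
        "/TR", f'"{exe_path}" crawl',
        "/ST", start_time,           # start time in 24-hour HH:mm form
    ]


# On Windows the task could then be registered with:
#   import subprocess
#   subprocess.run(
#       schtasks_args(r"C:\Program Files\thesitemapper\thesitemapper.exe"),
#       check=True)
```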