
Jack - Updating Lindex and Cindex

Jack is a small program that traverses the local file system and collects keywords from the hypertext files that the related Web server has access to. Currently it takes keywords from the meta tags of the .html pages and writes them out in the appropriate format. Jack builds both the Lindex and the Cindex, in that order, and can be run at a set time using crontab(1).
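As an illustration of the keyword collection step, here is a minimal Python sketch. It is not Jack's actual code: it assumes the keywords live in a tag of the form <meta name="keywords" content="..."> and that they are separated by commas, neither of which is stated above. The names MetaKeywordParser and extract_keywords are made up for this example.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collect keywords from <meta name="keywords" content="..."> tags.

    Assumption: keywords are comma separated inside the content attribute.
    """

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "keywords":
            return
        for word in (attr.get("content") or "").split(","):
            word = word.strip()
            if word:
                self.keywords.append(word)


def extract_keywords(html_text):
    """Return the list of keywords found in the meta tags of one page."""
    parser = MetaKeywordParser()
    parser.feed(html_text)
    parser.close()
    return parser.keywords
```

Calling extract_keywords on the text of a page returns a plain list of strings; writing that list into the Lindex or Cindex format is left out here, since that format is not described in this section.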

Jack has the following parameters:

    -ht_dir <directory name>  - specify the htdocs directory the related
                                Web server is using
    -lindex <filename>        - specify the lindex filename
    -cindex <filename>        - specify the cindex filename
    -server <servername>      - specify the name of the related server
    -tmp <filename>           - specify the temporary filename lindex can use
    -logfile <filename>       - specify the log filename lindex should use
    -I <filename>             - specify a file that lists additional
                                directories that Jack should go through
    -i <filename>             - specify a file that lists the only directories
                                Jack should go through (not implemented yet)
    +tilda                    - specify that there should be no simplification
                                to ~
    -h                        - help page

Jack traverses the file system in two ways. It can visit directories that the user specifies, or it can traverse the Web directory of each user on the system, obtaining the usernames and home directories from the password file.
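The second method can be sketched in Python as follows. This is only an illustration under stated assumptions: the password file is taken to be in the usual colon-separated passwd(5) layout, and each user's Web directory is assumed to be a public_html sub-directory of the home directory (the name that appears in the example path later in this section). The function name user_web_dirs is hypothetical.

```python
import posixpath


def user_web_dirs(passwd_lines, web_subdir="public_html"):
    """Return (username, web directory) pairs from passwd-style lines.

    Assumptions: each line has the seven colon-separated passwd(5) fields,
    and the Web directory is <home>/<web_subdir>.
    """
    result = []
    for line in passwd_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 7:
            continue  # skip malformed entries
        username, home = fields[0], fields[5]
        result.append((username, posixpath.join(home, web_subdir)))
    return result
```

On a real system the lines could come from open("/etc/passwd"). The sketch does not check that the directory exists; a traversal would simply skip directories that cannot be read.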

Jack first parses the files with a .htm or .html suffix in a directory, then descends into its sub-directories and repeats the process. As a safeguard, Jack stops after descending 3 levels of sub-directories, so that it cannot run out of control.
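A depth-limited traversal of this kind might look like the Python sketch below. It is an approximation, not Jack's implementation: the starting directory is counted as level 0, so files up to three sub-directory levels below it are collected, and symbolic links to directories are not followed (an extra precaution that the text above does not mention). The names find_html_files and MAX_DEPTH are invented for the example.

```python
import os

MAX_DEPTH = 3  # number of sub-directory levels to descend, as described above


def find_html_files(root, max_depth=MAX_DEPTH):
    """Return paths of .htm/.html files under root, files before sub-directories."""
    found = []

    def visit(directory, depth):
        try:
            entries = sorted(os.scandir(directory), key=lambda e: e.name)
        except OSError:
            return  # unreadable directory: skip it
        subdirs = []
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
            elif entry.is_file() and entry.name.endswith((".htm", ".html")):
                found.append(entry.path)
        # Files in this directory are handled first, then its sub-directories.
        if depth < max_depth:
            for subdir in subdirs:
                visit(subdir, depth + 1)

    visit(root, 0)
    return found
```

The depth check is what keeps the walk bounded: a directory at level 3 still has its own files collected, but nothing below it is visited.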

A problem arose with recording URLs while writing Jack: some Web servers do not allow the path of a document to be shortened (for example, /u/hon/twyt/public_html/ shortens to ~twyt/), so the option +tilde was added to stop Jack from simplifying the path.
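The shortening described above, and the effect of switching it off, can be sketched in Python as below. This is a guess at the behaviour based only on the example path in this section: the function shorten_user_path, its parameters, and the assumption that the user's Web directory is always <home>/public_html are all hypothetical.

```python
import posixpath


def shorten_user_path(path, username, home_dir,
                      web_subdir="public_html", simplify=True):
    """Replace a user's Web directory prefix with ~username/.

    With simplify=False the path is returned unchanged, which is the
    behaviour the no-simplification option is described as selecting.
    """
    prefix = posixpath.join(home_dir, web_subdir).rstrip("/") + "/"
    if simplify and path.startswith(prefix):
        return "~" + username + "/" + path[len(prefix):]
    return path
```

For example, shorten_user_path("/u/hon/twyt/public_html/", "twyt", "/u/hon/twyt") gives "~twyt/", while the same call with simplify=False returns the full path untouched.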



Tommy Wing Yiu Tsui
Tue Nov 7 10:21:32 EST 1995