When Templeton starts, it first loads in the default configuration files. These include:
Additional configuration files may be specified as command line parameters.
Enter starting URL: | This allows you to specify where Templeton should begin.
The generic host form is: host[.domain][:port][/path[/file[...]]]
In general, the URL is in one of the forms: Trailing slashes are optional, but should be used when applicable. If the host name does not have a domain, such as "intel", then Templeton will look for a local computer called "intel". If the local computer cannot be found, then it will look for "www.intel.com". |
Enter local path to store files ["none" for log files only]: | This prompt asks where retrieved files should be placed. You may enter either a path (e.g. D:\FILES\ or /tmp/retrieve) or the word "none". "None" tells Templeton not to retrieve files. If you operate your own web server, you may specify the root directory for that web server. |
Host restriction [yes|no|host|.domain]: | This is the first restriction option. Templeton has the ability to
retrieve from many machines, a few machines, or only one machine.
|
Should the host's subtree be restricted [yes|no|/path]: | When restricting to a specific host, you may also specify a restrictive subtree on the host. Templeton will not follow links beyond the specified subtree. Entering "yes" will restrict searches to the subtree specified in the initial URL. For example, http://c.gp.cs.cmu.edu:5103/prog/webster has the initial path "/prog". HTML documents not in the /prog directory would not be retrieved. Entering "no" places no restriction on the path, allowing Templeton to wander over the entire web site. Alternatively, you may specify a path. This is useful when the starting URL is not the top of the directory tree. (Frequently, a web page may not be reachable from a page "above" it. This "lower" page may still be the "root" of the virtual subtree.) |
Enter maximum depth [0 for unlimited]: | This allows you to specify the number of links to follow. '1' will only return the web page specified by the initial URL. '2' will retrieve the initial URL and all links from that page (restrictions permitting). The larger the number, the more levels of indirect links will be retrieved. Entering '0' will not restrict the number of links. If you are unsure of the number of links you will require, you should enter a finite number, such as '3', '5', or '10'. |
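The depth convention described above ('1' is the starting page only, '2' adds the pages it links to, '0' is unlimited) can be illustrated with a small sketch. This is not Templeton's code: it assumes a breadth-first walk, and the function `crawl_order` and the in-memory `site` map are hypothetical stand-ins for real page retrieval.

```python
from collections import deque

def crawl_order(start, links, max_depth):
    """Return pages in the order a breadth-first walk would visit them.

    links is a hypothetical in-memory map of page -> pages it links to.
    max_depth follows the prompt's convention: 1 = start page only,
    0 = unlimited.
    """
    seen = {start}
    queue = deque([(start, 1)])  # the starting URL counts as depth 1
    order = []
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        if max_depth and depth >= max_depth:
            continue  # at the limit: keep the page, ignore its links
        for target in links.get(page, []):
            if target not in seen:  # never queue the same page twice
                seen.add(target)
                queue.append((target, depth + 1))
    return order

# Hypothetical four-page site; D links back to A to show loop handling.
site = {"A": ["B", "C"], "B": ["D"], "C": [], "D": ["A"]}
for limit in (1, 2, 0):
    print(limit, crawl_order("A", site, limit))
```

The `seen` set is what keeps an unlimited depth ('0') from looping forever on sites whose pages link back to each other.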
An example response is:
Enter starting URL: http://www.cs.tamu.edu/people/
Enter local path to store files ["none" for log files only]: /temp
Host restriction [yes|no|host|.domain]: yes
Should the host's subtree be restricted [yes|no|/path]: /people
Enter maximum depth [0 for unlimited]: 3 |
Password required for realm = "Secret_Project"
Enter user name: myusername
Enter password: |
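To make the effect of these answers concrete, the following sketch checks whether a link would pass the host and subtree restrictions from the example response. It is an illustration only: the function `is_allowed` and the sample link URLs are hypothetical, and matching on whole path segments is an assumption about how the subtree test behaves.

```python
from urllib.parse import urlsplit

def is_allowed(url, host, subtree):
    """Return True if url is on `host` and inside the `subtree` path."""
    parts = urlsplit(url)
    if parts.hostname != host:
        return False  # host restriction: a different machine
    prefix = subtree.rstrip("/")
    path = parts.path or "/"
    # Assumption: "/people" matches "/people" and "/people/...",
    # but not a sibling such as "/peoplesoft".
    return path == prefix or path.startswith(prefix + "/")

# Hypothetical links, checked against the example's host and subtree.
for link in (
    "http://www.cs.tamu.edu/people/faculty.html",
    "http://www.cs.tamu.edu/research/",
    "http://www.tamu.edu/people/",
):
    print(link, is_allowed(link, "www.cs.tamu.edu", "/people"))
```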
If you incorrectly enter your user name or password, Templeton will prompt you to enter them again. If you do not know a valid user name or password, then enter a hyphen "-" for both fields. This will skip the protected URL.
A note about security: Your user name and password are not secure. Basic authentication uses a simple encoding scheme (Base64), so simple that many people can actually read the encoded text without a computer! Anyone with a computer between you and the WWW server can view your user name and password and use them. There is no inherent security.
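The weakness described in this note is easy to demonstrate. HTTP Basic authentication sends "user:password" encoded with Base64, which anyone can reverse. The sketch below uses only Python's standard `base64` module; the helper names and the password "secret" are made up for illustration and have nothing to do with Templeton's internals.

```python
import base64

def basic_auth_header(user, password):
    """Build the value of an HTTP 'Authorization: Basic ...' header."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

def decode_basic_auth(header):
    """Recover (user, password) from a Basic header value. No key is needed."""
    _scheme, token = header.split(" ", 1)
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password

header = basic_auth_header("myusername", "secret")  # hypothetical password
print(header)
print(decode_basic_auth(header))
```

Because decoding needs no secret, the encoding only hides the password from a casual glance and gives no protection against anyone who can see the request.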
/temp/mapindex.html | An HTML document showing file links on the remote site. This file can be changed by using the RemoteMapping option. |
/temp/locindex.html | An HTML document showing file links in the local save-path. If files are not retrieved, then this file is not generated. |
/temp/update.cfg | A generated configuration file allowing retrieved HTML documents to be updated when the server has a newer file. This option can be set by the Update-File setting. |
/temp/host.domain/ | Directory of files retrieved from the machine host.domain |
Current Depth: 2 (3 max)
Links at current depth: 7
Total links remaining: 137
Current URL: http://www.cs.tamu.edu/people/
Local file: /temp/www.cs.tamu.edu/people/index.html
IMAGE: Images/logos/csimage_basic.gif
LINK: Images/index.html
LINK: people/index.html
... |
The status also shows the current URL being processed and the name of the local file (when the URL is being mirrored). Under the local file are the type and name of all links that are found.
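The relationship between the current URL and the local file in the status display can be sketched as a simple mapping: save path, then host name, then the URL path. The function `local_path` is hypothetical, and the use of "index.html" for URLs ending in a slash is an assumption inferred from the status example above. How Templeton treats ports or query strings is not covered here.

```python
import posixpath
from urllib.parse import urlsplit

def local_path(url, save_root, default_name="index.html"):
    """Map a URL to a file under save_root/<host>/<path>."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += default_name  # assumption: directory URLs get a default file name
    return posixpath.join(save_root, parts.hostname, path.lstrip("/"))

print(local_path("http://www.cs.tamu.edu/people/", "/temp"))
```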
When all links are processed, the program will end. You may also break out of the program at any time.
?, h or H | List the available commands. |
a or A | Add a URL to be processed. You will be prompted for the URL to add. |
d or D | Change the maximum retrieval depth. |
i or I | Interrupt the current file download. When pressed while "reading" a file from a server, the reading will stop and regular processing will continue. When pressed during the processing of a file, the processing is stopped and the next file is retrieved. This can be very useful when Templeton tries to retrieve an undesirable file that is extremely large (or time-consuming). |
l or L |
List Restrictions.
Templeton supports robot exclusion. Typing 'L' shows all known
exclusion rules. There are 3 types of rules:
|
n or N | Toggle hostname resolution (see DNSLookup) |
s or S | Change the sleep interval. |
v or V | View the list of URLs to process. These are listed in the order that they will be processed, from top to bottom. This list includes images, map files, and documents. |
w or W | Wait (pause) after network traffic completes.
This permits a user with a modem to pause, hang up, dial-in later, and
resume where they left off.
NOTE: Restrictions such as RestrictStopTime and RestrictDuration will be checked after you resume. |
q or Q | Quit Templeton after processing.
This option permits Templeton to finish the current web page download
and process any URLs found on the web page. Then, Templeton generates
a restart file which can be used to continue where you left off at a
later time.
The restart file contains all Templeton settings from the current execution. To resume where you left off, simply run Templeton with the restart file listed on the command line. For example, if the restart file is "/download/restart.cfg", then you would type "tton /download/restart.cfg".
In general, it is best to use the restart file within a few days. Web sites that change frequently may make the restart file obsolete after a short time.
Quitting with 'Q' may not be immediate. If you must exit immediately, use 'X'.
Pressing 'Q' toggles the quit-after-processing option. If you press 'Q' twice, you will disable quit after processing and Templeton will continue (not quit). |
x or X | Exit Templeton.
Stop all network traffic and page processing. This exits without generating
a restart file. (If you want a restart file, use 'Q'.)
Pressing 'X' may not exit immediately. If Templeton is performing a hostname lookup, then Templeton will not exit until the lookup completes. In all other cases, Templeton will exit immediately. |
any other key | Any other key will pause the system. It is not considered "nice" to pause
the system while it is reading from the remote server since you will be
pausing a "live" network connection and taking valuable time from the
remote WWW server. "Live" connections that are paused for extended
durations will be closed by the remote server.
If you wish to pause after reading from the remote server, use 'W'. |