sitecopy - maintain remote copies of web sites





SYNOPSIS


       sitecopy [options] [operation mode] sitename ...





DESCRIPTION


       sitecopy is for copying locally stored web sites to remote

       web servers.  A single command will upload  files  to  the

       server  which  have changed locally, and delete files from

       the server which have been removed locally,  to  keep  the

       remote  site synchronized with the local site.  The aim is

       to remove the hassle of uploading and deleting  individual

       files  using an FTP client.  sitecopy will also optionally

       try  to  spot  files  you  move  locally,  and  move  them

       remotely.



       FTP,  WebDAV  and  other HTTP-based authoring servers (for

       instance, AOLserver  and  Netscape  Enterprise)  are  sup-

       ported.





GETTING STARTED


       This  section  covers  how to start maintaining a web site

       using sitecopy.  After introducing the basics, two  situa-

       tions  are covered: first, where you already have a remote

       copy of the site; second, where you don't.  Lastly, normal

       site maintenance activities are explained.



   Introducing the Basics

       If  you  have  not  already done so, you need to create an

       rcfile, which will store information about the  sites  you

       wish  to  administer.  You  also  need to create a storage

       directory, which sitecopy uses to record the state of  the

       files  on each of the remote sites. The rcfile and storage

       directory must both be accessible only by you  -  sitecopy

       will  not  run otherwise.  To create the storage directory

       with the correct permissions, use the command

            mkdir -m 700 .sitecopy

       from your home directory. To create the  rcfile,  use  the

       commands

            touch .sitecopyrc

            chmod 600 .sitecopyrc

       from  your  home  directory.  Once  this is done, edit the

       rcfile to enter your site details as shown in the CONFIGU-

       RATION section.
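
       The setup steps above can be sketched as a single script. This
       is a minimal sketch and not part of sitecopy: DEMO_HOME is a
       hypothetical variable standing in for your home directory, and
       it defaults to a scratch directory so that nothing in the real
       home directory is touched.

```shell
#!/bin/sh
# Sketch: create the storage directory and rcfile with owner-only access.
# DEMO_HOME is a hypothetical stand-in for $HOME; it defaults to a scratch
# directory so the real home directory is left alone.
DEMO_HOME="${DEMO_HOME:-$(mktemp -d)}"

# Storage directory: read/write/search for the owner only.
[ -d "$DEMO_HOME/.sitecopy" ] || mkdir -m 700 "$DEMO_HOME/.sitecopy"
chmod 700 "$DEMO_HOME/.sitecopy"

# rcfile: read/write for the owner only.
touch "$DEMO_HOME/.sitecopyrc"
chmod 600 "$DEMO_HOME/.sitecopyrc"

# Show the resulting permission strings for a quick visual check.
dir_mode=$(ls -ld "$DEMO_HOME/.sitecopy" | cut -c1-10)
rc_mode=$(ls -l "$DEMO_HOME/.sitecopyrc" | cut -c1-10)
echo "storage directory: $dir_mode"
echo "rcfile:            $rc_mode"
```

       To apply it for real, set DEMO_HOME to your home directory
       before running the script.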



   Existing Remote Site

       If  you already have a local copy of the site, ensure your

       local files are synchronized with the remote files.  Then,

       run

            sitecopy --catchup sitename

       where  sitename is the name of the site you used after the

       site keyword in the rcfile.



       If you do not have a local copy of the remote  site,  then

       you  can  use fetch mode to discover what is on the remote

       site, and synchronize mode  to  download  it.  Fetch  mode

       works  well  for  WebDAV servers, and might work if you're

       lucky for FTP servers. Run

            sitecopy --fetch sitename

       to fetch the site - if this succeeds, then run

            sitecopy --synchronize sitename

       to download a local copy.
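
       The two steps can be chained so that the download runs only
       when the fetch succeeded. This is a minimal sketch: the function
       name fetch_then_sync and the SITECOPY override variable are
       made up for illustration, and it assumes that sitecopy exits
       with a non-zero status when fetch mode fails (the RETURN VALUES
       section documents only update and list modes).

```shell
#!/bin/sh
# SITECOPY is a hypothetical override so the sketch can be tried with a stub.
SITECOPY="${SITECOPY:-sitecopy}"

# Fetch the remote file list, then download only if the fetch succeeded.
# Assumes a failed fetch is reported through a non-zero exit status.
fetch_then_sync() {
    site="$1"
    if "$SITECOPY" --fetch "$site"; then
        # WARNING: synchronize mode overwrites local files.
        "$SITECOPY" --synchronize "$site"
    else
        echo "fetch failed for $site; not synchronizing" >&2
        return 1
    fi
}
```

       A call such as "fetch_then_sync sitename" then replaces the two
       separate commands.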



   New Remote Site

       Ensure that the root directory of the site has  been  cre-

       ated on the server by the server administrator. Run

            sitecopy --initialize sitename

       where  sitename is the name of the site you used after the

       site keyword in the rcfile.



   Site Maintenance

       After setting up the site as given in one of the two above

       sections,  you  can  now start editing your local files as

       normal. When you have finished a set of changes,  and  you

       want to update the remote copy of the site, run:

            sitecopy --update sitename

       and  all the changed files will be uploaded to the server.

       Any files you delete locally will be deleted remotely too,

       unless  the nodelete option is specified in the rcfile. If

       you move any files between directories, the  remote  files

       will be deleted from the server then uploaded again unless

       you specify the checkmoved option in the rcfile.



       At any time, if you wish to see what changes you have made

       to the local site since the last update, you can run

            sitecopy sitename

       which will display the list of differences.



   Synchronization Problems

       In  some circumstances, the actual files which make up the

       remote site will be different from what sitecopy thinks is

       on  the remote site. This can happen, for instance, if the

       connection to the server is broken during an update.  When

       this  situation arises, Fetch Mode should be used to fetch

       the list of files making  up  the  site  from  the  remote

       server.





INVOCATION


       In normal operation, specify a single operation mode,

       followed by any options, then one or more site names.  For

       example,

            sitecopy --update --quiet mainsite anothersite

       will quietly update the sites named 'mainsite' and 'anoth-

       ersite'.





OPERATION MODES


       -l, --list

              List Mode - produces a listing of all  the  differ-

              ences  between  the local files and the remote copy

              for the specified sites.



       -ll, --flatlist

              Flat list Mode - like list mode, except the  output

              produced  is  suitable  for  parsing by an external

              script or program. An AWK script,  changes.awk,  is

              provided  which  produces  an  HTML  page from this

              mode.



       -u, --update

              Update Mode - updates the remote copy of the speci-

              fied sites.



       -f, --fetch

              Fetch  Mode  -  fetches  the list of files from the

              remote server.  Note that this mode has  only  lim-

              ited  support  in  FTP - the server must accept the

              MDTM command, and use a Unix-style  'ls'  for  LIST

              implementation.



       -s, --synchronize

              Synchronize  Mode - updates the local site from the

              remote copy.  WARNING: This mode  overwrites  local

              files. Use with care.



       -i, --initialize

              Initialization  Mode - initializes the sites speci-

              fied - making sitecopy think there are NO files  on

              the remote server.



       -c, --catchup

              Catchup  Mode - makes sitecopy think the local site

              is exactly the same as the remote copy.



       -v, --view

              View Mode - displays all the site definitions  from

              the rcfile.



       -h, --help

              Display help information.



       -V, --version

              Display version information.





OPTIONS



       -y, --prompting

              Applicable  in  Update  Mode  only, will prompt the

              user for confirmation for each update (i.e., creat-

              ing a directory, uploading a file etc.).



       -r RCFILE, --rcfile=RCFILE

              Specify an alternate run control file location.



       -p PATH, --storepath=PATH

              Specify an alternate location to use for the remote

              site storage directory.



       -q, --quiet

              Quiet output - display the filename only  for  each

              update performed.



       -qq, --silent

              Very quiet output - display nothing for each update

              performed.



       -o, --show-progress

              Applicable  in  Update  Mode  only,  displays   the

              progress (percentage complete) of data transfer.



       -a, --allsites

              Perform the given operation on all sites - applica-

              ble for all modes except View Mode,  for  which  it

              has no effect.



       -d MASK, --debug=MASK

              Turns  on  debugging.  The  integer  MASK specifies

              which functions debugging is produced for,  and  is

              the sum of any of the following:

                   1    Socket handling

                   2    File handling

                   4    rcfile parser

                   8    HTTP driver

                   16   FTP driver

                   32   XML parser

                   64   GNOME interface

                   128  HTTP authentication
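
       As a worked example of building MASK, the sketch below adds up
       the values for the HTTP driver, the XML parser and HTTP
       authentication. The variable names are illustrative only and
       are not defined by sitecopy; the script prints the resulting
       command instead of running it.

```shell
#!/bin/sh
# Channel values copied from the list above; the variable names are
# illustrative only and are not defined by sitecopy.
DEBUG_SOCKET=1
DEBUG_FILES=2
DEBUG_RCFILE=4
DEBUG_HTTP=8
DEBUG_FTP=16
DEBUG_XML=32
DEBUG_GNOME=64
DEBUG_HTTPAUTH=128

# Each value is a distinct power of two, so a sum selects channels
# independently of one another.
mask=$((DEBUG_HTTP + DEBUG_XML + DEBUG_HTTPAUTH))

# Print the command rather than running it.
echo "sitecopy --debug=$mask --update sitename"
```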





CONCEPTS


       The stored state of a site is the snapshot of the state of

       the site saved into the storage directory  (~/.sitecopy/).

       The  storage  file  is  used  to record this state between

       invocations. In update mode, sitecopy builds  up  a  files

       list  for each site by scanning the local directory, read-

       ing in the stored state, and comparing the two - determin-

       ing which files have changed, which have moved, and so on.



       Configuration  is  performed  via  the  run  control  file

       (rcfile).   This  file contains a set of site definitions.

       A unique name is assigned to every site definition,  which

       is used on the command line to refer to the site.



       Each  site  definition  contains the details of the server

       the site is stored on, how the site  may  be  accessed  at

       that  server, where the site is held locally and remotely,

       and any other options for the site.



   Site Definition

       A site definition is made up of a series of lines:



       site sitename

          server server-name

        [ port port-number ]

        [ proxy-server proxy-name

          proxy-port port-number ]

        [ url siteURL ]

        [ protocol { ftp | http } ]

        [ ftp nopasv ]

        [ ftp showquit ]

        [ http expect ]

        [ safe ]

        [ state { checksum | timesize } ]

          username username

          password password

          remote remote-root-directory

          local local-root-directory

        [ permissions { ignore | exec | all } ]

        [ symlinks { ignore | follow | maintain } ]

        [ nodelete ]

        [ nooverwrite ]

        [ checkmoved[renames] ]

        [ tempupload ]

        [ exclude pattern ]...

        [ ignore pattern ]...

        [ ascii pattern ]...



       Anything after a hash (#) in a line is ignored as  a  com-

       ment.   Values  may  be quoted and characters may be back-

       slash-escaped.  For example, to use  the  exclude  pattern

       *#, use the following line:

            exclude "*#"



   Remote Server Options

       The  server  key  is used to specify the remote server the

       site is stored on.  This may be either a DNS  name  or  IP

       address.  A connection is made to the default port for the

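       protocol used, or that given by the  port  key.   sitecopy

       supports  the  WebDAV  and FTP protocols; the protocol key

       selects which to use, taking  the  value  of  either  http

       or ftp respectively. By default, FTP will be used.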



       The  proxy-server and proxy-port keys may be used to spec-

       ify a proxy server to use.  Proxy  servers  are  currently

       only supported for WebDAV.



       If  the  FTP  server does not support passive (PASV) mode,

       then the key ftp nopasv should be used.   To  display  the

       message  returned by the server on closing the connection,

       use the ftp showquit option.



       If the WebDAV server correctly supports  the  100-continue

       expectation,  e.g.  Apache  1.3.9  and later, the key http

       expect should be used. Doing so can  save  some  bandwidth

       and time in an update.



       To authenticate the user with the server, the username and

       password keys are used. If it exists, the ~/.netrc will be

       searched  for  a  password  if  one  is not specified. See

       ftp(1) for the syntax of this file.



       Basic and digest authentication are supported for  WebDAV.

       Note that basic authentication must not be used unless the

       connection is known to be secure.



       The full URL that is used to access the site  can  option-

       ally  be  specified  in  the url key. This is used only in

       flat list mode, so the site URL can be inserted in 'Recent

       Changes'  pages. The URL must not have a trailing slash; a

       valid example is

            url http://www.site.com/mysite



       If the tempupload  option  is  given,  changed  files  are

       uploaded  with  a  ".in." prefix, then moved to the true

       filename when the upload is complete.



   File State

       File state is stored in the storage files (~/.sitecopy/*),

       and is used to discover when a file has been changed.  Two

       methods are supported, and can be selected using the state

       option, with either parameter: timesize (the default), and

       checksum.



       timesize uses the last-modification date and the  size  of

       files  to detect when they have changed.  checksum uses an

       MD5 checksum to detect any changes to the file contents.



       Note that MD5 checksumming involves reading in the  entire

       file,   and   is   slower  than  simply  using  the  last-

       modification date and size. It may be useful, for instance,

       if  a  versioning system is in use which updates the last-

       modification date of a file without changing its contents.
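
       The difference between the two methods can be seen without
       running sitecopy. The sketch below changes only the date of a
       file and then looks at it both ways. Two assumptions are made
       for portability: POSIX cksum stands in for MD5, and a copy made
       with cp -p stands in for the stored state.

```shell
#!/bin/sh
# Sketch: a date-only change looks "changed" to a time/size comparison
# but "unchanged" to a content checksum. cksum (a POSIX CRC) stands in
# for MD5 here purely for portability.
work=$(mktemp -d)
printf 'hello\n' > "$work/page.html"
touch -t 200001010000 "$work/page.html"

# Snapshot that remembers the old modification time.
cp -p "$work/page.html" "$work/snapshot"
before=$(cksum < "$work/page.html")

# Change the date only; the contents stay the same.
touch -t 200101010000 "$work/page.html"
after=$(cksum < "$work/page.html")

# find prints the path only if it is newer than the snapshot.
newer=$(find "$work/page.html" -newer "$work/snapshot")
if [ -n "$newer" ]; then echo "date comparison: changed"; fi
if [ "$before" = "$after" ]; then echo "checksum comparison: unchanged"; fi
rm -rf "$work"
```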



   Safe Mode

       Safe  Mode is enabled by using the safe key. When enabled,

       each time a file is uploaded to the server, the  modifica-

       tion time of the file as on the server is recorded. Subse-

       quently, whenever this file has been changed  locally  and

       is  to be uploaded again, the current modification time of

       the file on the server is retrieved, and compared with the

       stored value. If these differ, then the remote copy of the

       file has been altered by a foreign party.  A warning  mes-

       sage  is  issued, and your local copy of the file will not

       be uploaded over it, to prevent losing any changes.
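
       The decision described above amounts to one comparison. The
       following is a sketch of that rule only, not sitecopy's code:
       the function name is hypothetical, and both times are assumed
       to be given as whole seconds.

```shell
#!/bin/sh
# Sketch of the Safe Mode rule. Arguments (both assumptions, in seconds):
#   $1  modification time recorded after the last upload
#   $2  modification time currently reported by the server
safe_upload_decision() {
    stored_mtime="$1"
    current_mtime="$2"
    if [ "$stored_mtime" -eq "$current_mtime" ]; then
        # Nobody else has touched the remote file: uploading is safe.
        echo "upload"
    else
        # The remote file was altered elsewhere: keep it and warn.
        echo "skip: remote copy changed since last upload"
    fi
}
```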



       Safe Mode can be used with FTP or WebDAV servers,  but  if

       Apache/mod_dav   is  used,  mod_dav  0.9.11  or  later  is

       required.



       Note:  Safe  Mode cannot be used in conjunction  with  the

       nooverwrite option (see below).



   File Storage Locations

       The  remote key specifies the root directory of the remote

       copy of the site.  It may be in the form  of  an  absolute

       pathname, e.g.

            remote /www/mysite/

       For  FTP,  the directory may also be specified relative to

       the login directory, in which case it must be prefixed  by

       "~/", for example:

            remote ~/public_html/



       The local key specifies the directory in which the site is

       stored locally.  This may be given relative to  your  home

       directory  (as  given  by the environment variable $HOME),

       again using the "~/" prefix.

            local ~/html/foosite/

            local /home/fred/html/foosite/

       are equivalent, if $HOME is set to "/home/fred".
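
       The equivalence can be illustrated with a small shell function
       that applies the same rule. The name expand_home is made up for
       this sketch; it mirrors the rule described above and is not how
       sitecopy itself is implemented.

```shell
#!/bin/sh
# Sketch: rewrite a leading "~/" as $HOME/, leave other paths untouched.
expand_home() {
    case "$1" in
        "~/"*)
            # ${1#??} drops the first two characters, i.e. the "~/" prefix.
            printf '%s/%s\n' "$HOME" "${1#??}"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

expand_home '~/html/foosite/'
expand_home '/home/fred/html/foosite/'
```

       With $HOME set to "/home/fred", both calls print the same path.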



       For both the local and remote keywords, a  trailing  slash

       may be used, but is not required.



   File Permissions Handling

       File  permissions  handling is dictated by the permissions

       key, which may be given one of three values:



       ignore to ignore file permissions completely,



       exec   to mirror the permissions of executable files only,



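       all    to mirror the permissions of all files.



       This can be used, for instance, to ensure the  permissions

       of CGI files are set. The option is currently ignored  for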

       WebDAV  servers.  For  FTP  servers,  a chmod is performed

       remotely to set the permissions.



   Symbolic Link Handling

       Symlinks found in the local site can  be  either  ignored,

       followed,  or  maintained.  In  'follow'  mode, the files

       referenced by the symlinks will be uploaded in their place.

       In  'maintain'  mode, the link will be created remotely as

       well, see below for more information. The  mode  used  for

       each site is specified with the symlinks rcfile key, which

       may take the value of ignore, follow or maintain to select

       the mode as appropriate.



       The  default  mode is ignore, i.e. symbolic links found in

       the local site are ignored.



   Symbolic link Maintain Mode

       This mode  is  currently  only  supported  by  the  WebDAV

       driver,  and  will  work only with servers which implement

       WebDAV Advanced Collections, which is a  work-in-progress.

       The  target  of the link on the server is literally copied

       from the target of the symlink. Hint: you can use URLs  if

       you like:

            ln -s "http://www.somewhere.org/" somewherehome



       In  this  way,  a "302 Redirect" can be easily set up from

       the client, without having to alter the server  configura-

       tion.



   Deleting and Moving Remote Files

       The  nodelete  option  may be used to prevent remote files

       from ever being deleted. This may be useful  if  you  keep

       large  amounts  of  data on the remote server which you do

       not need to store locally as well.



       If your server does not allow you to upload changed  files

       over  existing  files,  then  you  can use the nooverwrite

       option. When this is  used,  before  uploading  a  changed

       file, the remote file will be deleted.



       If  the  checkmoved option is used, sitecopy will look for

       any files which have been moved locally. If any are found,

       when  the  remote site is updated, the files will be moved

       remotely.



       If the checkmoved renames option is  used,  sitecopy  will

       look  for  any  files  which  have  been  moved or renamed

       locally. This option may only be used in conjunction  with

       the state checksum option.



       If  you  are  not  using  MD5 checksumming (i.e. the state

       checksum option) to determine file state, do NOT  use  the

       checkmoved  option  if you tend to hold files in different

       directories with identical sizes, modification  times  and

       names  and  ever move them about. This seems unlikely, but

       don't say you haven't been warned.



   Excluding Files

       Files may be excluded from the files list by  use  of  the

       exclude  key, which accepts shell-style globbing patterns.

       For example, use

            exclude *.bak

            exclude *~

            exclude "#*#"

       to exclude all files which have a .bak extension, end in a

       tilde (~) character, or which begin and end with  a  hash.

       Don't forget to quote or escape the value if it includes a

       hash!



       To  exclude  certain  files within a particular directory,

       simply prefix  the  pattern  with  the  directory  name  -

       including a leading slash. For instance:

            exclude /docs/*.m4

            exclude /files/*.gz

       which will exclude all files with the .m4 extension in the

       'docs' subdirectory of the site, and all  files  with  the

       .gz extension in the files subdirectory.



       An  entire directory can also be excluded - simply use the

       directory name with no trailing slash. For example

            exclude /foo/bar

            exclude /where/else

       to exclude the 'foo/bar' and  'where/else'  subdirectories

       of the site.



       Exclude  patterns  are  consulted  when scanning the local

       directory, and when performing a "fetch".  In both  cases,

       any file which matches any exclude pattern is not added to

       the files list.
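
       To get a feel for which paths a pattern selects, shell case
       matching can be used as a rough stand-in. This is only an
       approximation: sitecopy's own matcher may behave differently
       (for example, in whether * matches across a slash), whole-
       directory exclusion is not modelled, and the rule that a pattern
       without a leading slash is compared against the file name alone
       is an assumption. The function name matches_exclude is
       hypothetical.

```shell
#!/bin/sh
# Approximate check: does a site-relative path match an exclude pattern?
# Assumption: a pattern starting with "/" is compared with the whole
# path, any other pattern with the file name only.
matches_exclude() {
    pattern="$1"
    path="$2"
    case "$pattern" in
        /*) subject="$path" ;;
        *)  subject="${path##*/}" ;;   # strip everything up to the last "/"
    esac
    # $pattern is left unquoted on purpose so that case treats it as a glob.
    case "$subject" in
        $pattern) return 0 ;;
        *)        return 1 ;;
    esac
}

# Try a few of the patterns used in this section.
for p in '/docs/old.bak' '/index.html~' '/index.html'; do
    if matches_exclude '*.bak' "$p" || matches_exclude '*~' "$p"; then
        echo "excluded: $p"
    else
        echo "kept:     $p"
    fi
done
```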



   Ignoring Local Changes to Files

       The ignore option is used to instruct sitecopy  to  ignore

       any  local  changes made to a file. If a change is made to

       the contents of an ignored file, this  file  will  not  be

       uploaded  by  update  mode. Ignored files will be created,

       moved and deleted as normal.



       The ignore option is used in the same way as  the  exclude

       option.

       ignored files.



   FTP Transfer Mode

       To specify the FTP transfer mode for files, use the  ascii

       key. Any files which are transferred using ASCII mode have

       CRLF/LF translation performed appropriately. For  example,

       use

            ascii *.pl

       to  upload all files with the .pl extension as ASCII text.

       This key has no effect with WebDAV (currently).





RETURN VALUES


       Return values are specified for different operation modes.

       If  multiple  sites are specified on the command line, the

       return value is in respect to the last site given.



   Update Mode

        -1 ... update never even started - configuration problem

         0 ... update was entirely successful.

         1 ... update went wrong somewhere

         2 ... could not connect or login to server



   List Mode

        -1 ... could not form list - configuration problem

         0 ... the remote site does not need updating

         1 ... the remote site needs updating
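
       These return values make it possible to script an update that
       runs only when there is something to upload. The sketch below
       is one way to do that; the function name update_if_needed and
       the SITECOPY override variable are hypothetical. Note that a
       shell sees an exit status as a number from 0 to 255, so a
       return value of -1 normally appears as 255.

```shell
#!/bin/sh
# SITECOPY is a hypothetical override so the sketch can be tried with a stub.
SITECOPY="${SITECOPY:-sitecopy}"

# Run list mode first and update only if it reports pending changes.
update_if_needed() {
    site="$1"
    status=0
    "$SITECOPY" --list "$site" >/dev/null || status=$?
    case "$status" in
        0)  echo "$site: up to date" ;;
        1)  "$SITECOPY" --update "$site" ;;
        *)  # Typically 255, i.e. -1: list mode could not form the list.
            echo "$site: list failed (status $status)" >&2
            return "$status"
            ;;
    esac
}
```

       When an update is performed, the function returns the status
       of update mode, so the caller can still tell a failed update
       (1) from a failed connection or login (2).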





EXAMPLE RCFILE CONTENTS


   FTP Server, Simple Usage

       Fred's site is uploaded to the FTP server  'my.server.com'

       and  held  in the directory 'public_html', which is in the

       login directory. The site is stored locally in the  direc-

       tory /home/fred/html.



       site mysite

         server my.server.com

         url http://www.server.com/fred

         username fred

         password juniper

         local /home/fred/html/

         remote ~/public_html/



   FTP Server, Complex Usage

       Here,   Freda's   site  is  uploaded  to  the  FTP  server

       'ftp.elsewhere.com', where it is  held  in  the  directory

       /www/freda/.  The local site is stored  in  the  directory

       /home/freda/sites/elsewhere/.

         server ftp.elsewhere.com

         username freda

         password blahblahblah

         local /home/freda/sites/elsewhere/

         remote /www/freda/

         # Freda wants files with a .bak extension or a

         # trailing ~ to be excluded:

         exclude *.bak

         exclude *~



   WebDAV Server, Simple Usage

       This example shows use of a WebDAV server.



       site supersite

         server dav.wow.com

         protocol http

         username pow

         password zap

         local /home/joe/www/super/

         remote /





FILES


       ~/.sitecopyrc Default run control file location.

       ~/.sitecopy/ Remote site information storage directory

       ~/.netrc Remote server accounts information





BUGS


       Known problems: Fetch + synch  modes  for  FTP  have  been

       tested about twice - they WILL delete all your files.



       Please  send  bug  reports  and feature requests to <site-

       copy@lyra.org> rather than to the author, since the  mail-

       ing list is archived and can be a useful resource for oth-

       ers.





SEE ALSO


       rsync(1), ftp(1), mirror(1)





STANDARDS


       [Listed for reference only, no claim of compliance to  any

       of the below standards is made.]



       RFC 959 - File Transfer Protocol (FTP)

       RFC 1521 - Multipurpose Internet Mail Extensions Part One

       RFC 1945 - Hypertext Transfer Protocol -- HTTP/1.0

       RFC 2396 - Uniform Resource Identifiers: Generic Syntax

       RFC  2518  -  HTTP Extensions for Distributed Authoring --

       WEBDAV

       RFC 2617 - HTTP Authentication

       REC-XML - Extensible Markup Language (XML) 1.0

       REC-XML-NAMES - Namespaces in XML





DRAFT STANDARDS


       draft-ietf-ftpext-mlst-05.txt - Extensions to FTP

       draft-ietf-webdav-collections-protocol-03.txt   -   WebDAV

       Advanced Collections Protocol





AUTHOR


       Joe Orton <joe@orton.demon.co.uk> and others.







Author: siebert <webmaster@SteffenSiebert.de> Last Update: 2000/09/05 22:20:09
URL: http://www.SteffenSiebert.de/ports/sitecopy_doc.html