Recommended method to keep an svn mirror in sync per commit?


  • Recommended method to keep an svn mirror in sync per commit?

    Hi, what is the recommended method to keep an svn mirror in sync each time a user commits? I have about 20 users committing throughout the day, and I would like the svn mirror to replicate each commit from the source repo as it happens. I've tried putting this line in the source repo's post-commit file: svnsync sync file:///mnt/svn_mirror/ (where svn_mirror is the target repo), but I end up getting this error:
    svnsync sync file:///mnt/svn_mirror/
    Transmitting file data .....svnsync: E160004: Corrupt representation '366133 106082 55 1620 4bd1fbfd9712b0660edfb5f56e9ef672'
    svnsync: E160004: Malformed representation header

    If I run the svnsync sync file:///mnt/svn_mirror/ manually it works fine and if I run it on a less frequently used repo it works fine, but I'm guessing that when there are more users committing at a time or commits are taking a longer period of time it may cause an issue? Could it also be a date/time issue where if the date of the last synced revision on the target is too far behind the source, the next svnsync will fail?

    Both the source repo and the mirror target repo are on the same server.

    Other recommended methods to accomplish this? Maybe svn notify?

    My environment is:
    - svn 1.11.0
    - httpd-2.4.6
    - CentOS 7 64bit

  • #2
    In general, running svnsync should not produce a "Corrupt representation" error message.

    I wonder if:
    1. you might have more than 1 version of Subversion installed on your server and the manual run is getting a different version?
    2. there's something subtly wrong in the environment of the post-commit hook that is causing the error.

    It's more likely #2 than #1 (of course, #1 is a subset of #2: the PATH environment is different due to inheritance from Apache, SSHD, etc.).

    What I like to do to debug #2 is to output the environment from the post-commit hook into a file and then compare that file with what you have in your interactive shell.
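
    A minimal sketch of that environment dump, assuming a POSIX shell. The helper name dump_env and the output path are hypothetical, not something from this thread. Subversion normally runs hook scripts with a nearly empty environment, so the hook's dump is likely to be much shorter than the one from your login shell.

    ```shell
    #!/bin/sh
    # Where to write the dump (assumed path; use any location the hook's user can write to).
    ENV_DUMP="${ENV_DUMP:-/tmp/post-commit-env.txt}"

    # Hypothetical helper: write the current environment, sorted by name, to the given file.
    dump_env() {
        env | sort > "$1"
    }

    dump_env "$ENV_DUMP"
    ```

    Put the same lines at the top of the post-commit hook, make one commit, then run env | sort into a second file from your interactive shell and diff the two files. Differences in PATH are the first thing to look at.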


    • #3
      I also have svnserve running periodically on demand when clients need to connect using svn+ssh for transferring a lot of files. Would this cause the corruption problem? I'll try your suggestion on #2 as well.


      • #4
        Shouldn't cause any trouble. However, one thing that has come up on the "svndev" mailing list is that you need to coordinate the running of the svnsync command so that there aren't 2 of them running at the same time. This *can* happen if it is simply run out of the post-commit hook, because there's no coordination between those runs, and if 2 commits come in very quickly then the simplistic approach would start 2 of them. Ick.

        My work-around for this on Linux would be a Perl script that did "advisory locking" (e.g. "flock", et al.) around the run of svnsync as a sub-process. Not sure how to do that on Windows.
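
        A minimal sketch of the same idea in shell rather than Perl, assuming the flock(1) utility from util-linux is installed (it usually is on CentOS 7). The wrapper name run_locked and the lock file path are hypothetical.

        ```shell
        #!/bin/sh
        # Lock file location (assumed path; any file writable by the hook's user works).
        LOCKFILE="${LOCKFILE:-/tmp/svnsync-mirror.lock}"

        # Hypothetical wrapper: run a command while holding an exclusive advisory lock.
        # flock(1) creates the lock file if needed, waits if another process holds
        # the lock, and releases it when the command exits.
        run_locked() {
            flock "$LOCKFILE" "$@"
        }

        # Stand-in command for the demo.
        run_locked echo "sync would run here"
        ```

        In the hook itself the last line would become run_locked svnsync sync file:///mnt/svn_mirror/ so that only one svnsync touches the mirror at a time.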


        • #5
          What happens if two commits run at the same time and there is a lock on the file that runs the svnsync? Will the second post-commit svnsync not run at all and produce an error or will it wait till the lock is removed?

          Also, would an [B]&[/B] after the call to the script, or after the svnsync sync file:///mnt/svn_mirror/ line, do any good if there is a lock being waited on? See code below.

          Also, I read that for the post-commit hook to complete without waiting for background jobs to finish, the output has to be redirected to /dev/null, e.g. >/dev/null 2>/dev/null &


          [CODE]#post-commit file

          # file
          # lock code copied from web

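          #!/usr/bin/env bash

          # PID file location (assumed path; not given in the original post)
          PIDFILE=/tmp/svnsync.pid

          # script running?
          [[ -s "$PIDFILE" ]] && exit

          # no, create pidfile
          echo "$BASHPID" > "$PIDFILE"

          # .. run svnsync
          svnsync sync file:///mnt/svn_mirror/

          # delete pidfile
          rm -f "$PIDFILE"[/CODE]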
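
          A minimal sketch of the backgrounding part of the question, assuming a POSIX shell. The helper name run_detached is hypothetical. As far as I understand it, Subversion waits until the hook's output streams are closed, which is why a bare & is usually not enough and the redirects are needed as well.

          ```shell
          #!/bin/sh
          # Hypothetical helper: start a command in the background with stdin,
          # stdout and stderr detached from the caller, so the caller does not
          # wait for it to finish.
          run_detached() {
              "$@" </dev/null >/dev/null 2>&1 &
          }

          # Demo with a stand-in command; in the hook this would be the svnsync call.
          run_detached sleep 2
          echo "hook returns immediately"
          ```

          The trade-off is that any error output from the background job is discarded, so redirecting to a log file instead of /dev/null may be more useful while debugging.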


          • #6
            Use of "PIDFILE" as a locking mechanism is deeply problematic as there is an inherent race condition: two processes can both check at the same time, both see no PIDFILE, and then both write the PIDFILE (the 2nd overwriting the 1st). I strongly discourage use of the "file existence locking heuristic" as it is quite literally impossible to properly and completely implement. It worked fine when computers were SLOW but now, well, it's crap.

            As for your original question, if you do the locking correctly (advisory locking on Linux or strict locking on Windows) then both will run, serially, one after the other. The 1st will likely take care of both commits, but just in case, the 2nd will make sure of it.
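
            The serial behaviour can be sketched as follows, assuming flock(1) from util-linux and a POSIX shell. The function name locked_sync and both file paths are hypothetical, and sleep stands in for the real svnsync run.

            ```shell
            #!/bin/sh
            # Assumed paths for the demo.
            LOCKFILE="${LOCKFILE:-/tmp/svnsync-demo.lock}"
            LOG="${LOG:-/tmp/svnsync-demo.log}"
            : > "$LOG"   # start with an empty log

            # Hypothetical stand-in for one post-commit run.
            locked_sync() {
                (
                    flock 9                     # block until we hold the exclusive lock on fd 9
                    echo "start $1" >> "$LOG"
                    sleep 1                     # stand-in for: svnsync sync file:///mnt/svn_mirror/
                    echo "end $1" >> "$LOG"
                ) 9>"$LOCKFILE"                 # lock is released when the subshell exits
            }

            # Two "commits" arriving at almost the same time.
            locked_sync A &
            locked_sync B &
            wait
            cat "$LOG"
            ```

            Whichever run gets the lock first finishes completely before the other starts, so each "start" line in the log is followed by its own "end" line. The second run waits rather than failing. With flock -n it would instead give up immediately with a non-zero exit status.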