Perils of repository access via "file:///"

  • Perils of repository access via "file:///"

    I have read the ominous warnings about users directly accessing a repository via "file:///" - but I still suspect it's the right way for me to work.

    I am creating an SVN repository on a server system. Remote users request server operations through an inetd service. Since the inetd service completely controls how the originating root process becomes a process running as the user, there is no opportunity for a user's initialization script settings (umask, PATH, etc.) to take effect. Ordinary users are not permitted to log in directly to the server where the repository lives and inetd runs.

    For my situation, using any of the ordinary SVN server connection mechanisms seems pointless/redundant.

    My concerns are these:
    * When I first experimented with SVN and "file:///" access, heavy tests with multiple simultaneous operations produced SVN internal errors that didn't crop up when I ran the same tests sequentially. Are there any race conditions associated with accessing an SVN repository via "file:///" that are not present with more conventional server connections?

    * More generally, besides carefully setting up the process that reads/writes the SVN repository, are there other risks/drawbacks of directly accessing a repository with "file:///"?


  • #2
    Is your server environment Linux or Windows?

    Which release did you do the parallel testing with?

    On a Windows box I can easily imagine locking issues during parallel updates via file:/// that would not occur on Linux.

    On both Linux and Windows "process context" will directly impact the code execution for file:///. By "process context" I mean environment variables (including PATH), open file descriptors, umask(), OS configured restrictions, etc. You might be surprised at how much damage even the wrong PATH or LANG environment variable can do, never mind something like LD_LIBRARY_PATH or LD_PRELOAD - or perhaps you're completely familiar with them.

    Apart from the process context issues, as far as I understand things, there should be no special risks/drawbacks to using file:/// compared with any other mechanism. If your testing shows reproducible errors in a carefully neutral process context, I would encourage you to file bug reports with the Apache Subversion folks.
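To make "carefully neutral process context" concrete, here is a minimal Python sketch. It assumes a Linux host; the PATH value, the umask of 0o002, and the helper names `neutral_env` and `run_neutral` are illustrative choices, not anything Subversion prescribes. The idea is to build the child's environment from scratch instead of inheriting it, so variables such as LD_PRELOAD or LD_LIBRARY_PATH never reach the child.

```python
import os
import subprocess


def neutral_env(path="/usr/bin:/bin", lang="C"):
    """Return a minimal environment built from scratch (nothing inherited)."""
    return {"PATH": path, "LANG": lang, "LC_ALL": lang}


def run_neutral(argv, umask=0o002):
    """Run argv (a list, so no shell is involved) in a neutral context.

    The umask is changed for the whole calling process while the child
    starts, then restored; this is not safe if other threads create files.
    """
    old_umask = os.umask(umask)
    try:
        return subprocess.run(
            argv,
            env=neutral_env(),          # replaces the inherited environment
            stdin=subprocess.DEVNULL,   # no inherited stdin
            close_fds=True,             # no stray inherited file descriptors
            capture_output=True,
            text=True,
        )
    finally:
        os.umask(old_umask)
```

A call such as `run_neutral(["svn", "info", url])` would then see only the three variables above, whatever the parent process inherited.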


    • #3
      Server system is Linux, SVN version 1.9.1.

      I certainly understand the perils of unexpected process context. My code handles turning the initial root process into the user, which consists only of setting up the user's collection of groups (as well as a repo access group) and then setting their UID. My script is written in a recent version of Python and starts either "svn" or "svnmucc" using subprocess.Popen() with the desired arguments provided as a list. So I think I'm starting with a clean fork/exec, with no chance of a shell process and its initialization creeping into that sequence.
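For readers following along, the sequence described above might look roughly like the sketch below. It is not the poster's actual script. It assumes Python 3.9 or later, where subprocess.Popen accepts `user`, `group`, `extra_groups` and `umask`; the helper names, the numeric IDs, and the repository path are hypothetical. Python applies the group changes before the UID change, which is the order that works when starting as root.

```python
import subprocess
from pathlib import PurePosixPath


def launch_kwargs(uid, gid, user_gids, repo_gid, umask=0o002):
    """Build Popen keyword arguments that switch a root process to a user.

    user_gids are the user's supplementary groups; repo_gid is the shared
    repository access group added on top of them.
    """
    return {
        "user": uid,
        "group": gid,
        "extra_groups": sorted(set(user_gids) | {repo_gid}),
        "umask": umask,
        "stdin": subprocess.DEVNULL,
        "close_fds": True,
    }


def svn_argv(subcommand, repo_path, *extra):
    """Build an argument list (never a shell string) for a file:/// URL."""
    url = PurePosixPath(repo_path).as_uri()  # raises ValueError if not absolute
    return ["svn", subcommand, "--non-interactive", *extra, url]


def start_svn(subcommand, repo_path, uid, gid, user_gids, repo_gid):
    """Start svn as the target user; must be called from a root process."""
    return subprocess.Popen(
        svn_argv(subcommand, repo_path),
        **launch_kwargs(uid, gid, user_gids, repo_gid),
    )
```

Because the arguments are passed as a list and `shell` is left at its default of False, no shell and no shell startup files are involved.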

      My early problems were with code that I don't completely trust at this point, so I can well believe that the errors were my own inadvertent creations. Still, I wondered if anyone was aware of any kind of repo access race problems, because changing to "svn+ssh" seemed to "fix" the problems I had via "file:///". But as I've come to better understand "svn+ssh", it sounds like it wouldn't be apt to hide a race condition, were such a thing present in "file:///" access. I was running with "svn+ssh" for a while, but realized that I can't deploy that way since my users aren't allowed to "ssh" to the system in question (even if they're running in a process that's "already there").

      Am I correct in understanding that the only real reason for AVOIDING "file:///" is precisely how easy it is for someone to carelessly introduce user-specific environment variables or umask settings into the process setup?

      Thanks for the quick response!
      Last edited by jrm03063; 04-23-2018, 07:55 PM.


      • #4
        Ah, you didn't mention having multiple accounts using the same repository. That's a different story. Umask plus differences in primary group (and group-sets) will eventually get you into a place where you'll have to "chown -R" the repo or some folks won't be able to update/checkout/checkin. This isn't a concurrency issue at all. And even if you've got a nice tool for making sure that all of that is standard, if they have access to the repo via the local FS and you've taught them to use file:/// then they're going to bypass your app eventually anyway.

        Note: Apache/SVN works extremely well out of the box and is pretty easy to configure (including path based AuthZ if you want it).


        • #5
          Well, I do control the procedure that turns the inetd-spawned root process into the user. So I know there aren't going to be umask differences. Presumably I would also set up the repository in a directory tree that's "g+rws", with group ownership that's shared and certain.

          But that said, my boss has a preference for a "real server" of some kind or other. Also, I'm pretty sure I'll eventually want to allow something more like a vanilla SVN server interaction, and that's going to have to be either svnserve or Apache. My immediate concern with "real servers" is that it means some manner of a real password, which I don't presently need to worry about.
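As a rough illustration of the "g+rws" setup mentioned above, the following Python sketch walks a repository tree, assigns one group, makes directories group-accessible with the setgid bit (so new entries inherit the group), and makes files group-readable and group-writable. The function name `share_tree` and the choice to skip symlinks are assumptions made for this example.

```python
import os
import stat


def share_tree(root, gid):
    """Give every directory g+rwxs and every regular file g+rw under root."""
    dir_bits = stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP | stat.S_ISGID
    file_bits = stat.S_IRGRP | stat.S_IWGRP
    for dirpath, _dirnames, filenames in os.walk(root):
        os.chown(dirpath, -1, gid)  # -1 leaves the owner unchanged
        mode = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, mode | dir_bits)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not follow symlinks out of the tree
            os.chown(path, -1, gid)
            mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, mode | file_bits)
```

The setgid bit only keeps group ownership consistent for newly created entries. Group write permission on new files still depends on every writer running with a umask such as 002.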


          • #6
            LDAP (including Active Directory as an LDAP server) is relatively easy to incorporate into an Apache setup and completely eliminates any password governance concerns.