Re: [jgit-dev] Distributed Git Server using JGit

On Wednesday, May 16, 2018 10:12:39 PM Mincong Huang wrote:
> I'm creating a Git server, and I'd like to use JGit as
> implementation. JGit contains a module called
> `org.eclipse.jgit.http.server` which allows to achieve
> this easily via GitServlet[1]. However, I need the Git
> server to be clustered,
> to provide a scalable solution. I've two possible
> solutions, but I want to have your opinions about them.
> 
> Solution 1: N GitServlets + 1 NFS
> Use N Git servlets and share the same network filesystem.
> Each server points the same file system in the network.
> This solution is used by GitLab, see
> https://docs.gitlab.com/ee/administration/high_availability/nfs.html

I am not sure that doc is accurate.  I don't believe git or 
jgit uses the type of "Advisory locking" they are referring 
to, i.e. it does not lock files with fcntl(); instead it uses 
lock files.  Lock files work on most NFS (v2, v3, v4) 
implementations, even without lockd.
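
To make the lock-file idea concrete, here is a minimal sketch in 
plain Java (java.nio only).  It is not jgit's actual code; the 
class and method names are made up for illustration.  The writer 
takes the lock by creating "<name>.lock" with an exclusive 
create, writes the new content there, and then renames it over 
the real file.  Whether the exclusive create is truly atomic on 
a given NFS mount depends on the NFS version and client, which 
this sketch does not check.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of lock-file based locking (no fcntl()).
class LockFileSketch {
    // "<name>.lock" next to the file being protected.
    static Path lockPath(Path target) {
        return target.resolveSibling(target.getFileName() + ".lock");
    }

    // Take the lock by creating the lock file; returns false if
    // someone else already holds it.
    static boolean tryLock(Path target) throws IOException {
        try {
            Files.createFile(lockPath(target)); // fails if it exists
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    // Write the new content into the lock file, then rename it over
    // the target.  The rename both publishes the content and
    // releases the lock.
    static void commit(Path target, String content) throws IOException {
        Path lock = lockPath(target);
        Files.write(lock, content.getBytes(StandardCharsets.UTF_8));
        Files.move(lock, target, StandardCopyOption.ATOMIC_MOVE);
    }

    // Give up the lock without changing the target.
    static void unlock(Path target) throws IOException {
        Files.deleteIfExists(lockPath(target));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lock-sketch");
        Path ref = dir.resolve("master");
        if (tryLock(ref)) {
            commit(ref, "new-value\n");
        }
        System.out.println(new String(Files.readAllBytes(ref),
                StandardCharsets.UTF_8));
    }
}
```

A second writer that calls tryLock() while the first one holds 
the lock simply gets false back and has to retry or fail; no 
lock daemon is involved.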

NFS does have some caching issues though, and I suspect that 
the lookupcache=positive mount option mentioned in that doc 
would help, but I had not heard of it until now.  Other NFS 
options that disable caching will also work.  They do impact 
performance, so most big installations do not use them.  I 
suspect lookupcache=positive would be acceptable 
performance-wise for most installations.  There 
currently are a few paths in jgit which can be improved in 
order to get similar results to that mount option even 
without it.  Some of these have been mentioned on the 
repo/Gerrit mailing list recently, and we are currently 
internally working on some (today even).


> Personally, I'm afraid of concurrent file
> access to Git repository, which leads
> to data corruption. According to this post[2], Git has
> mechanism to protect itself, e.g using index lock. But a
> Git bare repository does not have index, right? I'm
> confused.

Correct, bare repos have no index.  That reference also 
mentions 1) git gc and 2) .keep files.  

#1 For git gc, there are some races, but this is true even 
without NFS.  

I have a change up for review for jgit here to help reduce 
one of these:  https://git.eclipse.org/r/c/122288/2  There 
is a similar race for loose objects that also needs to be 
fixed.  That being said, that race has been around forever, 
and no one has bothered to fix it because it is very rare 
(although I do believe I have evidence of it happening for 
loose objects recently).

#2 Messing with .keeps should not cause corruption issues.  


Many people use Gerrit with NFS for very large 
installations, so using a simplified jgit GitServlet should 
work as well as Gerrit on NFS.

-Martin

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation


