Hi,
I'm creating a Git server, and I'd like to use JGit as the implementation. JGit
contains a module called `org.eclipse.jgit.http.server`, which makes this easy
via GitServlet[1]. However, I need the Git server to be clustered so that it
can scale. I have two possible solutions in mind, and I'd like your opinions
on them.
Solution 1: N GitServlets + 1 NFS
Use N Git servlets that share the same network filesystem: each server
points to the same filesystem on the network. This solution is used by GitLab.
Personally, I'm afraid that concurrent file access to a Git repository could
lead to data corruption. According to this post[2], Git has mechanisms to
protect itself, e.g. the index lock. But a bare Git repository does not have
an index, right? I'm confused.
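To make the concern concrete: as far as I understand, the protection described in [2] is a lock-file pattern (create `X.lock` exclusively, write the new content, then rename it onto `X`). Below is a minimal sketch of that pattern using only `java.nio.file`. It is not JGit's or Git's actual code. The class `LockFileSketch` and the method `updateWithLock` are names I made up for illustration, and the sketch assumes a filesystem where exclusive file creation and rename are atomic, which I'm not sure holds for every NFS setup.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of the "X.lock, then rename" pattern.
class LockFileSketch {

    /**
     * Tries to replace the content of target by going through target + ".lock".
     * Returns false if another writer currently holds the lock file.
     */
    public static boolean updateWithLock(Path target, String newContent) throws IOException {
        Path lock = target.resolveSibling(target.getFileName() + ".lock");
        try {
            // createFile fails if the file already exists, so only one writer wins.
            Files.createFile(lock);
        } catch (FileAlreadyExistsException e) {
            return false; // somebody else is updating this file right now
        }
        try {
            Files.write(lock, newContent.getBytes(StandardCharsets.UTF_8));
            // Publish the new content by renaming the lock file onto the target.
            // Assumption: POSIX-like rename semantics (an existing target is replaced).
            Files.move(lock, target, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (IOException e) {
            // Release the lock only on failure; after a successful move it no longer exists.
            Files.deleteIfExists(lock);
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lock-sketch");
        Path ref = dir.resolve("master");
        Path lock = dir.resolve("master.lock");

        Files.createFile(lock); // simulate a second server holding the lock
        System.out.println("with lock held: " + updateWithLock(ref, "first\n"));

        Files.delete(lock);     // the other server is done
        System.out.println("after release: " + updateWithLock(ref, "second\n"));
    }
}
```

If this is indeed how updates are guarded in a bare repository, then my real question for Solution 1 is whether the exclusive create and the rename remain safe when the N servers see the directory through a network filesystem.
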
Solution 2: N GitServlets + N DfsRepository + KeyValue DB
JGit provides an abstract class `DfsRepository`[3] for creating a DFS repository.
This solution is used by Palantir[4] and Google[5], where the data is stored in a
distributed database. I think this solution is suited to big companies and
requires a complex setup. I'm not confident that I can implement DfsRepository
correctly and maintain an extra DB.
My implementation will host thousands of repositories, but only a few of
them are actively used. Therefore, concurrent access should be very limited.
I'd like to have your comments on this subject.
Thanks,
Mincong