Re: [jgit-dev] Creating new refs on a repository with 100k packed-refs is slow

On Thu, Oct 8, 2020 at 12:15 AM <kaushikl@xxxxxxxxxxxxxx> wrote:
On 2020-10-07 14:39, Luca Milanesio wrote:
>> On 7 Oct 2020, at 22:33, kaushikl@xxxxxxxxxxxxxx wrote:
>> On 2020-10-06 16:23, Luca Milanesio wrote:
>> On 7 Oct 2020, at 00:17, kaushikl@xxxxxxxxxxxxxx wrote:
>> On 2020-10-06 12:50, Luca Milanesio wrote:
>> On 6 Oct 2020, at 20:48, kaushikl@xxxxxxxxxxxxxx wrote:
>> On 2020-10-06 12:32, Luca Milanesio wrote:
>> On 6 Oct 2020, at 20:16, kaushikl@xxxxxxxxxxxxxx wrote:
>> Hi,
>> We have noticed that creating a new ref takes ~600ms on a repository
>> with around 100k packed-refs.
>> My test repository:
>> $ git count-objects -v
>> count: 0
>> size: 0
>> in-pack: 100042
>> packs: 1
>> size-pack: 9734
>> prune-packable: 0
>> garbage: 0
>> size-garbage: 0
>> $ find refs/ -type f | wc -l
>> 0
>> Specs:
>> Machine has 32 cores and 250G RAM.
>> $ uname -r
>> 4.4.0-165-generic
>> $ lsb_release -a
>> No LSB modules are available.
>> Distributor ID:    Ubuntu
>> Description:    Ubuntu 16.04.6 LTS
>> Release:    16.04
>> Codename:    xenial
>> Jgit version is 5.9
>> $ java -version
>> openjdk version "1.8.0_242"
>> OpenJDK Runtime Environment (build 1.8.0_242-8u242-b08-0ubuntu3~16.04-b08)
>> OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)
>> Consider a small program[1] which creates a ref 'simple'. On
>> executing the program on my test repository, I see output:
>> simple: 677 ms
>> This seems slow. Is this expected behavior with jgit?
>> Can you share the generated .gitconfig?
>> (When you use JGit on a filesystem, it performs the computation of the
>> filesystem latency)
>> $ cat config
>> [core]
>> repositoryformatversion = 0
>> filemode = true
>> bare = false
>> logallrefupdates = true
>> trustFolderStat = false
>
> I had trustFolderStat set to false in my test repositories, which
> causes packed-refs file to be parsed over and over even if no
> modifications have been made to it, which I think explains why I have
> been seeing long ref creation times.
>
> I believe it isn’t needed if you are using a local filesystem,
> is it?
Agreed. It is not needed for local file systems.

Could you try https://git.eclipse.org/r/c/jgit/jgit/+/170138 again with trustFolderStat set to true?
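
To illustrate why the trustFolderStat setting matters here, below is a minimal, self-contained sketch of a stat-based cache for a packed-refs-like file. It is not JGit's implementation: the class `PackedRefsCacheSketch`, the `trustStat` flag, and the generated file contents are all hypothetical stand-ins. The only point it shows is that when the file's modification time and size are not trusted, every lookup re-reads the whole file, even though nothing changed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not JGit's RefDirectory code.
class PackedRefsCacheSketch {
    private final Path file;
    private final boolean trustStat; // assumed stand-in for core.trustFolderStat
    private FileTime cachedMtime;
    private long cachedSize = -1;
    private List<String> cachedLines;
    private int parseCount;

    PackedRefsCacheSketch(Path file, boolean trustStat) {
        this.file = file;
        this.trustStat = trustStat;
    }

    // Returns the file's lines, re-reading unless the cached copy can be trusted.
    List<String> getRefs() throws IOException {
        FileTime mtime = Files.getLastModifiedTime(file);
        long size = Files.size(file);
        boolean unchanged = cachedLines != null
                && mtime.equals(cachedMtime) && size == cachedSize;
        if (trustStat && unchanged) {
            return cachedLines; // stat data is trusted: reuse the parsed list
        }
        // Stat data is not trusted (or the file changed): read everything again.
        cachedLines = Files.readAllLines(file, StandardCharsets.UTF_8);
        cachedMtime = mtime;
        cachedSize = size;
        parseCount++;
        return cachedLines;
    }

    // Writes a 100k-line temp file, reads it `reads` times, returns the parse count.
    public static int countParses(boolean trustStat, int reads) {
        try {
            Path tmp = Files.createTempFile("packed-refs-sketch", ".txt");
            try {
                List<String> lines = new ArrayList<>();
                for (int i = 0; i < 100_000; i++) {
                    lines.add(String.format("%040x refs/heads/branch-%d", i, i));
                }
                Files.write(tmp, lines, StandardCharsets.UTF_8);
                PackedRefsCacheSketch cache = new PackedRefsCacheSketch(tmp, trustStat);
                for (int i = 0; i < reads; i++) {
                    cache.getRefs();
                }
                return cache.parseCount;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("trustStat=true  parses: " + countParses(true, 10));
        System.out.println("trustStat=false parses: " + countParses(false, 10));
    }
}
```

With ten lookups against an unmodified file, the trusted mode parses once and the untrusted mode parses ten times. That repeated parsing is the cost described above for a repository with 100k packed refs.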
 
> Also for an NFS mount, I typically do not set trustFolderStat to
> false, otherwise the packed-refs are not cached at all: for a very
> large repo with lots of refs, that is a killer.
>
> Luca.
>
>> [repack]
>> usedeltabaseoffset = true
>> [pack]
>> compression = 9
>> indexversion = 2
>> threads = 6
>> windowmemory = 2g
>> window = 250
>> depth = 50
>> [gc]
>> autopacklimit = 4
>> packrefs = true
>> reflogexpire = never
>> reflogexpireunreachable = never
>> auto = 0
>> [receive]
>> denyNonFastForwards = false
>> No, I mean the ~/.gitconfig of the user. If you are running Gerrit
>> v3.1 or later, then it is $GERRIT_SITE/etc/jgit.config.
>> Ah, OK, I see below in ~/.gitconfig.
>> [filesystem "Private Build|1.8.0_232|/dev/mapper/xxxxxx_local_mnt"]
>> timestampResolution = 1001 microseconds
>> minRacyThreshold = 5289 microseconds
>> [filesystem "Private Build|1.8.0_242|/dev/mapper/xxxxxxx-xxxxxxx_local_mnt"]
>> timestampResolution = 1001 microseconds
>> minRacyThreshold = 5486 microseconds
>> Mmm … interesting: the timestamp resolution is quite fine (1 ms), but
>> it still takes 600 ms to calculate?
>> Is this a local disk or a mounted NFS share?
>> SSD or spinning?
>> This is on a local spinning disk.
>> Is the disk fragmented?
>> I don't know how to check that. However, I copied the repository to
>> another machine with a similar configuration and saw similar
>> performance there (i.e., more than ~600 ms) for ref creation.
>> I believe you can short-circuit in JGit the filesystem timestamp
>> resolution, so that you can measure the pure filesystem-level
>> performance.
>> Luca.
>> Luca.
>> Luca.
>> [1]
>> package test;
>>
>> import java.io.File;
>> import java.io.IOException;
>> import java.util.concurrent.TimeUnit;
>>
>> import org.eclipse.jgit.api.Git;
>> import org.eclipse.jgit.api.errors.GitAPIException;
>> import org.eclipse.jgit.lib.Repository;
>> import org.eclipse.jgit.revwalk.RevCommit;
>> import org.eclipse.jgit.revwalk.RevWalk;
>>
>> public class Test {
>>     public static void main(String[] args) {
>>         if (args.length != 1) {
>>             System.out.println("Repo path must be specified.");
>>             System.exit(1);
>>         }
>>         String path = args[0];
>>         String branch = "simple";
>>         // Git and RevWalk are both closed automatically on exit.
>>         try (Git git = Git.open(new File(path));
>>                 RevWalk walk = new RevWalk(git.getRepository())) {
>>             Repository repo = git.getRepository();
>>             RevCommit commit = walk.parseCommit(
>>                     repo.exactRef("refs/heads/master").getObjectId());
>>             // Time only the branch (ref) creation itself.
>>             long startTimeInNanoSecs = System.nanoTime();
>>             git.branchCreate().setName(branch).setStartPoint(commit).call();
>>             long estimatedTimeInNanoSecs = System.nanoTime() - startTimeInNanoSecs;
>>             System.out.println(branch + ": "
>>                     + TimeUnit.NANOSECONDS.toMillis(estimatedTimeInNanoSecs) + " ms");
>>         } catch (IllegalStateException | GitAPIException | IOException e) {
>>             e.printStackTrace();
>>         }
>>     }
>> }
>> _______________________________________________
>> jgit-dev mailing list
>> jgit-dev@xxxxxxxxxxx
>> To unsubscribe from this list, visit
>> https://www.eclipse.org/mailman/listinfo/jgit-dev