Re: [jgit-dev] delta generation during packing

On Saturday, 10 July 2010 at 04:21:58, Shawn O. Pearce wrote:
> I just pushed my 'delta' series, which creates deltas on the fly
> while packing.  This brings us the functionality needed to perform
> `git repack`, or at least the first half of `git gc`.
> 
> Because this implementation was rebuilt from scratch based on my own
> memory of how the packing algorithm has evolved over the years in
> C Git, PackWriter, DeltaWindow, and DeltaEncoder don't use exactly
> the same rules everywhere, and that leads JGit to produce different
> (but logically equivalent) pack files:
> 
>   Repository | Pack Size (bytes)                | Packing Time
>              | JGit     - CGit     = Difference | JGit / CGit
>   -----------+----------------------------------+-----------------
>    git       | 25094348 - 24322890 = +771458    | 59.434s / 59.133s
>    jgit      |  5669515 -  5709046 = - 39531    |  6.654s /  6.806s
>    linux-2.6 |     389M -     386M = +3M        | 20m02s  / 18m01s
> 
> For the above tests pack.threads was set to 1, window size=10,
> delta depth=50, and delta and object reuse were disabled for both
> implementations.  Both implementations were reading from an already
> fully packed repository on local disk.  The running time reported
> is after 1 warm-up run of the tested implementation.
> 
> PackWriter is writing 771 KB more data on git.git, 3M more on
> linux-2.6, but is actually 39.5 KB smaller on jgit.git.  Being
> larger by less than 0.7% on linux-2.6 isn't bad, nor is taking an
> extra 2 minutes to pack.  On the running time side, JGit is at a
> major disadvantage because linux-2.6 doesn't fit into the default
> WindowCache of 20M, while C Git is able to mmap the entire pack and
> have it available instantly in physical memory (assuming hot cache).
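
To put the quoted figures side by side, here is a minimal sketch that
turns the table into relative differences. The class and method names
are made up for illustration; the inputs are copied from the table, and
the linux-2.6 sizes are only known to the nearest megabyte, so that
percentage is approximate.

```java
import java.util.Locale;

public class PackSizeComparison {
    /** Size of the JGit pack relative to the C Git pack, in percent. */
    static double percentDifference(long jgitSize, long cgitSize) {
        return 100.0 * (jgitSize - cgitSize) / cgitSize;
    }

    /** Ratio of JGit packing time to C Git packing time. */
    static double timeRatio(double jgitSeconds, double cgitSeconds) {
        return jgitSeconds / cgitSeconds;
    }

    public static void main(String[] args) {
        // Pack sizes in bytes, except linux-2.6, which the table gives in M.
        report("git", 25094348L, 24322890L, 59.434, 59.133);
        report("jgit", 5669515L, 5709046L, 6.654, 6.806);
        // 20m02s = 1202 s, 18m01s = 1081 s
        report("linux-2.6", 389L, 386L, 1202.0, 1081.0);
    }

    static void report(String repo, long jgit, long cgit,
                       double jgitSecs, double cgitSecs) {
        System.out.println(String.format(Locale.ROOT,
                "%-10s size %+.2f%%  time x%.3f",
                repo, percentDifference(jgit, cgit),
                timeRatio(jgitSecs, cgitSecs)));
    }
}
```

With the numbers as quoted, git.git comes out about 3.2% larger,
jgit.git about 0.7% smaller, and linux-2.6 a little under 0.8% larger
with roughly 11% more packing time.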

I am impressed. I know Java is not slow in general, but I expected it
to be slow here. Not so.

How about memory usage? In my experience that is the area where Java
suffers most compared to C. Raw performance can usually be achieved,
but at the cost of memory usage, which may affect how well this works
when we pack from within Eclipse.
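
One rough way to get a number for this would be to read the JVM's own
peak heap counters around a packing run. The sketch below uses only
the standard java.lang.management API; the class name HeapPeakProbe is
made up for illustration, and the workload in main() is a stand-in
allocation, not an actual PackWriter call. Summing per-pool peaks gives
an upper bound, since the pools need not peak at the same moment, and
it says nothing about memory held outside the Java heap.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class HeapPeakProbe {
    /** Resets the peak-usage counter of every memory pool. */
    static void resetPeaks() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            pool.resetPeakUsage();
        }
    }

    /** Sums the peak usage, in bytes, of all heap pools since the last reset. */
    static long peakHeapBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage peak = pool.getPeakUsage();
            if (peak != null) { // null if the pool is no longer valid
                total += peak.getUsed();
            }
        }
        return total;
    }

    /** Runs the task and returns the summed heap peak observed afterwards. */
    static long measure(Runnable task) {
        resetPeaks();
        task.run();
        return peakHeapBytes();
    }

    public static void main(String[] args) {
        // Stand-in workload: replace with the packing run to be measured.
        long peak = measure(() -> {
            byte[][] blocks = new byte[64][];
            for (int i = 0; i < blocks.length; i++) {
                blocks[i] = new byte[1 << 20];
            }
        });
        System.out.println("peak heap: " + (peak / (1024 * 1024)) + " MiB");
    }
}
```

Running the same measurement with different -Xmx settings would also
show how much of that peak is really needed and how much is just the
garbage collector being lazy.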

-- robin

