Re: [jgit-dev] core.streamFileThreshold and large objects

On Fri, Jun 1, 2012 at 2:42 AM, Markus Duft <markus.duft@xxxxxxxxxx> wrote:
> On 06/01/2012 11:12 AM, Marc Strapetz wrote:
>> When reading a LargePackedDeltaObject it usually (or always?) results in
>> a hang, hence I'd prefer to not use this code at all. To achieve that I
>> could increase the streamFileThreshold to Integer.MAX_VALUE. On the other
>> hand, it makes sense to stream a LargePackedWholeObject. So what about
>> introducing a system property in PackFile which avoids instantiation of
>> a LargePackedDeltaObject, basically:
>
> I debugged the code a while ago [1] and realized that it does not hang at all. It is just so slow that it will never finish ;) I tried this with a 200MB file, and it managed to process ~100MB in 12 hours, IIRC...
>
> Maybe this should just be fixed instead of changing limits.

I think we just have to delete this code and never attempt to process
a delta using the streaming version. Whole packed objects can be
streamed, but deltas shouldn't be. It doesn't work. The implementation is
correct, it's just too slow to ever complete.

One way to improve it might be to collapse deltas together until you
reach the base object, and then apply that single delta onto the base.
From what I understand, this is how Hg inflates deltas. It still
doesn't avoid the random seeks within a large compressed base object
that are needed to copy fragments out of order. This was attempted on
the C Git code base a while back, and it was slower than the existing
C Git algorithm.
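
To make the "collapse deltas together" idea concrete, here is a rough
sketch in Java. It is not JGit or C Git code and it does not use Git's
binary delta encoding. It assumes a simplified model where a delta is
just a list of "copy a range from the base" and "insert these literal
bytes" instructions. The names DeltaCollapseSketch, Op, apply and
collapse are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeltaCollapseSketch {
    /** One instruction of the simplified (hypothetical) delta model. */
    static final class Op {
        final boolean copy;  // true: copy from base, false: insert literal bytes
        final int offset;    // copy only: start offset within the base
        final int length;    // copy only: number of bytes to copy
        final byte[] data;   // insert only: the literal bytes

        private Op(boolean copy, int offset, int length, byte[] data) {
            this.copy = copy;
            this.offset = offset;
            this.length = length;
            this.data = data;
        }

        static Op copy(int offset, int length) {
            return new Op(true, offset, length, null);
        }

        static Op insert(byte[] data) {
            return new Op(false, 0, 0, data);
        }

        /** Number of bytes this instruction contributes to the output. */
        int size() {
            return copy ? length : data.length;
        }
    }

    /** Applies a delta to a fully materialized base. */
    static byte[] apply(byte[] base, List<Op> delta) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Op op : delta) {
            if (op.copy) {
                out.write(base, op.offset, op.length);
            } else {
                out.write(op.data, 0, op.data.length);
            }
        }
        return out.toByteArray();
    }

    /**
     * Collapses two chained deltas into one, so that
     * apply(base, collapse(first, second)) equals
     * apply(apply(base, first), second).
     */
    static List<Op> collapse(List<Op> first, List<Op> second) {
        List<Op> result = new ArrayList<>();
        for (Op op : second) {
            if (!op.copy) {
                result.add(op); // literals pass through unchanged
                continue;
            }
            // This copy addresses the output of `first`; map the range
            // back onto the instructions of `first` that produced it.
            int start = op.offset;
            int end = op.offset + op.length;
            int pos = 0; // output position where the current `first` op begins
            for (Op f : first) {
                int fEnd = pos + f.size();
                int lo = Math.max(start, pos);
                int hi = Math.min(end, fEnd);
                if (lo < hi) {
                    int skip = lo - pos;
                    int len = hi - lo;
                    if (f.copy) {
                        result.add(Op.copy(f.offset + skip, len));
                    } else {
                        result.add(Op.insert(
                                Arrays.copyOfRange(f.data, skip, skip + len)));
                    }
                }
                pos = fEnd;
                if (pos >= end) {
                    break;
                }
            }
        }
        return result;
    }
}
```

Note that the collapsed delta still contains copy instructions at
arbitrary, out-of-order offsets into the base. The intermediate objects
no longer have to be materialized, but the random seeks into the large
compressed base remain.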

I think it's time to declare this grand experiment over, failed, and
delete that code.