Re: [jgit-dev] jgit and huge files

On Tue, Mar 20, 2012 at 04:19, Marc Strapetz <marc.strapetz@xxxxxxxxxxx> wrote:
> On 19.03.2012 21:24, Shawn Pearce wrote:
>> On Mon, Mar 19, 2012 at 12:38, Marc Strapetz <marc.strapetz@xxxxxxxxxxx> wrote:
>>> Here is the stack trace (we have some debug code in PackFile which is
>>> definitely not related, hence I transformed only the relevant line
>>> number). Should "int sz" be a "long sz" instead?
>
>> java.lang.NegativeArraySizeException
>>        at org.eclipse.jgit.storage.file.PackFile.decompress(PackFile.java:296)
>>        at org.eclipse.jgit.storage.file.PackFile.load(PackFile.java:697)

Ick. Line 697 is the whole (non-delta) code path. A 2.2G object should
never have tested as less than the streamFileThreshold:

					if (sz < curs.getStreamFileThreshold())
						data = decompress(pos + p, (int) sz, curs);

sz here is a long and came from a loop higher up.
curs.getStreamFileThreshold() is an int and its maximum value should
be ~2047m, or 2^31-1. I expected the curs.getStreamFileThreshold()
result to be upcast to a long, and for both longs to be positive when
this conditional is tested. If this is true then it should be safe to
truncate sz down from long to int because it is smaller than an int's
maximum positive value.
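
To make that reasoning concrete, here is a small standalone sketch. It
is not JGit code; the class and method names are made up for
illustration. It only shows the language rule I am relying on: the int
threshold is widened to long for the compare, and the narrowing cast is
lossless as long as the long is non-negative and below the threshold.

```java
// Standalone illustration only, not JGit code. All names are hypothetical.
class NarrowingSketch {
    // Same shape as the conditional in PackFile.load(): a long compared
    // against an int. Java widens the int operand to long first.
    static boolean belowThreshold(long sz, int threshold) {
        return sz < threshold;
    }

    public static void main(String[] args) {
        int threshold = Integer.MAX_VALUE;  // 2^31-1, the largest int
        long small = 100L * 1024 * 1024;    // 100 MiB, fits in an int
        long huge = 2200L * 1024 * 1024;    // ~2.2G, does not fit in an int

        // Below the threshold and non-negative: the cast loses nothing.
        System.out.println(belowThreshold(small, threshold)); // true
        System.out.println((int) small == small);             // true

        // Above the threshold: the compare rejects it, so the cast
        // (which would wrap around to a negative int) is never reached.
        System.out.println(belowThreshold(huge, threshold));  // false
        System.out.println((int) huge < 0);                   // true
    }
}
```

So with two positive longs the check does what I expected, and a 2.2G
value cannot get through it on its own.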

Unfortunately something isn't holding true here. If you can add debug
code to examine "sz" and "curs.getStreamFileThreshold()" at this line,
it would be interesting to see what they actually are before the
compare starts. This would go very badly if sz was already negative,
for example. A negative sz value indicates the pack data may be corrupt
here for this object because JGit wasn't able to decode the sz from
the packed object header in the loop above on line 685.
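
As a rough sketch of the failure mode I have in mind (again standalone
and hypothetical, not the real PackFile code): a negative sz passes the
"less than threshold" test, survives the cast, and blows up at the
allocation. I am assuming here that decompress() allocates a byte[] of
the requested size; the 50 MiB threshold is just an example value. The
describe() helper is the kind of debug output I am asking for.

```java
// Standalone illustration only, not JGit code. All names are hypothetical.
class NegativeSizeSketch {
    // Stand-in for decompress(), assuming it allocates the output buffer.
    static byte[] allocate(int sz) {
        return new byte[sz]; // NegativeArraySizeException when sz < 0
    }

    // Same shape as the conditional at line 697. Returns null where the
    // real code would take the streaming path instead.
    static byte[] loadWhole(long sz, int threshold) {
        if (sz < threshold)
            return allocate((int) sz);
        return null;
    }

    // Debug line to print just before the compare.
    static String describe(long sz, int threshold) {
        return "sz=" + sz + " threshold=" + threshold
                + " below=" + (sz < threshold);
    }

    public static void main(String[] args) {
        int threshold = 50 * 1024 * 1024; // example value only
        long badSz = -1L;                 // e.g. a mis-decoded header

        System.out.println(describe(badSz, threshold));
        try {
            loadWhole(badSz, threshold);
        } catch (NegativeArraySizeException e) {
            System.out.println("negative sz slipped past the check");
        }
    }
}
```

If the debug output shows a negative sz, the compare itself is fine and
the problem is upstream of it.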

