[mat-dev] A suggestion: would you benefit from the 'jzran' library for random-access gzip archives?


A while ago I wrote a library for random access to gzip archives -
jzran http://code.google.com/p/jzran . I originally wrote it for the
logophagus project http://code.google.com/p/logophagus , but hoped to
find other uses for it.

The library is BSD-licensed, so basically free for any kind of usage.

I wonder if the Eclipse Memory Analyzer would benefit from it? I think
it could be cool to open/analyze gzipped .hprof files without
decompressing them first. This is quite a frequent situation, e.g.
among my ex-colleagues at Yandex: you gzip a profile on a remote
server, copy it to your machine, decompress it and study it with yjp.
Perhaps in some cases it could even be faster than opening
uncompressed ones. Or maybe you could store some of your indices in
compressed form and use jzran to read them.

The answer of course depends very much on the access pattern - this
library is basically designed for relatively long reads from arbitrary
positions in the file; in a true "random read" scenario (many short
reads at unrelated positions) it would be slower.
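
To make the trade-off concrete, here is a minimal sketch of why read
length matters. It is NOT jzran's API or file format (jzran works on
gzip archives and its internals may differ). It assumes a simplified
scheme where the data is deflated in independent 64 KiB chunks, so a
read has to inflate every chunk it touches. The class name, the chunk
size and the chunksInflated counter are all made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical illustration only; not jzran's API or on-disk format. */
class ChunkedDeflateSketch {
    static final int CHUNK = 64 * 1024; // uncompressed bytes per chunk (assumed value)

    final List<byte[]> chunks = new ArrayList<>(); // each chunk is deflated independently
    final int length;                              // total uncompressed length
    int chunksInflated = 0;                        // counts the decompression work done

    ChunkedDeflateSketch(byte[] data) {
        length = data.length;
        for (int off = 0; off < data.length; off += CHUNK) {
            chunks.add(deflate(data, off, Math.min(CHUNK, data.length - off)));
        }
    }

    static byte[] deflate(byte[] src, int off, int len) {
        Deflater d = new Deflater();
        d.setInput(src, off, len);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!d.finished()) {
            out.write(buf, 0, d.deflate(buf));
        }
        d.end();
        return out.toByteArray();
    }

    static byte[] inflate(byte[] src, int expectedLen) {
        Inflater inf = new Inflater();
        try {
            inf.setInput(src);
            byte[] out = new byte[expectedLen];
            int n = 0;
            while (n < expectedLen) {
                int r = inf.inflate(out, n, expectedLen - n);
                if (r == 0) {
                    throw new IllegalStateException("chunk is truncated or corrupt");
                }
                n += r;
            }
            return out;
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inf.end();
        }
    }

    /** Reads len uncompressed bytes starting at pos, inflating every chunk touched. */
    byte[] read(long pos, int len) {
        if (pos < 0 || len < 0 || pos + len > length) {
            throw new IndexOutOfBoundsException("pos=" + pos + ", len=" + len);
        }
        byte[] result = new byte[len];
        int done = 0;
        while (done < len) {
            long p = pos + done;
            int idx = (int) (p / CHUNK);
            int inChunk = (int) (p % CHUNK);
            int chunkLen = Math.min(CHUNK, length - idx * CHUNK);
            byte[] plain = inflate(chunks.get(idx), chunkLen);
            chunksInflated++;
            int n = Math.min(len - done, chunkLen - inChunk);
            System.arraycopy(plain, inChunk, result, done, n);
            done += n;
        }
        return result;
    }
}
```

In this sketch a 100-byte read that falls inside one chunk still
inflates the whole 64 KiB chunk, while a 200,000-byte read starting at
offset 0 inflates four chunks. The fixed cost per seek is amortized
much better by long reads, which is the access pattern described
above.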

I looked at HprofRandomAccessParser and
BufferedRandomAccessInputStream - it looks like, in combination with
the latter, jzran could do the trick, assuming that many "seemingly
random" reads are actually sequential (e.g. many objects in an object
array are actually allocated one after another). This is the first
time I'm looking at the codebase, so I may be wrong.
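
A rough sketch of the buffering argument, under stated assumptions:
this is NOT MAT's BufferedRandomAccessInputStream, just a hypothetical
one-block cache in front of some slow random-access source (which
could be a compressed file). The names BlockCacheSketch, Source and
fetches are invented for the example.

```java
import java.util.Arrays;

/** Hypothetical one-block cache; not MAT's BufferedRandomAccessInputStream. */
class BlockCacheSketch {
    /** Stand-in for an expensive random-access source. */
    interface Source {
        byte[] fetch(long pos, int len);
        long length();
    }

    final Source source;
    final int blockSize;
    long cachedBlock = -1; // index of the block currently held, -1 if none
    byte[] cached;
    int fetches = 0;       // number of expensive reads issued to the source

    BlockCacheSketch(Source source, int blockSize) {
        this.source = source;
        this.blockSize = blockSize;
    }

    /** In-memory source, used here only to keep the example self-contained. */
    static Source overArray(byte[] data) {
        return new Source() {
            public byte[] fetch(long pos, int len) {
                return Arrays.copyOfRange(data, (int) pos, (int) pos + len);
            }
            public long length() {
                return data.length;
            }
        };
    }

    /** Returns the unsigned byte at pos, fetching a whole block on a cache miss. */
    int readByte(long pos) {
        if (pos < 0 || pos >= source.length()) {
            throw new IndexOutOfBoundsException("pos=" + pos);
        }
        long block = pos / blockSize;
        if (block != cachedBlock) {
            long start = block * blockSize;
            int len = (int) Math.min(blockSize, source.length() - start);
            cached = source.fetch(start, len);
            cachedBlock = block;
            fetches++;
        }
        return cached[(int) (pos % blockSize)] & 0xFF;
    }
}
```

With a 4096-byte block, reading 1000 neighbouring bytes costs a single
fetch from the source, whereas alternating between two distant
positions costs one fetch per read. Whether MAT's real access pattern
looks more like the former is exactly the open question.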

Yeah, I know, I should just write a patch myself :) but while I feel
guilty for not doing so, I thought that writing to the mailing list
would still be better than doing nothing at all.
Please tell me what you think.

Eugene Kirpichov
Principal Engineer,
Mirantis Inc. http://www.mirantis.com/