[mat-dev] A suggestion: would you benefit from the 'jzran' library for random-access gzip archives?
- From: Eugene Kirpichov <ekirpichov@xxxxxxxxx>
- Date: Tue, 5 Apr 2011 16:20:11 +0400
A while ago I wrote jzran (http://code.google.com/p/jzran), a library
for random access to gzip archives. I originally wrote it for the
logophagus project (http://code.google.com/p/logophagus), but hoped to
find other uses for it.
The library is BSD-licensed, so it is basically free for any kind of use.
I wonder if the Eclipse Memory Analyzer would benefit from it? I think
it could be cool to open and analyze gzipped .hprof files without
decompressing them. This situation is quite frequent, e.g. among my
ex-colleagues at Yandex: you gzip a profile on a remote server, copy
it to your machine, decompress it and study it with yjp. Perhaps in
some cases it could even be faster than opening uncompressed files. Or
maybe you could store some of your indices in compressed form and use
jzran to read them.
The answer of course depends very much on the access pattern: this
library is basically designed for relatively long reads from arbitrary
positions in the file; in a truly "random read" scenario, with many
short reads scattered across the file, it would be slower.
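
To make the access-pattern point concrete, here is a minimal Java
sketch of the baseline cost of reading from an arbitrary position in a
plain gzip stream: everything before the requested offset has to be
decompressed and thrown away. It uses only java.util.zip and is not
jzran's API; the class and method names (NaiveGzipSeek, readAt, gzip)
are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class NaiveGzipSeek {

    /**
     * Reads up to len bytes starting at uncompressed offset pos.
     * Without an index, the stream must be inflated from the very
     * beginning, so the cost grows with pos, not with len.
     */
    static byte[] readAt(byte[] gz, long pos, int len) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            byte[] scratch = new byte[8192];

            // "Seek" by decompressing and discarding pos bytes.
            long remaining = pos;
            while (remaining > 0) {
                int n = in.read(scratch, 0, (int) Math.min(scratch.length, remaining));
                if (n < 0) {
                    return new byte[0]; // pos is past the end of the data
                }
                remaining -= n;
            }

            // Now read the bytes that were actually requested.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int left = len;
            while (left > 0) {
                int n = in.read(scratch, 0, Math.min(scratch.length, left));
                if (n < 0) {
                    break; // end of data reached before len bytes
                }
                out.write(scratch, 0, n);
                left -= n;
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Helper for the demo: gzip a byte array in memory. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```

With this baseline, a long read amortizes the cost of getting to the
start position over many returned bytes, while a short read pays the
same cost for very little data.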
I looked at HprofRandomAccessParser and
BufferedRandomAccessInputStream. It looks like jzran could do the
trick in combination with the latter, assuming that many "seemingly
random" reads are actually sequential (e.g. many objects in an object
array are actually allocated one after another). This is the first
time I'm looking at the codebase, so I may be wrong.
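
The assumption about sequential reads can be illustrated with a small
Java sketch of a block cache placed in front of an expensive
positional reader. This is not the code of
BufferedRandomAccessInputStream or of jzran; RangeReader, CachedReader
and the block size are hypothetical names and values chosen for the
example. The point is only that byte-by-byte reads at neighbouring
positions turn into a few long reads of the underlying source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BlockCacheSketch {

    /** Hypothetical expensive source: returns up to len bytes at pos. */
    interface RangeReader {
        byte[] read(long pos, int len);
    }

    /** Serves single-byte reads from fixed-size cached blocks (LRU). */
    static final class CachedReader {
        private final RangeReader source;
        private final int blockSize;
        private final Map<Long, byte[]> cache;
        int sourceReads = 0; // how often the expensive source was hit

        CachedReader(RangeReader source, int blockSize, final int maxBlocks) {
            this.source = source;
            this.blockSize = blockSize;
            // Access-ordered LinkedHashMap evicts the least recently used block.
            this.cache = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                    return size() > maxBlocks;
                }
            };
        }

        /** Returns the byte at pos as 0..255, or -1 past the end. */
        int readByte(long pos) {
            long blockNo = pos / blockSize;
            byte[] block = cache.get(blockNo);
            if (block == null) {
                // One long read of the source fills a whole block.
                block = source.read(blockNo * blockSize, blockSize);
                sourceReads++;
                cache.put(blockNo, block);
            }
            int off = (int) (pos % blockSize);
            return off < block.length ? (block[off] & 0xff) : -1;
        }
    }
}
```

If the reads really are mostly sequential, the number of calls to the
source is roughly the data size divided by the block size, which is
the kind of pattern a library built for long reads handles well.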
Yeah, I know, I should just write a patch myself :) but while I'm
guilty of not doing that, I thought that writing to the mailing
list would still be better than doing nothing at all.
Please tell me what you think.
Mirantis Inc. http://www.mirantis.com/