
Re: [smila-dev] Problems with BinStorage

Hi,

A redesign of BinStorage was discussed some time ago, and this problem came up there too. The discussion started here: http://dev.eclipse.org/mhonarc/lists/smila-dev/msg00084.html
So I think BinStorage should be in the process of being redesigned now.
Thanks,
Dmitry


Daniel.Stucky@xxxxxxxxxxx wrote:
Hi all,

we ran some tests with a larger amount of data than in the usual
development cases to create some index dump files. The system performed
OK for about 2 hours, during which 20 index dump files (each about
10 MB) were created. Creating the 21st file then took about 30 minutes,
and the 22nd took 4 hours.

I assume that one cause of the decreasing performance is the
BinStorage. For every record attachment, a folder containing one file is
created in
workspace\.metadata\.plugins\org.eclipse.eilf.binstorage\storage\default.
After 7 hours it contained 109295 files (754 MB) and 109298 folders.
NTFS (like most Linux filesystems) is not optimized for such a huge
number of folders (or files) in ONE directory.

Remember that the goal is to index millions of documents! So we have to
change the behavior of BinStorage; it is a NO GO to store all documents
in one folder. I guess that the whole logic of BinStorage was programmed
by ourselves. Why did we do that? Aren't there any implementations
already available in the open source community? We should take a look
at how, for example, distributed filesystems like Hadoop, or Lucene,
store their data. Or at least create a tree-like structure beneath
org.eclipse.eilf.binstorage\storage\default.
Of course this is all up for discussion.
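
To make the tree-like structure idea concrete, here is a minimal sketch
of one common approach: hash the record id and use the first hex digits
of the hash as nested folder names. This is only an illustration, not
the existing BinStorage code. The class name ShardedStoragePath, the
method resolve, the choice of SHA-1 and the two-level layout are all
assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/**
 * Hypothetical helper (not part of BinStorage): spreads attachments over
 * a two-level folder tree instead of one flat directory.
 */
public final class ShardedStoragePath {

    private ShardedStoragePath() {
    }

    /** Maps a record id to root/xx/yy/hash, where xx and yy are hash prefixes. */
    public static Path resolve(Path root, String recordId) {
        String hex = sha1Hex(recordId);
        return root.resolve(hex.substring(0, 2))   // first level: 256 folders
                   .resolve(hex.substring(2, 4))   // second level: 256 per folder
                   .resolve(hex);                  // leaf named by the full hash
    }

    /** Returns the SHA-1 digest of the value as 40 lowercase hex characters. */
    static String sha1Hex(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get("storage", "default");
        System.out.println(resolve(root, "record-1"));
    }
}
```

With two levels of 256 folders each there are 65536 leaf directories, so
one million attachments would average about 15 entries per directory.
The same id always maps to the same path, so no lookup table is needed.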

BTW: there is currently no documentation for BinStorage in the Eclipse
wiki. The responsible developers should add it.

Bye,
Daniel

