Re: [smila-dev] OutOfMemoryException during Crawl

Hi Daniel,

1) why does BinStore use so much memory and not free anything?

[Marius]: The current binstorage implementation does not release the file system manager (Commons VFS), which is used to manage file persistence. Closing the manager closes all files it created (although, as far as I have seen, each file is currently closed right after it is created anyway) and cleans up any temporary files. This release should only be called once the Commons VFS manager is no longer needed. I'm not very optimistic that applying this will solve the OOM, but I'll do some local tests and come back with an answer. One more thing related to the OOM issue (binstorage and Commons VFS) can be configured: the cache strategy. It is currently set to refresh data every time the application requests a file, which is fine. The other two options are to refresh only on a manual call (this would be better in the OOM case, but response time would increase) and to refresh every time an instance is referred to, which is not an appropriate solution.

2) why are those threads created continually and what are they for ?

3) what causes this slow but linearly increasing consumption of memory ?

[Marius]: The XML storage (which uses Oracle Berkeley DB XML) accounts for significant memory consumption; during my tests I often ended up with an OOM. The org.eclipse.smila.xmlstorage bundle takes care of releasing resources during XML data processing. The catch (I would call this a disadvantage of BDB XML) is that users/developers have to estimate the volume of data that is going to be processed (parsed/stored/fetched) in the BDB XML container(s) up front, before BDB XML is started. In many cases a re-configuration of the BDB XML environment has no effect until it is restarted. When dealing with huge amounts of data (and if there are also many concurrent users), situations like OOM or "unable to allocate memory for mutex" (the error just reported by Ralf) can occur. In conclusion, the xmlstorage uses memory depending on the processed data, but resource-releasing techniques are applied, so the slow, linear increase in memory consumption shouldn't be caused by xmlstorage.

Best Regards,
Marius

----- Original Message ----- From: <Daniel.Stucky@xxxxxxxxxxx>
To: <smila-dev@xxxxxxxxxxx>
Sent: Thursday, October 09, 2008 6:31 PM
Subject: [smila-dev] OutOfMemoryException during Crawl


Hi all,

during testing we encountered an OutOfMemoryException. My first thought
was that the cause is DeltaIndexingManager, as it holds its state in
memory. But it was not the case, as the problem remained after disabling
it. So we did some more tests (disabling different components) to track
down the root of the problem. Here are some interesting observations we
made:

BinStore: the BinStore allocates as much memory as possible, without
freeing any. With the BinStore active the system reaches the configured
Xmx quickly. The OOM takes a lot longer to occur, though. I guess
it's related to the caching mechanisms of Commons VFS. I don't
understand why no memory is freed when usage gets close to Xmx. The
behavior is the same whether Xmx is 512m or 64m. If BinStore is
disabled, the system does not reach Xmx, but uses an average of 15 MB!
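
To tell heap that the JVM has merely reserved from heap that is actually occupied, the observation above can be cross-checked from inside the process. This is only a minimal sketch using the standard java.lang.management API; the class name HeapSnapshot and its fields are hypothetical and not part of SMILA. Comparing a reading before and after a requested GC gives a rough idea of how much of the "used" figure is collectable garbage.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper (not part of SMILA): one reading of the JVM heap. */
final class HeapSnapshot {
    final long usedBytes;      // heap occupied by objects, including garbage not yet collected
    final long committedBytes; // heap the JVM currently has reserved from the OS
    final long maxBytes;       // upper bound (normally -Xmx), or -1 if undefined

    private HeapSnapshot(long usedBytes, long committedBytes, long maxBytes) {
        this.usedBytes = usedBytes;
        this.committedBytes = committedBytes;
        this.maxBytes = maxBytes;
    }

    /** Reads the heap figures, optionally asking the JVM to collect garbage first. */
    static HeapSnapshot take(boolean requestGcFirst) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        if (requestGcFirst) {
            memory.gc(); // same effect as System.gc(): a request, not a guarantee
        }
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return new HeapSnapshot(heap.getUsed(), heap.getCommitted(), heap.getMax());
    }

    @Override
    public String toString() {
        long mb = 1024L * 1024L;
        return "used=" + (usedBytes / mb) + " MB, committed=" + (committedBytes / mb)
                + " MB, max=" + (maxBytes < 0 ? "undefined" : (maxBytes / mb) + " MB");
    }

    public static void main(String[] args) {
        System.out.println("before GC: " + take(false));
        System.out.println("after GC:  " + take(true));
    }
}
```

If "used" drops sharply after the requested GC, the heap was mostly filled with collectable objects (for example cache entries) rather than leaked ones; if it stays close to the maximum, something is still holding references.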

However, the BinStore does not seem to be the real problem, but it makes
the OOM more likely to occur, as it uses so much memory.
We did a test (indexing of over 200,000 documents) with only
Connectivity, XMLStore and BinStore (no routing, no BPEL, no processing
was executed, though the services were started). Memory usage quickly
came close to Xmx, but all documents were processed (that is,
converted to records, sent and stored in XML- and BinStore) in 2 hours.
Several hours later the OOM exception occurred, without the system doing
anything.

Then I noticed that just starting EILF.exe and waiting leads to the
following:
- every minute 3 threads are created and deleted. The number of active
threads is about 40 on average, but the total number of created threads
increases constantly. The names of those threads were "Persistence
Adaptor Task" and "RMI TCP Connection(8)-172.24.187.35". I guess it's
related to ActiveMQ.
- the memory usage also increases. It increases very slowly, but the
graph grows linearly. I think that this behavior eventually causes the
OOM Exception.
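
The two thread figures mentioned above (threads alive right now versus threads created in total) can also be read programmatically. The following is a minimal sketch using the standard ThreadMXBean; the class ThreadChurnProbe and the "probe-worker" thread names are hypothetical and only simulate short-lived threads, they say nothing about what ActiveMQ or RMI actually do.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical probe (not part of SMILA): live vs. total started thread counts. */
final class ThreadChurnProbe {

    /** Returns { live thread count, total threads started since JVM start }. */
    static long[] sample() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return new long[] { threads.getThreadCount(), threads.getTotalStartedThreadCount() };
    }

    /** Starts 'count' threads that finish immediately, waiting for each to end. */
    static void runShortLivedThreads(int count) {
        for (int i = 0; i < count; i++) {
            Thread worker = new Thread(() -> { }, "probe-worker-" + i);
            worker.start();
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                return;
            }
        }
    }

    public static void main(String[] args) {
        long[] before = sample();
        runShortLivedThreads(3);
        long[] after = sample();
        System.out.println("live threads:  " + before[0] + " -> " + after[0]);
        System.out.println("total started: " + before[1] + " -> " + after[1]);
    }
}
```

A steadily rising "total started" with a stable live count only shows that threads are being created and finished; whether that churn is connected to the memory growth would still have to be checked separately, for example with a heap dump.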

So there are 3 questions I cannot answer:
1) why does BinStore use so much memory and not free anything?
2) why are those threads created continually and what are they for ?
3) what causes this slow but linearly increasing consumption of memory ?

Any ideas, suggestions, or solutions ?

Bye,
Daniel
_______________________________________________
smila-dev mailing list
smila-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/smila-dev




