Hi everyone,
I have been working with Jayes for about 3 years and I love it. I work with thousands of nodes, and some nodes have a large number of parents.
I have increased my heap size as far as my machine allows and have now reached its limit. I also created a heap dump, and the analyzer reported the "problem suspect" below, which may indicate a memory leak (I am not sure):
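For reference, this is roughly how I launch the application and capture the dump; the exact heap size and main class here are placeholders, not my real values:

```shell
# Raise the maximum heap (8g is an example value) and have the JVM
# write a heap dump automatically when it runs out of memory.
java -Xmx8g \
     -XX:+HeapDumpOnOutOfMemoryError \
     -XX:HeapDumpPath=/tmp/jayes.hprof \
     -cp myapp.jar com.example.MyInferenceApp
```

I then open the resulting .hprof file in Eclipse MAT, which is where the "problem suspect" message below comes from.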
The thread java.lang.Thread @ 0x6000762b8 main keeps local variables with total size 5,578,598,448 (99.42%) bytes.
The memory is accumulated in one instance of
"org.eclipse.recommenders.jayes.factor.AbstractFactor[]" loaded by
"sun.misc.Launcher$AppClassLoader @ 0x600013810".
The stacktrace of this Thread is available. See stacktrace.
Keywords:
org.eclipse.recommenders.jayes.factor.AbstractFactor[]
sun.misc.Launcher$AppClassLoader @ 0x600013810
From the total size you can probably see how large the Bayesian nets I am dealing with are, and this is one of the smallest ones :)
As a solution, I wondered whether Apache Spark could solve this problem. I have read that Spark integrates with many machine learning tools. So I was wondering whether such an integration already exists for Jayes, or whether it is possible to run Bayesian networks on a distributed system at all. If it can be distributed, how would the network be partitioned across the worker nodes?
Best Regards,
Ekin