[recommenders-dev] [Jayes] Jayes and Apache Spark Integration

Hi everyone,

I have been working with Jayes for about 3 years and I love it. I work with thousands of nodes, and sometimes I have nodes with many parent nodes.

I have modified my heap memory size, but I have reached my limits. I also created a heap dump and received the "problem suspected" message below, which might indicate a memory leak, though I am not sure:

The thread java.lang.Thread @ 0x6000762b8 main keeps local variables with total size 5,578,598,448 (99.42%) bytes.

The memory is accumulated in one instance of "org.eclipse.recommenders.jayes.factor.AbstractFactor[]" loaded by "sun.misc.Launcher$AppClassLoader @ 0x600013810".

The stacktrace of this Thread is available. See stacktrace.


Keywords
org.eclipse.recommenders.jayes.factor.AbstractFactor[]
sun.misc.Launcher$AppClassLoader @ 0x600013810

From the total size I guess you can see how large the Bayesian nets I am dealing with are, and this is only one of the smallest ones :)

As a solution, I wondered whether using Apache Spark would solve this problem. I read that Spark supports many machine learning tools and that they all have integrations. So I was wondering whether such an integration already exists for Jayes, or whether it is possible to use Bayesian nets on a distributed system at all. If it can be distributed, how can it be distributed to the slaves in a distributed system?

Best Regards,

Ekin

