Created attachment 237462
log from trying to install everything at candidate M3 site

I did an "install everything" test from the candidate M3 site, starting with Eclipse SDK S-4.4M3-201310302000 and installing everything at the "staging" site (except runtime components). Actually I used the repo that staging is copied to for the composite, if that matters: http://download.eclipse.org/releases/luna/201311150900/

It appeared to take an extra long time installing everything but, worse, Eclipse would not start up after that, failing with a "stack overflow" error (java.lang.StackOverflowError). I will attach the log; it is the one from the original "install everything" run, followed by clicking "restart now" when that was complete. I tried just re-running Eclipse, but got the same result. Not sure if this is a pure volume issue, or if there is something that is causing an infinitely recursive call.
I did even try using -clean, but same result :) I've set to "major" for now, since "installing everything" is (hopefully) rare, but might be a "blocker" if found to be due to some particular (smaller set) combination of (valid) import/exports, and not just sheer volume.
I will investigate
Doing the install now. I selected everything except the EclipseRT target components. I assume that is the same as David did. p2 detected an issue and searched for alternatives. This took many minutes. Then it detected that it could not install a set of things (about 7-10) but offered to install the rest. I am letting that complete now. But while I waited I wanted to see if David saw the same thing.
(In reply to Thomas Watson from comment #3)
> Doing the install now. I selected everything except the EclipseRT target
> components. I assume that is the same as David did. p2 detected an issue
> and searched for alternatives. This took many minutes. Then it detected
> that it could not install a set of things (about 7-10) but offered to
> install the rest. I am letting that complete now. But while I waited I
> wanted to see if David saw the same thing.

Yes, some stuff in ACTF, I think, is "windows only" ... I was using Linux. In my original test, I was using M3 + the latest I-build and saw some messages in that p2 resolution that in effect said "not installing JDT, SDK (or other platform features) because a more recent version is already installed". I repeated the test with "pure" M3, just to make sure that wasn't related ... and then only saw the ACTF feature filtered out as "not installable due to filters" or a similar message.
Here is one thing that is causing endless recursion. http://git.eclipse.org/c/sirius/org.eclipse.sirius.git/tree/plugins/org.eclipse.sirius/META-INF/MANIFEST.MF#n164 org.eclipse.sirius requires itself AND reexports itself! org.eclipse.sirius;visibility:=reexport I have to say I have never imagined ever wanting to do that! What in the world does it mean!
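A check for this kind of manifest mistake could be sketched as follows. This is only an illustration: `SelfRequireCheck` and its methods are hypothetical names, not part of Equinox, PDE, or Sirius, and the parsing is deliberately simplified (it only splits on commas outside double quotes so that version ranges such as `"[3.5,4.0)"` stay intact).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not an Equinox or PDE API. */
final class SelfRequireCheck {

    /** Splits a Require-Bundle header value on commas that are outside double quotes. */
    static List<String> splitClauses(String header) {
        List<String> clauses = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : header.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;
            }
            if (c == ',' && !inQuotes) {
                clauses.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        String last = current.toString().trim();
        if (!last.isEmpty()) {
            clauses.add(last);
        }
        return clauses;
    }

    /** Returns the symbolic name of a clause: the text before the first ';'. */
    static String symbolicName(String clause) {
        int semi = clause.indexOf(';');
        return (semi < 0 ? clause : clause.substring(0, semi)).trim();
    }

    /** True if the Require-Bundle header names the bundle's own symbolic name. */
    static boolean requiresItself(String bundleSymbolicNameHeader, String requireBundleHeader) {
        String self = symbolicName(bundleSymbolicNameHeader);
        for (String clause : splitClauses(requireBundleHeader)) {
            if (symbolicName(clause).equals(self)) {
                return true;
            }
        }
        return false;
    }
}
```

Calling `SelfRequireCheck.requiresItself` with the bundle's `Bundle-SymbolicName` value and a `Require-Bundle` value that contains `org.eclipse.sirius;visibility:=reexport` would flag the case described above.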
I opened bug 421765 for the sirius manifest issue. The framework should prevent the endless recursion, but at this point I'm not sure what should be done. One option that seems the best to me is to fail installation of a bundle that requires itself. I don't think the spec is clear that requiring yourself is not allowed though so that could break folks unexpectedly.
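For the other option, having the framework itself tolerate a self-requiring bundle, one common way to stop endless recursion when following reexported requirements is to remember which bundles have already been expanded. The sketch below is not the Equinox fix; `ReexportWalker` and the map-based model of the wiring are assumptions made purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical model: each bundle name maps to the names of the bundles it
 * requires with visibility:=reexport. Not an Equinox data structure.
 */
final class ReexportWalker {

    /** Collects every bundle reachable from start through reexported requirements. */
    static Set<String> visibleBundles(String start, Map<String, List<String>> reexports) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (!visited.add(current)) {
                // Already expanded: this is what stops self-references and cycles.
                continue;
            }
            for (String required : reexports.getOrDefault(current, Collections.<String>emptyList())) {
                pending.push(required);
            }
        }
        return visited;
    }
}
```

With a visited set the walk terminates even if a bundle lists itself, so the self-require becomes a no-op rather than a stack overflow. Whether it should be silently tolerated or rejected at install time is the open question in this comment.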
(In reply to Thomas Watson from comment #5)
> Here is one thing that is causing endless recursion.
>
> http://git.eclipse.org/c/sirius/org.eclipse.sirius.git/tree/plugins/org.
> eclipse.sirius/META-INF/MANIFEST.MF#n164
>
> org.eclipse.sirius requires itself AND reexports itself!
>
> org.eclipse.sirius;visibility:=reexport
>
> I have to say I have never imagined ever wanting to do that! What in the
> world does it mean!

Probably just a typo, though they might have been thinking of "importing what you export", http://blog.osgi.org/2007/04/importance-of-exporting-nd-importing.html but I believe that literally just applies to packages, not to Require-Bundle.
A new Sirius build with the offending line removed should be available in a few minutes (see https://hudson.eclipse.org/sirius/job/sirius-master/61/). I hope this is enough to fix the immediate issue, and will investigate how we got in this situation and why it went unnoticed until now (we never had any installation problems with Sirius until now).
*** Bug 421801 has been marked as a duplicate of this bug. ***
Just to keep folks informed. I have a few simple options to fix the endless recursion that happens with the sirius scenario from comment 5. But there are still other issues I need to work out for very large resolve operations. When everything is installed at the same time we have about 2800 bundles resolving at the same time. A number of uses-constraint issues are found and the resolver attempts to solve them, but the set of options has exploded the algorithm in the felix resolver. This is taking up loads of unexpected heap and processor time for me and is now resulting in OOM errors. It is likely I will need to split the resolve into chunks instead of doing a big bang resolve. That is what I am investigating now for M4.
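The idea of resolving in chunks rather than in one big bang can be sketched roughly like this. The comment does not say how the chunks are chosen, so the fixed-size batching here is an assumption, as are the names `ChunkedResolve`, `chunks`, and `resolveInChunks`; a real resolver would probably need to group bundles with their dependencies in mind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of batching a large resolve; not the Equinox implementation. */
final class ChunkedResolve {

    /** Splits the input into consecutive batches of at most chunkSize elements. */
    static <T> List<List<T>> chunks(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            result.add(new ArrayList<>(items.subList(from, to)));
        }
        return result;
    }

    /** Hands each batch to the supplied resolve step, one batch at a time. */
    static <T> void resolveInChunks(List<T> unresolved, int chunkSize, Consumer<List<T>> resolveStep) {
        for (List<T> batch : chunks(unresolved, chunkSize)) {
            resolveStep.accept(batch);
        }
    }
}
```

The point of the design is that each resolve step only has to explore the permutations for its own batch, against whatever earlier batches already wired, which should bound the heap needed at any one time.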
The javax.annotations bundle is really introducing lots of uses conflicts. I finally got an approach that allows the large system to complete the resolve without running out of memory, but there are still lots of unsolvable issues when trying to apply the update to a system that includes installing javax.annotations bundle (particularly in papyrus). This makes for some really long startup times until -clean is used to allow all importers to wire to the javax.annotations bundle.
I released a number of fixes for M4 but need more testing and verification once M4 is closing down. Moving to M5 for more testing and improvements.
We noticed that org.eclipse.birt.jetty.overlay (which is a fragment contributing jetty packages to the system bundle) also causes significant heap consumption in the Felix resolver (creating a lot of org.apache.felix.resolver.Candidates), which leads to an OOM while starting Eclipse. The exact same platform without this fragment works fine. More background at https://issues.jboss.org/browse/JBIDE-15807
(In reply to Mickael Istria from comment #13)
> We noticed that org.eclipse.birt.jetty.overlay (which is a fragment
> contributing jetty packages to system bundle) also causes an important Heap
> consumption in Felix Resolver (creating a lot of
> org.apache.felix.resolver.Candidates) which leads to an OOM while starting
> Eclipse. The exact same platform without this fragment works fine.
> More background at https://issues.jboss.org/browse/JBIDE-15807

I opened bug 422176 to ask what this fragment is for. I have no idea why they have a system bundle fragment that exports jetty packages. No response from them though.

Have you tried with the latest I-Build?
While testing an update from the I20131119-0800 I-Build or earlier to the latest I-Build, I came across an issue where several fragments ended up unresolved. This is because the persisted meta-data for the fragments is missing the new equinox.fragment capability, which is needed to locate fragments for on-demand resolving. The new equinox.fragment namespace was introduced in commit:

http://git.eclipse.org/c/equinox/rt.equinox.framework.git/commit/?id=4eefdb7c23063b4f79b05619160879fe61f1613a

But I neglected to make sure the meta-data we are acting upon has this new capability for fragments. Instead of hacking in the capability in order to "fix" the persisted meta-data, I decided to simply increment the version of the persisted meta-data, which forces a clean operation of the osgi configuration area.

http://git.eclipse.org/c/equinox/rt.equinox.framework.git/commit/?id=34dd34042093037aa0e72bcfc4a2cb1a9e316f36
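The general pattern of bumping a persisted-format version to force a rebuild looks roughly like the sketch below. It is not the Equinox storage code: the class `PersistedState`, the value of `CURRENT_VERSION`, and the byte layout are all assumptions chosen to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical versioned cache; the real Equinox persistence format differs. */
final class PersistedState {

    /** Incremented whenever the persisted layout or required content changes. */
    static final int CURRENT_VERSION = 2;

    /** Writes a version stamp followed by the payload. */
    static byte[] save(int version, String payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(version);
            out.writeUTF(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /**
     * Returns the payload, or null when the stored version differs from
     * CURRENT_VERSION, in which case the caller discards the cache and rebuilds it.
     */
    static String load(byte[] stored) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stored))) {
            int version = in.readInt();
            if (version != CURRENT_VERSION) {
                return null;
            }
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is the one described in the comment: a version bump costs every user one full rebuild on the next start, but it avoids having to patch old data in place.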
(In reply to Thomas Watson from comment #14)
> Have you tried with the latest I-Build?

Using I20131203-0800 and our target platform: the heap goes to 1.2 GB when org.eclipse.birt.jetty.overlay is present vs 350 MB when it's not.
(In reply to Mickael Istria from comment #16)
> (In reply to Thomas Watson from comment #14)
> > Have you tried with the latest I-Build?
>
> Using I20131203-0800 and out target platform: Heaps goes to 1.2 GB when
> org.eclipse.birt.jetty.overlay is present vs 350MB when it's not.

I tested out your scenario and found a couple more bugs, but I am not sure whether fixing them will reduce the overall heap required here:

There was a bug in the felix code that would discard capabilities from fragments in some cases. Fixed with:

http://git.eclipse.org/c/equinox/rt.equinox.framework.git/commit/?id=4eb5b1a47e314d9d73239d294360b427bd946e57

With that felix bug fix I had to fix a bug in equinox code that was returning "resolved" hosts for already resolved fragments, which really messes with the felix resolver. Fixed with:

http://git.eclipse.org/c/equinox/rt.equinox.framework.git/commit/?id=ca804e697a08ccbf8f2b1e206b33a3334e3fa4da

With these two fixes I don't see 1.2 GB being used, but I have not done real measurements of the heap. I am only going off the activity monitor on Mac.
Thanks Thomas. Ping me when you'd like me to run the same scenario on a newer build. FYI, I use VisualVM to monitor the Heap Size when application is running.
(In reply to Mickael Istria from comment #18)
> Thanks Thomas. Ping me when you'd like me to run the same scenario on a
> newer build.
> FYI, I use VisualVM to monitor the Heap Size when application is running.

It would be great if you could try on the latest I-Build I20131209-2000. But I think the heap will likely still grow to resolve org.eclipse.birt.jetty.overlay. But it should get GC'ed after the resolve operation finishes. At least that is what I found with your jboss tools scenario.

I could solve part of that issue if I allowed unresolved providers to get preferred over resolved ones with lower versions. This would correctly wire most importers to the real jetty bundles instead of the strange birt.jetty.overlay one. But this would go against the specification and also hurt in scenarios with bundles that have substitutable exports (export and import the same package).
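The ordering being discussed can be sketched as a comparator. This is an illustration only: `Provider` and `ProviderOrder` are hypothetical types, not Felix or Equinox classes, and the final tie-break on the lower bundle id is my reading of the OSGi rules rather than something stated in this thread.

```java
import java.util.Comparator;

/** Hypothetical stand-in for a package provider (exporter); not a resolver type. */
final class Provider {
    final String bundle;
    final boolean resolved;
    final long bundleId;
    final int[] version; // {major, minor, micro}

    Provider(String bundle, boolean resolved, long bundleId, int major, int minor, int micro) {
        this.bundle = bundle;
        this.resolved = resolved;
        this.bundleId = bundleId;
        this.version = new int[] {major, minor, micro};
    }
}

final class ProviderOrder {

    /** Compares two {major, minor, micro} triples in ascending order. */
    static int compareVersions(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            int c = Integer.compare(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    /** Resolved providers first, then higher version, then lower bundle id (assumed tie-break). */
    static final Comparator<Provider> SPEC_ORDER =
        Comparator.comparingInt((Provider p) -> p.resolved ? 0 : 1)
            .thenComparing((a, b) -> compareVersions(b.version, a.version))
            .thenComparingLong(p -> p.bundleId);
}
```

Under this ordering an already resolved provider with a lower version sorts ahead of an unresolved provider with a higher version, which is why importers can end up wired to the overlay fragment's packages. Dropping the first key would implement the alternative the comment rejects.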
(In reply to Thomas Watson from comment #19)
> It would be great if you could try on the latest I-Build I20131209-2000.
> But I think the heap will likely still grow to resolve
> org.eclipse.birt.jetty.overlay. But it should get GC'ed after the resolve
> operation finishes. At least that is what I found with your jboss tools
> scenario.

I just tried it and saw the same behaviour. Because the application was taking 1.2 GB of RAM, it was too slow and I didn't have time to let it continue until it called the garbage collector. Without jetty.overlay, it still consumes 330 MB (which is acceptable given the amount of stuff in the target application).
Created attachment 238302
log from failed attempt to install all from "staging"

This test/run/log may not be that useful, since it is just against the "staging" repository for M4 (i.e. not everything is "up to date" ... for example, this "staging repo" still has the "jetty overlay" in it), but using our M4 candidate, I20131211-2000, Eclipse still won't start after "installing everything".

But I thought I'd attach the results here in case any of the error messages in the log are useful to you to spot other problems under "extreme conditions".

I'll try again once "jetty overlay" is no longer present. (Also, I did not try using -clean ... just wanted to try a "quick test" ... but, at least, no "stack overflow" -- in fact, not sure why it did not start ... seemed something ended up interfering with the framework itself?).
@David: the issues you see in the log aren't Equinox issues caused by jetty.overlay, but rather inconsistencies in some projects (namely EGF and JWT).
(In reply to Mickael Istria from comment #22)
> @David: the issue you see in log aren't Equinox issue caused by
> jetty.overlay, but more inconsistency in some projects (namely EGF and JWT).

I'm curious to know what the inconsistencies are that cause the issue. Do you have some more insight? There are lots of class-not-found errors for internals from the old framework, namely AbstractBundle. For example:

java.lang.NoClassDefFoundError: org/eclipse/osgi/framework/internal/core/AbstractBundle
	at org.eclipse.egf.core.platform.internal.pde.PlatformBundle.<init>(PlatformBundle.java:60)
Created attachment 238347
install all and restart log without jetty overlay

The jetty overlay no longer seems to be in .../releases/staging, and the log file is not much better (and Eclipse still won't start) after "installing everything". (And, I know, "M4 is not done" ... but it would be good if some "gross" errors could be spotted, to be sure they get fixed in M4.)

[I can see a few minor things to open bugs on (such as the stardust singleton) ... but all the "wiring traces" are hard to read.]
Created attachment 238348
install all and restart log without jetty overlay

Apologies ... the previous one was from the wrong directory ... this is the one I meant to attach.
Created attachment 238349
install all and restart log using -clean

This is the same scenario and install as the previous "long" log, but I started Eclipse with -clean. At least the log is shorter ... maybe it will be easier to understand and "attack". Unfortunately, whatever is going wrong still prevents Eclipse from starting!

Let me know if I can help further in any way. (Even if you find these useless and want me to stop attaching them :)
(In reply to David Williams from comment #26)
> Created attachment 238349 [details]
> install all and restart log using -clean
>
> This is same scenario and install as previous "long" log, but started
> eclipse with -clean. At least the log is shorter ... maybe it will be easier
> to understand and "attack". Unfortunately, what ever is going wrong still
> prevents Eclipse from starting!
>
> Let me know if I can help further in any way. (Even if, you find these
> useless and want me to stop attaching them :)

No, it is useful; it just may take me time to get to investigating it all. I did find that modisco has a bad reprovide=true attribute (bug 424150) that is causing many of the resolver errors in modisco.
I opened bug 424151 to document/discuss the fact that some interim headers/attributes are no longer supported in Luna. This is causing many of the resolution issues.
This one should be fixed now.