I am unable to generate the p2 metadata when the plugins/ and features/ directories are present. The process consumes 100% CPU time, artifacts.jar grows and returns 10 zero bytes and it simply never ends. If I remove the plugins/ and features/ directories, it takes less than one second to do, but I'm not sure this is correct. I have tried multiple Java VMs on Linux x86_64 and Linux ppc, using Eclipse R-3.4.2 and S-3.5M6 with the same results. The actual command I'm using is: eclipse/eclipse -application org.eclipse.equinox.p2.metadata.generator.EclipseGenerator -updateSite /home/babel-working/live/output/galileo/ -site file:/home/babel-working/live/output/galileo/site.xml -metadataRepositoryName "Babel language packs update site" -append -reusePack200Files -vmargs -Xmx256m
It may be worth adding that Babel's site.xml for Galileo is 178K, and there are 460 files in features/ and 26,223 in plugins/.
The plugins and features directories are needed. The generator reads all the plugin manifests and the feature definitions, and pushes all the info into artifacts.jar and contents.jar. I believe you should assign this bug to Pascal Rapicault, or at least ask him what next steps we should take regarding this problem.
Pascal, could you please take a look and advise? Thanks!
> artifacts.jar grows and returns 10 zero bytes and it simply never ends.

That should be: artifacts.jar grows and returns to zero bytes. cc'ing Gabe, as he was also investigating this.
It looks like a scalability problem in the P2 metadata generator. I've tried running the metadata generator as an Eclipse application, in debug mode, and discovered that it is saving the entire artifact repository once for each artifact, all 26000 of them. It thus hits this breakpoint over and over:

Thread [main] (Suspended (breakpoint at line 877 in SimpleArtifactRepository))
	SimpleArtifactRepository.save() line: 877
	SimpleArtifactRepository.addDescriptor(IArtifactDescriptor) line: 309
	Generator.publishArtifact(IArtifactDescriptor, File[], IArtifactRepository, boolean, File) line: 1344
	Generator.publishArtifact(IArtifactDescriptor, File[], IArtifactRepository, boolean) line: 1333
	Generator.generateBundleIUs(BundleDescription[], Generator$GeneratorResult, IArtifactRepository) line: 408
	Generator.generate() line: 345
	EclipseGeneratorApplication.generate(EclipseInstallGeneratorInfoProvider) line: 380
	EclipseGeneratorApplication.run(EclipseInstallGeneratorInfoProvider) line: 364
	EclipseGeneratorApplication.run(String[]) line: 332
	EclipseGeneratorApplication.start(IApplicationContext) line: 386
	EclipseAppHandle.run(Object) line: 194
	EclipseAppLauncher.runApplication(Object) line: 110
	EclipseAppLauncher.start(Object) line: 79
	EclipseStarter.run(Object) line: 368
	EclipseStarter.run(String[], Runnable) line: 179
	NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
	NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
	DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
	Method.invoke(Object, Object...) line: 616
	Main.invokeFramework(String[], URL[]) line: 559
	Main.basicRun(String[]) line: 514
	Main.run(String[]) line: 1287
	Main.main(String[]) line: 1263

Each time I hit that breakpoint, there is one more artifact in the artifacts.xml file. The for-loop which iterates through all the bundles is several stack levels higher, in Generator.generateBundleIUs().
Perhaps SimpleArtifactRepository.addDescriptor() should not perform an implicit save(), and Generator.generateBundleIUs() could tell the SimpleArtifactRepository to save() outside the for-loop. But I'm not sure if IArtifactRepository exposes the save() method, so there might be more to it than that. And of course it might not be safe to change the existing API. In any case, I suggest this bug be assigned to the Equinox team.
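The cost of the implicit save can be sketched in a tiny self-contained example. The class and method names below are hypothetical stand-ins, not p2's actual code: they just illustrate how an implicit save() on every addDescriptor() turns n adds into n full repository writes, while an explicit save outside the loop needs only one.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch (hypothetical names, not p2's actual classes) of why the
// generator was so slow: an implicit save() on every addDescriptor() rewrites
// the whole repository index per artifact, so n adds cost n full writes.
class SketchRepository {
    final List<String> descriptors = new ArrayList<>();
    int saveCount = 0;          // stands in for full serializations of artifacts.xml
    boolean implicitSave = true;

    void addDescriptor(String d) {
        descriptors.add(d);
        if (implicitSave) {
            save();             // per-add save, as the breakpoint above shows
        }
    }

    void save() {
        saveCount++;            // a real save rewrites the entire, growing file
    }
}

public class BatchSaveSketch {
    public static void main(String[] args) {
        SketchRepository perAdd = new SketchRepository();
        for (int i = 0; i < 1000; i++) perAdd.addDescriptor("bundle-" + i);

        SketchRepository batched = new SketchRepository();
        batched.implicitSave = false;
        for (int i = 0; i < 1000; i++) batched.addDescriptor("bundle-" + i);
        batched.save();         // one explicit save after the loop

        System.out.println(perAdd.saveCount + " saves vs " + batched.saveCount);
        // prints "1000 saves vs 1"
    }
}
```

With 26,000+ bundles, and each save rewriting an ever-larger file, the per-add strategy is effectively quadratic in total bytes written, which matches the "it simply never ends" behaviour reported above.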
Reassigning to p2.
Downgrading severity, as it was relative to Babel. Sean: thanks for your analysis! p2 team: comment 0 and comment 5 contain all the meat and gravy here.
Changing this is non-trivial and will not happen in this release. Note also that this code has not changed in months, so there may be another reason for your problems. We also need to see if we have the same issue in the publisher (which I believe we do).
I spent a little bit of time trying to generate the metadata on my local box. When the plugins and features folders were full, the process didn't complete. So I tried reducing the size of the features folder to only one project (keeping all the different translations for it) and the process didn't complete. Then I tried reducing the number of files in the plugins folder and the process did complete (it took about 30 seconds with 3,000 plugin files out of 30,000 or so). This just confirms that there is an issue with the metadata generation process scaling, but we already knew this. One observation I had was that we are probably doing a lot of redundant work by having one copy of each project in the features folder for each language Babel defines. It seems each of those files should be mostly the same? I am not sure how to avoid the extra work, but it seemed worth noting.
I have had a go at fixing this, although it might be better classed as a workaround. It seems to work: 40 seconds to generate the Babel metadata from site.xml, rather than forever. As an Eclipse plugin novice, I started by importing the relevant p2 plugins from the Plug-ins view "as source", inside my own Eclipse SDK instance. It might take me a little while to work out how to make a clean patch from my modified source. I'll tackle it tomorrow, but let me know if I'm missing some easy shortcut like "PDE/Generate patch against current platform". That would be nice.
>let me know if I'm missing some >easy shortcut like "PDE/Generate patch against current platform". There is. Right click on the project, Export > Deployable Plugins and Fragments, and then select "Install into host". This will create a patch and install it into the currently running platform.
I think Sean meant a cvs patch, not a feature patch to install his changes :) You will need the real source from cvs: dev.eclipse.org:/cvsroot/rt/org.eclipse.equinox/p2/bundles/* You can use "Check Out As..." to check it out into a different project name. I tried selecting both the imported project and the cvs project and comparing them, but it just showed whole files as changed because of whitespace. Also, I couldn't see any way to make a patch that way.
http://wiki.eclipse.org/Equinox_p2_Getting_Started_for_Developers http://wiki.eclipse.org/Equinox/p2
Created attachment 135911 [details] Patch to fix metadata generator in Ganymede R34x Thanks for the patch generation tips; fortunately I didn't get formatting changes in the .java files, although I had to ignore a lot of other changes. This is the R34x version.
Created attachment 135912 [details] Patch to fix metadata generator in Galileo (based on 3.5M7) I tried using CVS HEAD, but the plugins I was changing depended on an update to another plugin I couldn't locate in CVS. So this is generated against 3.5M7; hopefully that will be pretty close. I hope we can get this in (or something like it) before too long. Babel langpacks don't work at all for me in 3.5, unless they come with P2 metadata. Either this bug, or bug 276306, probably deserves a higher severity than normal.
The approach looks promising. I have just looked at the patch (I didn't apply it, although I'm pretty sure it will apply) and it sets a property on the repository (IArtifactRepository.DISABLE_IMPLICIT_SAVE), and if this is set, the artifacts are not written out incrementally, but rather at the end. However, this patch is written against the generator, I wonder if we should also see how this affects the publisher.
My first comment is that the patch is an API change, and we are too late in 3.5 for that. Note that the explicit save is not strictly necessary; SimpleArtifactRepository#setProperty() will do a save(). So this could instead work as follows:
1) setProperty(disableSave, true)
1b) implicit save() does not occur
2) Do stuff; save()'s don't happen
3) setProperty(disableSave, false)
3b) implicit save() does occur
Here, anybody setting the property should really do it in a try block with a finally block that removes the property. John, what do you think of this?
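The property-based flow above can be sketched in isolation. This is illustrative only: the class, property key, and method bodies below are hypothetical, not the real p2 repository implementation, but they show how setProperty() itself flushes the pending changes, so no new save() API is needed.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the property-based save toggle described above. The class name
// and property key are hypothetical; the real behaviour lives in p2's
// SimpleArtifactRepository.
public class DisableSaveSketch {
    static final String DISABLE_SAVE = "repository.disableSave"; // hypothetical key

    final Map<String, String> properties = new HashMap<>();
    int saveCount = 0;

    boolean saveDisabled() {
        return "true".equals(properties.get(DISABLE_SAVE));
    }

    void setProperty(String key, String value) {
        properties.put(key, value);
        save(); // setProperty always attempts a save, so step 3 flushes everything
    }

    void addDescriptor(String descriptor) {
        save(); // the usual implicit save; a no-op while the property is set
    }

    void save() {
        if (!saveDisabled()) {
            saveCount++;
        }
    }

    public static void main(String[] args) {
        DisableSaveSketch repo = new DisableSaveSketch();
        repo.setProperty(DISABLE_SAVE, "true");
        try {
            for (int i = 0; i < 1000; i++) {
                repo.addDescriptor("bundle-" + i); // no implicit saves happen here
            }
        } finally {
            repo.setProperty(DISABLE_SAVE, "false"); // one real save, even on failure
        }
        System.out.println(repo.saveCount); // prints 1
    }
}
```

The try/finally in main() mirrors the recommendation in the comment: whoever sets the property is responsible for clearing it again so the repository is never left in the non-saving state.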
Andrew, if we can do this without adding a new method, then +1. However, is this really an API change? The docs say that extenders should extend the abstract repository (not the interface), so extenders would not have to change their code unless they wanted to handle the explicit save. (although maybe it turns out that most repo providers will have to implement this in practice).
I'm just asking out of curiosity. I think we should avoid adding new methods to our interfaces if we can (which appears to be the case, as you suggest).
> However, is this really an API change? It is an API change, but not a breaking API change. We'd like to avoid or minimize even non-breaking API changes at this point in the release. Andrew's property-based suggestion sound promising though. The only new API in this case would be a new property constant.
I missed the javadoc that says: This interface is not intended to be implemented by clients. Artifact repository implementations must subclass {@link AbstractArtifactRepository} rather than implementing this interface directly. This would have been a bigger problem for people not extending AbstractArtifactRepository.
API or not, at this point minimizing changes like this is "a good thing" if possible. There needs to be good justification, especially in the face of alternatives.
(In reply to comment #22) > API or not API aside, at this point minimizing change like this is "a good > thing" if possible. There needs to be good justification espeically in the > face of alternatives. Is there an alternative way of generating p2 metadata which could be used by Babel? Or are you saying that the metadata generator (and the publisher, I suppose) could use the existing API in a more scalable way? About the only possibility that springs to mind is using the addDescriptor() which takes an array and saves to disk after processing the array. I suppose that might be workable, as long as the array doesn't get too big. Or is there a completely different API which could be used?
I have been able to generate the metadata for the Babel project. I was reading the help docs on generating P2 metadata [1] and altered the command Denis was using in comment #1 to this: eclipse/eclipse -nosplash -application org.eclipse.equinox.p2.metadata.generator.EclipseGenerator -source /shared/technology/babel/test-updates/galileo/ -metadataRepository file:/shared/technology/babel/test-updates/galileo/repository/ -metadataRepositoryName "My Update Site" -artifactRepository file:/shared/technology/babel/test-updates/galileo/repository/ -artifactRepositoryName "My Artifacts" -publishArtifacts -publishArtifactRepository -compress -noDefaultIUs -vmargs -Xmx256m While it took a long time to complete (6 1/2 hours), I was confident it would finish. The reason I felt it would complete is that I could see it copying all the plugins (30,951 of them) from the 'source' directory into the 'repository' directory. Once the process finished I had 2 new files: artifacts.jar (1.3 megs) and content.jar (4.5 megs). I have given the files generated to Kit to take a look at. If there is interest in looking at them I can attach them to this bug. [1] http://help.eclipse.org/stable/index.jsp?topic=/org.eclipse.platform.doc.isv/guide/p2_metadata_generator.html
> While it took a long time to complete (6 1/2 hours) I tried your command on the last Babel weekly build (for Galileo) and after 14 hours of intense crunching, it was about 70% complete, judging by the number of files in the repository/plugins directory. I ended up stopping it because this just generates tons of load on the build machine.
How many plug-ins and features do you have? I have a hard time understanding how this can take more time than doing large builds.
942 features, 41292 plugins.
(In reply to comment #24) > > I have given the files generated to Kit to take a look at. If there is > interest in looking at them I can attach them to this bug. I tried to use the metadata that I was able to generate locally (using the method I described in comment #24). After waiting for Eclipse to process the local repository for a few minutes, it came back saying there were no categories defined. The 'site.xml' file has categories (and 'category-def') defined. So even though I was able to generate the metadata, it doesn't look like it's working. Any ideas?
Created attachment 137885 [details] Error trying apply second patch (In reply to comment #15) > Created an attachment (id=135912) [details] > Patch to fix metadata generator in Galileo (based on 3.5M7) > > I tried using CVS HEAD, but the plugins I was changing depended on an update to > another plugin I couldn't locate in CVS. So this is generated against 3.5M7; > hopefully that will be pretty close. > > I hope we can get this in (or something like it) before too long. Babel > langpacks don't work at all for me in 3.5, unless they come with P2 metadata. > Either this bug, or bug 276306, probably deserves a higher severity than > normal. > I took a stab at applying the second patch today. I checked out the P2 bundles (dev.eclipse.org:/cvsroot/rt/org.eclipse.equinox/p2/bundles/) into my workspace. When I tried to apply the second patch to the code, I got a few errors about files missing (see attached screen shot). So I was not able to apply the patch and test the generation of the metadata.
Created attachment 137943 [details] Patch to fix metadata generator in Galileo (based on HEAD) I've just used a fresh workspace to check out org.eclipse.equinox.p2.artifact.repository and org.eclipse.equinox.p2.metadata.generator from CVS HEAD and applied the M7 patch. (I'm running under RC3 now, so HEAD compiles okay.) First time around, it didn't seem to recognise it as a workspace-rooted patch (which meant it couldn't find any of the files), but it's not doing that now (?!). (As a workaround, I applied the patch twice: once for each project, ignoring 1 leading path segment.) Anyway, I've now re-generated the patch against HEAD, and Eclipse isn't having any trouble applying it, reversing it, etc. According to the error message you got, the relevant projects weren't in your workspace. Could they have been given different project names? I'll add a team project set.
Created attachment 137945 [details] Team project set which will check out the plugins affected by the patch
(In reply to comment #28) > I tried to use the metadata that I was able to generate locally (using the > method I described in comment #24). After waiting for Eclipse to process the > local repository for a few minutes, it came back saying there were no > categories defined. The 'site.xml' file has categories (and 'category-def') > defined. > > So even though I was able to generate the metadata, it doesn't look like it > working. Any ideas? Going on http://wiki.eclipse.org/Equinox_p2_Metadata_Generator#Arguments_describing_the_input , I'm not sure you will get categories unless you use the "-site" option. Understanding how the command line options interact is a bit of a black art. It is to me, anyway.
(In reply to comment #32) > (In reply to comment #28) > > I tried to use the metadata that I was able to generate locally (using the > > method I described in comment #24). After waiting for Eclipse to process the > > local repository for a few minutes, it came back saying there were no > > categories defined. The 'site.xml' file has categories (and 'category-def') > > defined. > > > > So even though I was able to generate the metadata, it doesn't look like it > > working. Any ideas? > > Going on > http://wiki.eclipse.org/Equinox_p2_Metadata_Generator#Arguments_describing_the_input > , I'm not sure you will get categories unless you use the "-site" option. > Understanding how the command line options interact is a bit of a black art. > It is to me, anyway. > Funny you should mention the -site option. I was reading that same page a few days ago and I tried to generate the metadata with this command (pointing site to the site.xml file): eclipse/eclipse -nosplash -application org.eclipse.equinox.p2.metadata.generator.EclipseGenerator -source /shared/technology/babel/test-updates/galileo/ -site /shared/technology/babel/test-updates/galileo/site.xml -metadataRepository file:/shared/technology/babel/test-updates/galileo/repository/ -metadataRepositoryName "My Update Site" -artifactRepository file:/shared/technology/babel/test-updates/galileo/repository/ -artifactRepositoryName "My Artifacts" -publishArtifacts -publishArtifactRepository -compress -noDefaultIUs -vmargs -Xmx256m But when I pointed P2 to my local update site it still complained about having no categories defined.
I must have been at the right place and the right time yesterday, because I intercepted this gem on the rt-pmc list: "The RAP team was using the old metadata generator which does not have p2.inf support. They switched to the publisher (all the cool kids are doing it...) and are happy now." [1] The same p2 publisher is mentioned in a blog post today: http://eclipsesource.com/blogs/2009/06/04/who-writes-your-metadata-me-i-use-a-publisher/ So ... In order for Babel to be cool kids, should we be using the cool Publisher for Babel, instead of the old, outdated metadata generator? Does this bug become moot? [1] http://dev.eclipse.org/mhonarc/lists/rt-pmc/msg00706.html
(In reply to comment #8) > We also need to see if we have the same issue in the publisher (which I believe > we do). > Good question Denis :-). I imagine the problem exists there too. (This bug is even tagged [publisher]).
I was doing some testing and it appears this is still a problem in the publisher. I'll mark this M1.
Ian, I'm bumping these again to M3, but please review these bugs and pick a reasonable target (or remove the target if you have no immediate plans to work on them).
I'm looking at this bug now, and I think we should look at how we handle the metadata and do the same for the artifacts. In the publisher, we queue up all the IUs and write them all at once (at the end), so we only have one write operation. However, for the artifacts, we add them to the repo every time we find one. Each time we add an artifact we reload the XML, so tens of thousands of bundles takes quite a bit of time. That approach may be a bit more involved than Sean's approach, but I think it fits with our existing API a bit better. Thoughts?
I don't think that metadata and artifacts can be handled the same way, as artifacts have to carry actual bytes. I am also worried about the explicit save model for the artifact repository and the impact this has on backward compatibility and robustness (e.g. what happens when the process is killed in the middle of a save?). Now, stepping back from the problem, it seems to me that with the addition of this API we are papering over a poorly scalable implementation, and I'm wondering if we would not be better off fixing the implementation of the artifact repository.
(In reply to comment #39) > I don't think that metadata and artifacts can be handled the same way as > artifacts have to carry actual bytes. Also I am worried about the explicit save > model for the artifact repository and the impact this has on backward > compatibility, robustness (e.g. what happens when the process is killed in the > middle of a save). > Now stepping back from the problem, it seems to me that with the addition of > this API we are palliating for a poorly scalable implementation and I'm > wondering if we would not be better off fixing the implementation of the > artifact repository. Any ideas on what to fix? Simple artifact (and metadata) repositories invoke a save each time something is added to them. This helps with transactions, but doesn't scale. To help scale, we suggest using the add(SomeType[]) method, where a bunch of things are added at once. Other than the two options discussed here: 1. Explicit save 2. Queue up everything and write them at once Do you see others? We have a patch for #1, but I agree, explicit save has problems. I'll take a stab at #2 and see what problems fall out.
Created attachment 150051 [details] org.eclipse.equinox.p2.artifactrepository.patch Here is a patch that uses a runnable to execute all the artifact updates without saving until the end. This patch is not done yet, but I'm putting it here so I don't lose it. I will commit this to the branch this week.
Created attachment 150052 [details] mylyn/context/zip
This patch improves the performance of the publisher, although I'm not seeing the 40 sec times that Sean was seeing. Just generating the MD5 hashes for thousands of bundles takes a few minutes. However, I notice that Sean's patch was against the generator, and maybe we weren't doing MD5 hashes then. Here are some numbers (on about 4K bundles):

Before (with artifacts): 917 sec
After (with artifacts): 575 sec
Before (w/o artifacts): 755 sec
After (w/o artifacts): 329 sec

and if I remove the MD5 hashes I get this down to 122 sec.

Now for the actual patch. It works by adding an #executeBatch(Runnable process) method to IArtifactRepository. There are a few questions:
1. Should this be on IRepository?
2. How do we handle errors?
a. Should save() happen in a finally block (should we always try to save, even if the runnable fails)?
b. If yes, what happens if save() (also) throws an exception (out of disk space, for example)? What exception should be propagated to the caller: the one from the original process, or the one from save()? For example:

try {
    throw new NullPointerException();
} finally {
    throw new ClassCastException();
}

As a caller, which exception are you more interested in handling? (Or does it not matter?)
(In reply to comment #43) > b. If yes, what happens if save() (also) throws an exception (out of disk > space for example)? What exception should be propagated to the caller? (the > one from original process, or the one from save())? You never want a finally block to throw an exception, since this will cause any original exception to be completely lost. Often if there is a second exception in the finally block it is a side-effect of the original failure, so you don't want to lose that. I see a few options here: 1) Transactional semantics: only save if the runnable succeeds, and if there is an exception any change to the repository made by the runnable is discarded. 2) Best effort semantics: Always save, but catch & log any exception thrown by the save itself 3) Combined result: Catch exceptions from the runnable, and catch exceptions from the save, and return a MultiStatus with the combined result. I suggest using SafeRunner in this case to allow severe errors to propagate immediately. I don't feel strongly but would lean towards 3)
I would also lean towards (3). If you've run a batch of several thousand adds, and one fails, it would be a big waste to not save the successes.
(In reply to comment #44) > You never want a finally block to throw an exception, since this will cause any > original exception to be completely lost. Often if there is a second exception > in the finally block it is a side-effect of the original failure, so you don't > want to lose that. I see a few options here: Thanks for the explanation John. This makes perfect sense. > > 1) Transactional semantics: only save if the runnable succeeds, and if there is > an exception any change to the repository made by the runnable is discarded. I considered this but it can't really be done. Since we don't know what was done during the runnable, it will be hard to undo it. In our current implementation if someone added a bunch of artifact descriptors and then something failed, these descriptors would still be "added". The next repository operation would likely invoke a save, and that might not be what's desired. Essentially we left the artifact repo in an unstable state. > > 2) Best effort semantics: Always save, but catch & log any exception thrown by > the save itself This is what I was thinking. > > 3) Combined result: Catch exceptions from the runnable, and catch exceptions > from the save, and return a MultiStatus with the combined result. I suggest > using SafeRunner in this case to allow severe errors to propagate immediately. > I hadn't thought of this, and now that you mention it, it seems ideal. > I don't feel strongly but would lean towards 3) I'll put together a solution for #3. Thanks again for the input.
A few more interesting questions:
1. I looked at using SafeRunner, but I'm not really sure what that will accomplish. Do you mean that the caller should pass in a SafeRunnable? If so, then we really can't catch any exceptions and pass them back in a status (since the caller will already receive these exceptions).
2. Where should this code live? When I started, I put it in SimpleArtifactRepository and CompositeArtifactRepository. The AbstractArtifactRepository simply calls run on the runnable (it doesn't muck around with saving). However, another option would be to provide a setSaveEnabled(boolean) and a save() method on the abstract class. The code could then live in AbstractArtifactRepository, and clients would be free to implement save / setSaveEnabled(boolean) as they see fit (by default, they do nothing).
3. What about a batch process on metadata repositories?
Created attachment 150882 [details] org.eclipse.equinox.p2.artifact.repository.patch This patch wraps the calling of the runnable in a try / catch / finally, and returns any exceptions as an IStatus. This patch also includes some test cases with repositories that fail in different ways.
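The "combined result" semantics that this patch implements can be sketched as follows. All names here are hypothetical stand-ins (the real patch works against IArtifactRepository and returns an IStatus/MultiStatus): run the batch with implicit saves suspended, always attempt the final save, and report failures from both in one combined result.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of option 3: run the batch with implicit saves suspended, always
// attempt the final save, and collect failures from both into one result.
public class ExecuteBatchSketch {
    static boolean saveFails = false; // simulates e.g. running out of disk space

    static void save() {
        if (saveFails) throw new IllegalStateException("out of disk space");
    }

    // Stand-in for a MultiStatus: the batch is OK iff no errors were collected.
    static List<String> executeBatch(Runnable process) {
        List<String> errors = new ArrayList<>();
        try {
            process.run();                     // descriptors are added, not saved
        } catch (Throwable t) {
            errors.add("batch failed: " + t.getMessage());
        } finally {
            try {
                save();                        // still try to persist any successes
            } catch (Throwable t) {
                errors.add("save failed: " + t.getMessage());
            }
        }
        return errors;                         // both failures survive, neither masks the other
    }

    public static void main(String[] args) {
        System.out.println(executeBatch(() -> {}).isEmpty()); // prints true

        saveFails = true;
        List<String> errors =
                executeBatch(() -> { throw new RuntimeException("bad descriptor"); });
        System.out.println(errors.size()); // prints 2: the batch error and the save error
    }
}
```

Note that because save() is caught inside the finally block rather than thrown from it, the original runnable failure is never masked, which addresses the concern raised in comment 44.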
Created attachment 150883 [details] mylyn/context/zip
- SimpleArtifactRepository#save(): we should short-circuit on the first line
- Wonder if we should not catch Throwable rather than just Exception when we execute the Runnable
- Describe the restrictions on the Runnable (for example, the Runnable should not execute any operations on the repository in another thread)
- Consider adding a lock at the file level, since we are now opening a bigger window during which the repository can be modified. Maybe we want to treat this in another bug.

Tests:
- Need a test that actually does several adds and removes, then a save
- Need a test that validates that the save flag is set back properly
> 3. What about a batch process on Metadata Repositories? This could be a nice addition for sake of consistency.
I'm going to drop this in as soon as head re-opens after M5 is declared.
Created attachment 157998 [details] updated patch This patch is now updated with respect to HEAD (after all the API changes). I still need to add a few more test cases and documentation before we can release.
Created attachment 158248 [details] updated patch This patch addresses most of Pascal's concerns:
1. We now short-circuit on the first line
2. Catch Throwable
3. Document the method
4. Added a bunch of tests (including multiple adds, removes, with and without exceptions, and checking the flag)
The only thing left is the file-level lock. We can open another bug for that, as it's somewhat unrelated (I'm not sure if the actual file open time is any longer; we still only open/write/close at the end).
Released in head. I've opened Bug 301902 to add the API to metadata repo and bug 301903 to consider the file level lock.
Babel update site has 45,714 plugins (language fragments). Babel UpdateSitePublisher generation time now reduced from 46230 seconds (~13 hours) to 75 seconds, unbelievable! Tested Babel Galileo update site with Eclipse 3.5.2 RC2, everything is fine! Thanks!
(In reply to comment #56) > Babel update site has 45,714 plugins (language fragments). Babel > UpdateSitePublisher generation time now reduced from 46230 seconds (~13 hours) > to 75 seconds, unbelievable! > Wow! Ok, that's just cool. Thanks for letting me know about that.
Woah, that is awesome.