Bug 226850 - Numerous errors while packing "SHA1 digest error for META-INF/eclipse.inf"
Summary: Numerous errors while packing "SHA1 digest error for META-INF/eclipse.inf"
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Update (deprecated - use Eclipse>Equinox>p2)
Version: 3.4
Hardware: PC Windows XP
Importance: P3 major
Target Milestone: 3.4 M7
Assignee: Platform-Update-Inbox CLA
Reported: 2008-04-13 20:33 EDT by David Williams CLA
Modified: 2008-04-28 16:22 EDT

Attachments
Summary list of jars that cause exception to be thrown (103.05 KB, text/plain)
2008-04-13 21:17 EDT, David Williams CLA
full log during the run, showing skipped, successful, plus the "error 1" lines. (801.70 KB, text/plain)
2008-04-13 21:29 EDT, David Williams CLA
test case that demonstrates problems (1.20 MB, application/octet-stream)
2008-04-15 02:10 EDT, David Williams CLA
a new, smaller test case (50.00 KB, application/octet-stream)
2008-04-15 16:47 EDT, David Williams CLA
simple patch that works for me (939 bytes, patch)
2008-04-16 04:16 EDT, David Williams CLA
patch (3.17 KB, patch)
2008-04-16 15:51 EDT, Andrew Niefer CLA

Description David Williams CLA 2008-04-13 20:33:57 EDT
There are hundreds of these in the current Ganymede staging site when I try packing with siteOptimizer, and I've just now seen them on our WTP milestone staging site (though I don't recall seeing them before).

So, I'm opening this here in releng, hoping some insight can be given.
 
Using Google, I've seen some discussions from the 3.3 release where these errors were related to Windows vs. Linux end-of-line issues, but I doubt that's the problem here, or at least not all of it. The reason is that some of the failures come from WTP, and we do not use Windows in any part of our build process. Plus, there's an odd pattern in WTP: none of our "WST" level features or plugins are affected, but several (but not all) of our "JST" level ones exhibit the exception.



[createDigestJob] STDERR: Exception in thread "main" java.lang.SecurityException: SHA1 digest error for META-INF/eclipse.inf
[createDigestJob] STDERR:       at sun.security.util.ManifestEntryVerifier.verify(ManifestEntryVerifier.java:253)
[createDigestJob] STDERR:       at java.util.jar.JarVerifier.processEntry(JarVerifier.java:225)
[createDigestJob] STDERR:       at java.util.jar.JarVerifier.update(JarVerifier.java:212)
[createDigestJob] STDERR:       at java.util.jar.JarVerifier$VerifierStream.read(JarVerifier.java:438)
[createDigestJob] STDERR:       at java.io.InputStream.read(InputStream.java:112)
[createDigestJob] STDERR:       at com.sun.java.util.jar.pack.Package$File.readFrom(Package.java:789)
[createDigestJob] STDERR:       at com.sun.java.util.jar.pack.PackerImpl$DoPack.readFile(PackerImpl.java:529)
[createDigestJob] STDERR:       at com.sun.java.util.jar.pack.PackerImpl$DoPack.run(PackerImpl.java:490)
[createDigestJob] STDERR:       at com.sun.java.util.jar.pack.PackerImpl.pack(PackerImpl.java:91)
[createDigestJob] STDERR:       at com.sun.java.util.jar.pack.Driver.main(Driver.java:279)
Comment 1 David Williams CLA 2008-04-13 21:17:48 EDT
Created attachment 95844 [details]
Summary list of jars that cause exception to be thrown 

I assume the "temp" prefix on jars is some part of the internal processing of the jars by site optimizer?
Comment 2 David Williams CLA 2008-04-13 21:23:00 EDT
Another odd aspect of this is that if I go back to the original jars from WTP, most of them pass the "jarsigner -verify" test. There were a few I noticed failing (though in the past, when setting up the system, I would have sworn there were none, but maybe I missed them).

But the oddest thing is that some jars that throw the error on Ganymede staging do _not_ throw the error on the original WTP "tempTestUpdates" site.

So ... I don't think it's only a project problem? (Or, else I'm just seeing things wrong, after staring at it too much). 

Comment 3 David Williams CLA 2008-04-13 21:29:08 EDT
Created attachment 95845 [details]
full log during the run, showing skipped, successful, plus the "error 1" lines. 

The lines starting with "Error 1 was returned .." are the places where the exception is thrown, but those are logged in std err (and are all identical, and just like the one in the first comment in this bug). 

Note, these are runs made from my own "private" test area ... but, I think the "official" runs will be the same.
Comment 4 David Williams CLA 2008-04-13 21:31:31 EDT
Marking as major since having so many failed zips will cause increased bandwidth requirements (i.e. I'd consider it "missing function").

If it turns out there are a lot of invalidly signed jars, that'd be even worse, probably "blocker".
Comment 5 David Williams CLA 2008-04-13 21:34:01 EDT
Bjorn, Denis, have either of you seen this issue before? 

Denis, just out of curiosity, which VM is used to do the signing? 5 or 6? 

Comment 6 David Williams CLA 2008-04-13 21:36:11 EDT
Oh, and I should have documented: it's Eclipse 3.4 M5 that's used during the processing, at least the Ganymede part of the processing. (Is there anything that would require projects to use one version or another? We in WTP also use 3.4 M5.)

Comment 7 Denis Roy CLA 2008-04-14 08:51:15 EDT
> Denis, just out of curiosity, which VM is used to do the signing? 5 or 6? 

Right now signing is:
java version "1.5.0"
Java(TM) 2 Runtime Environment, Standard Edition (build pxp32devifx-20071025 (SR6b))
IBM J9 VM (build 2.3, J2RE 1.5.0 IBM J9 2.3 Linux ppc-32 j9vmxp3223-20071007 (JIT enabled)
J9VM - 20071004_14218_bHdSMR
JIT  - 20070820_1846ifx1_r8
GC   - 200708_10)
JCL  - 20071025


I noticed there's Java 6 on there ... should I be using it to sign instead? 

Comment 8 Bjorn Freeman-Benson CLA 2008-04-14 10:13:47 EDT
(In reply to comment #5)
> Bjorn, have either of you seen this issue before? 

No, I have not seen this problem before.
Comment 9 Kim Moir CLA 2008-04-14 10:17:42 EDT
Regarding comment 7, no ... see

https://bugs.eclipse.org/bugs/show_bug.cgi?id=179315#c14
https://bugs.eclipse.org/bugs/show_bug.cgi?id=179315#c15


I haven't seen these errors before, I'm adding Andrew - he may have suggestions
Comment 10 Andrew Niefer CLA 2008-04-14 14:04:59 EDT
(In reply to comment #2)
> But, most odd thing is, is that some that are throwing the error on Ganymede
> staging do _not_ throw the error on the original WTP "tempTestUpdates" site. 

These jars that do not throw the error on tempTestUpdates will be interesting.
It is the answer to a question I have asked more than once whenever pack200 issues come up :)

So, given some jar that is failing with the digest error exception in ganymede:
1) At the end of the WTP build, after conditioning and signing, is this jar valid?
  - You are saying that yes, jarsigner -verify succeeds.
2) Are the input jars in the ganymede staging valid?
  - The exception implies no, but it would still be interesting to run jarsigner -verify on the input jar directly instead of the temporary one.

3) What is the difference between those 2 jars?  Can you attach some here?
4) How did the jars get from the WTP build (tempTestUpdates?) to the staging area?
5) Are there pack.gz files at tempTestUpdates? (May or may not be relevant)
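
The validity check in questions 1 and 2 can also be scripted. The following is a minimal sketch, not part of siteOptimizer or the jar processor: the class name VerifyJar and the method failingEntries are made up for illustration. It relies only on java.util.jar.JarFile opened with verification enabled, which throws a SecurityException (as in the stack trace above) when a signed entry's digest does not match. Note that an unsigned jar also reads cleanly, and that this does not check certificates the way jarsigner -verify does.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class VerifyJar {

    /**
     * Reads every entry of the jar with verification on and returns a
     * description of each entry that fails its digest check.
     * An empty list means every entry read cleanly (this is also the
     * result for a jar that is not signed at all).
     */
    static List<String> failingEntries(String path) throws IOException {
        List<String> failures = new ArrayList<>();
        // Second argument "true" asks JarFile to verify signed entries.
        try (JarFile jar = new JarFile(path, true)) {
            Enumeration<JarEntry> entries = jar.entries();
            byte[] buffer = new byte[8192];
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                // The digest is only checked once the entry is read to the end.
                try (InputStream in = jar.getInputStream(entry)) {
                    while (in.read(buffer) != -1) {
                        // discard the bytes; only the verification matters
                    }
                } catch (SecurityException e) {
                    failures.add(entry.getName() + ": " + e.getMessage());
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            List<String> failures = failingEntries(path);
            System.out.println(path + (failures.isEmpty() ? ": OK" : ": " + failures));
        }
    }
}
```

Running it against both the jar from the project build and the copy in the staging area would show whether the copy was already broken before packing.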
Comment 11 David Williams CLA 2008-04-14 16:29:02 EDT
(In reply to comment #10)
> (In reply to comment #2)

Some of this will take more investigation and careful logging on my part ... so, if I don't seem to answer your question, I will eventually :) 

> 
> So, given some jar that is failing with the digest error exception in ganymede:
> 1) At the end of the WTP build, after conditioning and signing, is this jar
> valid?
>   - You are saying that yes, jarsigner -verify succeeds.

Yes, I think in most (but not all) cases. 
All have verified in the past, but a quick check now showed some were not. 
Is jarsigner -verify something we should do after every signing run? To make sure it worked? (I had been assuming that "once it worked, it would always work"). 

> 2) Are the input jars in the ganymede staging valid?
>   - The exception implies no, but it would still be interesting to run
> jarsigner -verify on the input jar directly instead of the temporary one.
> 

No, I didn't check carefully, but I think those that throw the exception also do not verify, on that eclipse.inf file. 

> 3) What is the difference between those 2 jars?  Can you attach some here?

Will soon, once I track down and double check things. 

> 4) How did the jars get from the WTP build (tempTestUpdates?) to the staging
> area?

For the most part, update manager is used to create one site from others. 
But, at some points in testing, moving around, etc., rsync might be used. 

> 5) Are there pack.gz files at tempTestUpdates? (May or may not be relevant)
> 

I've re-tried with a completely empty directory (in case there was some issue with existing, old stuff), but got the same result. So, the pack.gz files only exist as they are created.

Comment 12 David Williams CLA 2008-04-15 02:10:57 EDT
Created attachment 96041 [details]
test case that demonstrates problems

Good news and bad news. The attached tar (a little over a meg) is a "slimmed down" set of jars that shows the problem. I tried re-creating the problem on individual jars, but could not. So ... next I tried removing jars (to get a smaller set) and ended up with this set. (It could maybe go smaller, but think it's good representation of the problem, without taking too long). 

I think it is an especially significant example set because as it is there's a number of jars where the exception is thrown. However, if I remove just one of these feature jars ... the org.eclipse.pde feature, then they all work just fine. So, I think this demonstrates some odd "interaction" effect?! 

Now, the bad news, I moved the tar to another linux machine, that had x86 arch, and the problem did not manifest itself (with or without the pde feature). So ... I'm wondering if there is a part of siteOptimizer that depends on "low level" file system optimizations that are not implemented on PPC ... you know, PPC always prints this message: 
Could not load library: liblocalfile_1_0_0.so.  
This library provides platform-specific optimizations for certain file system operations.  
This library is not present on all platforms, so this may not be an error.  
The resources plug-in will safely fall back to using java.io.File functionality.

I'm hoping others can try to reproduce with this tar file, both on PPC and x86 system. (There's a script, testFeatures.sh where you may need to set a few variables, such as where Java is installed on your system).
Comment 13 David Williams CLA 2008-04-15 16:47:25 EDT
Created attachment 96168 [details]
a new, smaller test case

Here's a little smaller test case. Only two jar files involved. 

For short, I'll refer to one as the emf plugin and the other as the tptp plugin.

If both are in a directory _and_ if the emf one is processed first, then the exception is thrown when processing the tptp one. If the emf one is removed, the tptp one is packed fine. And, if the tptp one is processed first, then they both pack fine.

The problem, for others trying to reproduce this, is how to determine or influence which one is processed first.
create time? access time? Something else? 
On my original (i86) machine, the error occurred, so I tarred up and moved to another i386 machine, and could not get the error to occur, because the tptp jar was always picked first. I experimented with 'touch', but couldn't seem to affect the order in which they were processed.
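
On the ordering question: java.io.File.listFiles() gives no guarantee about the order of the returned array, so the order can differ between machines and file systems. A test harness that wants a repeatable order has to sort explicitly. The sketch below is only an illustration of that idea; the class OrderedJars and the method jarsInOrder are hypothetical names, and nothing here claims to be how siteOptimizer walks a directory.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

public class OrderedJars {

    /**
     * Lists the *.jar files directly inside dir and sorts them with the
     * given comparator, so the visiting order no longer depends on the
     * file system or VM.
     */
    static File[] jarsInOrder(File dir, Comparator<File> order) {
        File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
        if (jars == null) {
            return new File[0]; // dir does not exist or is not a directory
        }
        Arrays.sort(jars, order);
        return jars;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        // Alphabetical order; use .reversed() to try the opposite order.
        Comparator<File> byName = Comparator.comparing(File::getName);
        for (File jar : jarsInOrder(dir, byName)) {
            System.out.println(jar.getName());
        }
    }
}
```

Swapping the comparator for its reversed() form gives the opposite order, which is what a reproduction of an order-dependent failure needs.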
Comment 14 Andrew Niefer CLA 2008-04-15 17:13:11 EDT
Thanks David, that last test case is exactly what I need to debug something like this.  

I expect the order is os & vm dependent, I can probably fake it with the debugger.

Note that the jar processor uses pure java.io so that it can run standalone; the resources plug-in has no effect here.
Comment 15 David Williams CLA 2008-04-15 20:07:44 EDT
I've noticed a few odd things. When the emf plugin "comes first", it specifies E4 in its eclipse.inf file, but the tptp one has no such "override" in its eclipse.inf.
So, for one, I think the former's "E4" value is being used in processing the second file, which presumably was conditioned with E5. 

Second, why is the pack processing modifying the eclipse.inf file at all ... wouldn't that always be doomed for signature failures? 
Comment 16 David Williams CLA 2008-04-16 04:16:07 EDT
Created attachment 96210 [details]
simple patch that works for me

From what I can see, there's an 'arguments' variable that's used as both an instance variable, and as a local variable. I think the local variable was intended to be the same as the instance variable (and someone just turned off the PDE warnings about doing such dangerous things :) 

This bug wouldn't show up in a "homogeneous" run, where everyone used the same settings, but it is just the kind of bug that shows up "cross projects", since it's there that the different parameters show up, and the previous value was "left over" from the previous file's processing (instead of being reset, since only the local variable was modified).

Running this version fixed all of the hundreds of problems I'd seen.

There still was one error, in one of the ICU bundles, but my guess is that's something different. Perhaps a true windows/linux line ending issue. 

And I do _still_ think that in the best fix, the pack logic should _never_ modify the _content_ of the eclipse.inf (or any other file), but maybe I'm missing the point. With this patch, at least the replaced content is exactly equal to what it was.

Also, I have no idea if this "hidden side effect" of the local variable might change the processing for conditioning or signing ... so, hopefully, there's some good unit test cases for that.
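
To make the suspected mechanism concrete, here is a minimal sketch of how a field shadowed by a local variable can leak one jar's settings into the next. It is an assumption about the general shape of the problem, not the actual jar processor code: the classes StaleArguments and LeakyPacker, the method effectiveArguments, and the use of "E4"/"E5" as plain strings are all invented for illustration.

```java
public class StaleArguments {

    /** Hypothetical per-jar step with the suspected flaw; not the real code. */
    static class LeakyPacker {
        // Instance field: keeps its value from one jar to the next.
        private String arguments;

        String effectiveArguments(String fromEclipseInf, String defaults) {
            // The local variable shadows the field, so starting "fresh"
            // here never resets the field itself.
            String arguments = fromEclipseInf;
            if (arguments != null) {
                this.arguments = arguments; // remembered for later jars
                return arguments;
            }
            // This jar has no override: the stale field wins over the defaults.
            return this.arguments != null ? this.arguments : defaults;
        }
    }

    /** Stateless variant: the result depends only on the jar being processed. */
    static String effectiveArguments(String fromEclipseInf, String defaults) {
        return fromEclipseInf != null ? fromEclipseInf : defaults;
    }

    public static void main(String[] args) {
        LeakyPacker leaky = new LeakyPacker();
        // First jar carries its own setting, second jar carries none.
        System.out.println("emf  (leaky): " + leaky.effectiveArguments("E4", "E5"));
        System.out.println("tptp (leaky): " + leaky.effectiveArguments(null, "E5"));
        System.out.println("tptp (fixed): " + effectiveArguments(null, "E5"));
    }
}
```

In the leaky variant the second jar's result depends on which jar was processed before it, which matches the order sensitivity seen in the two-jar test case. The stateless variant gives the same answer in either order.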
Comment 17 David Williams CLA 2008-04-16 04:17:44 EDT
Comment on attachment 96041 [details]
test case that demonstrates problems

Marking as obsolete, since it's kind of big and there should no longer be a need to use it (though it is still valid).
Comment 18 David Williams CLA 2008-04-16 08:11:17 EDT
FYI, I've opened bug 227316 about the ICU jar problem (it is different: it says the SHA1 digest error is for the Assert class).

Comment 19 Andrew Niefer CLA 2008-04-16 15:51:12 EDT
Created attachment 96327 [details]
patch

Attached is a new patch.  I have released this to HEAD.  David did identify the source of the problem, but his fix was not correct.  I have removed the instance variable completely and tweaked the inf adjustment to only happen if the jar was not previously conditioned.

I have released this fix to HEAD for both org.eclipse.update.core and org.eclipse.equinox.p2.jarprocessor.
Note that the p2.jarprocessor is the new home for this code going forward and build teams will want to switch over at some point.

The p2.jarprocessor does not currently provide an application like siteOptimizer from update.core.  It does provide a Main class and can be started with something like:
java -cp plugins/org.eclipse.equinox.p2.jarprocessor_0.1.0.v20080226-2139.jar org.eclipse.equinox.internal.p2.jarprocessor.Main

David can you give this a try and if all is good hassle Bjorn/Denis to update the ganymede build once an IBuild is available with the changes.
Comment 20 Martin Oberhuber CLA 2008-04-16 18:13:48 EDT
I doubt that Bjorn would want to update the fully Ganymatic basebuilder with an I-build but perhaps he would consider running the p2.siteOptimizer as standalone Jar from some well-known location.
Comment 21 David Williams CLA 2008-04-17 11:32:13 EDT
Thanks Andrew, I've confirmed the patch fixes the problem with packing heterogeneous directories. 

> and if all is good hassle Bjorn/Denis to update
> the ganymede build once an IBuild is available with the changes

I'd never dream of hassling Bjorn or Denis :) 
but I can update where needed. It's currently used in a fairly isolated fashion, so it should not affect anything else.

But, just to be explicit, there's nothing here that has to be coordinated with projects using re-pack (conditioning) or signing, right? (That is, projects don't need to update to new code prior to getting into Ganymede staging area, right?). 

Comment 22 David Williams CLA 2008-04-28 02:31:58 EDT
Has this made (or can this make) it into M7?

Comment 23 Andrew Niefer CLA 2008-04-28 10:45:04 EDT
The fix has been in update.core and p2.jarprocessor since I20080422-0800.
For a full fix for ganymede, both the eclipse.org signing process and anyone running the siteOptimizer need to upgrade their jarprocessor.jar or update.core as appropriate.
Comment 24 Bjorn Freeman-Benson CLA 2008-04-28 10:59:07 EDT
(In reply to comment #23)
> The fix has been in update.core and p2.jarprocessor since I20080422-0800.

It would have been nice to have this bug annotated with that fact.
Comment 25 Andrew Niefer CLA 2008-04-28 11:51:50 EDT
(In reply to comment #24)
> (In reply to comment #23)
> > The fix has been in update.core and p2.jarprocessor since I20080422-0800.
> 
> It would have been nice to have this bug annotated with that fact.
> 

That was comment #19 where the intent was "it is fixed in HEAD, you will want to get the next IBuild."
Comment 26 Bjorn Freeman-Benson CLA 2008-04-28 12:03:14 EDT
(In reply to comment #25)
> That was comment #19 where the intent was "it is fixed in HEAD, you will want
> to get the next IBuild."

Maybe I speak for many, maybe I speak for just myself, but quite honestly, not being a member of the Platform team, I don't know when the "next IBuild" happens. I was waiting for a "it's been fixed in build xxx".
Comment 27 David Williams CLA 2008-04-28 12:18:25 EDT
It's certainly easier if a build number is given, or even if a jar is attached, for a specialized case like this (which I guess is now in builds) ... but what confused me was that the bug is still in 'new' state, instead of having been marked 'fixed', so I feared perhaps it had been forgotten and not released to the build.
Comment 28 Andrew Niefer CLA 2008-04-28 14:22:39 EDT
Sorry, I didn't mark this bug fixed since it is a releng bug about running the optimizer on the ganymede build.  That won't be fixed until that process is updated to use the new code.  

So in a way this did get forgotten about if you were waiting for me and I considered my part finished :)

Comment 29 Kim Moir CLA 2008-04-28 16:22:37 EDT
I have opened bug 229162 to update the platform releng builder to use the new jar.  
I have opened bug 229167 to update the Ganymede builder to use this new jar.

Moving this bug to update since this was where the code was actually fixed and closing it.
	

Comment 30 Kim Moir CLA 2008-04-28 16:22:58 EDT
closing.