Bug 408944 - Binary diff between all bundles in eclipse-SDK-4.3 and eclipse-standard-4.3
Status: RESOLVED WORKSFORME
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: Cross-Project
Version: unspecified
Hardware: PC Windows 7
Importance: P3 normal
Target Milestone: ---
Assignee: Nobody - feel free to take it CLA
QA Contact:
URL:
Whiteboard:
Keywords: info
Depends on: 408213
Blocks:
Reported: 2013-05-24 05:35 EDT by Martin Oberhuber CLA
Modified: 2015-06-08 08:26 EDT

Attachments
Diff output comparing extracted JAR contents (1.78 KB, text/plain)
2013-05-24 06:47 EDT, Martin Oberhuber CLA
List of non-eclipse.org-bundles with identical JAR contents (1.65 KB, text/plain)
2013-05-24 06:53 EDT, Martin Oberhuber CLA

Description Martin Oberhuber CLA 2013-05-24 05:35:53 EDT
+++ This bug was initially created as a clone of Bug #408213 +++

Build ID: 20130523-2015_eclipse-classic-kepler-RC1-win32.win32.x86.zip

Here is the next episode in the "Binary diff" saga ... apparently things have improved and bug 408213 can be closed, but still each and every plugin of the Eclipse classic package is different than the respective bundle from the Eclipse Platform SDK, eclipse-SDK-4.3RC1-win32.zip .

- The good news: While the JAR files differ at the binary level, the data
  extracted from the JARs (as well as the signatures and the Manifest) is
  identical! I've looked at org.eclipse.compare_3.5.400.v20130514-1224.jar
  with winzip, Oracle Java7 "jar" and Java7 "jarsigner" on Windows.

- The bad news: Although the extracted data is identical, the JARs differ at
  the binary level. I have no idea how this could happen.

  - platform/SDK: 754,042 bytes, timestamp 5/17/2013 12:21 AM
  - standard/SDK: 754,402 bytes, timestamp 5/23/2013 04:30 PM

- The ugly news: There is one file which also differs in its extracted variant:
  eclipse/plugins/org.junit_4.11.0.v201303080030/META-INF/ECLIPSE_.RSA

  - platform/SDK: 7,816 bytes, timestamp 5/10/2013 03:20 PM
  - standard/SDK: 7,816 bytes, timestamp 3/08/2013 08:47 AM

Regarding the issue with the JUNIT ECLIPSE_.RSA , it looks like EPP standard/SDK pulls a different version of that bundle from the Kepler p2 repository, contributed either by an older build of the Platform or by some other project "overwriting" the proper version.

I'm not sure how to further investigate and what steps can be taken. I've filed the bug against EPP initially because I think that even with non-perfect Platform deliveries or p2 procedures, the EPP project should be able to name exactly the same build input for a milestone that the Platform also used, and thus produce exactly identical builds.

I'm reducing severity to "major" at this time since at least the extracted bytes seem to be identical now (well I didn't test all bundles so that still remains to be proven).
Comment 1 Martin Oberhuber CLA 2013-05-24 06:47:54 EDT
Created attachment 231447 [details]
Diff output comparing extracted JAR contents

Besides Windows, I have also compared the Linux x86_64 packages and run some scripts to extract all the jars and perform binary diff on the extracted contents.

Results attached - it looks like the "META-INF/ECLIPSE_.RSA" file differs in all those bundles that come from Orbit; no other diffs found.
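
The comparison described above (extract every JAR, then binary-diff the extracted contents) could be sketched roughly as follows. This is not the script that produced the attachment; it is a minimal Python sketch that hashes each entry in memory instead of extracting to disk, and the function names `entry_digests` and `compare_plugin_dirs` are made up for illustration.

```python
import hashlib
import zipfile
from pathlib import Path


def entry_digests(jar_path):
    """Map each entry name in a JAR to the SHA-256 of its uncompressed bytes."""
    digests = {}
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue
            digests[info.filename] = hashlib.sha256(jar.read(info)).hexdigest()
    return digests


def compare_plugin_dirs(dir_a, dir_b):
    """Yield (jar name, differing entry names) for JARs present in both dirs."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    for jar_a in sorted(dir_a.glob("*.jar")):
        jar_b = dir_b / jar_a.name
        if not jar_b.is_file():
            continue  # bundle exists on one side only; not a content diff
        a, b = entry_digests(jar_a), entry_digests(jar_b)
        # An entry differs if it is missing on one side or hashes differently.
        differing = sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
        if differing:
            yield jar_a.name, differing
```

Called on the two "plugins/" folders, it reports only bundles whose extracted contents differ (for example a changed META-INF/ECLIPSE_.RSA), and stays silent for JARs that differ only in their ZIP container bytes.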

The Orbit R-Build was declared on 17-May-2013, same day as when the Platform SDK was built. Could it be that Platform SDK did not pull in the Orbit R-Build, but EPP did ? Then the 2nd part of the issue should be resolved with the next Platform build.

This still leaves the question why literally _all_ bundles show different binary bits even with the same Jarred up contents. Apparently different code (or different host OS, or different JVM) are used when post-processing or publishing the bits. This needs to be investigated; we should make sure that we have our processes under control at Eclipse, and end users get the same binary bits from Eclipse regardless of download channel.
Comment 2 Martin Oberhuber CLA 2013-05-24 06:53:21 EDT
Created attachment 231448 [details]
List of non-eclipse.org-bundles with identical JAR contents

Correction - on closer look, there are also a lot of bundles from Orbit that have identical JAR contents (with ECLIPSE_.RSA identical). The list of those bundles is attached. For the records, all the Eclipse.org bundles also have identical contents.

Perhaps it helps finding some pattern of the issue...
I'm running out of ideas, Markus / David can you help ?
Comment 3 Markus Knauer CLA 2013-05-24 08:43:59 EDT
Thanks for this investigation, it is really interesting! (And a bit scary...)

So far I've only two explanations: 

- It could have to do with some pack200 issues, different JVMs, etc.?
- There are different projects contributing the same version of the bundle to the Kepler p2 repo.

The p2 director that assembles the packages uses the "do-not-use-any-mirrors" flag and is configured with the /releases/staging repository only.

Maybe we should try a "find . -name my.suspect.bundle_x.y.z.jar" on the file system and find out if (and which) projects contribute the bundles that differ on a binary level.
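
A variation on that "find" idea, as a hedged Python sketch: it looks for every copy of a suspect bundle below some root directory and groups the copies by checksum, so that binary-different contributions stand out. The root path and the helper name `copies_by_digest` are assumptions for illustration only.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def copies_by_digest(root, jar_name):
    """Group every file called jar_name under root by its SHA-256 digest."""
    groups = defaultdict(list)
    for path in Path(root).rglob(jar_name):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return dict(groups)
```

If the result has more than one key, at least two locations hold binary-different copies of the same bundle version, and the paths in each group hint at which project contributed which copy.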
Comment 4 David Williams CLA 2013-05-25 15:03:40 EDT
I've opened bug 409072 which might help eliminate some of the variability. 

But, not sure it will ever, literally, remove all "binary differences". That is because the pack200 and unpack200 processes don't guarantee "exact binary bits and layout" ... just "functionally equivalent byte codes". 

And, yes, some of the orbit differences might have had to do with the timing of when I promoted the final Orbit R build (in time for others to use, but we in platform didn't use it). One improvement that could be made in the Orbit releng process is to have a more refined, tighter "comparator" process. We do use it, but our "reference repo" is typically "old", so the signature can change time stamp from build to build. Something I've been hoping to do for the past year :) See bug 379738. 

Lots to do ... to achieve your high standards! :)
Comment 5 Martin Oberhuber CLA 2013-05-27 13:25:18 EDT
(In reply to comment #4)
Thanks David and Markus, but I'm still concerned ...

- The Orbit change to the R-Build can only affect bundles from Orbit, but none
  of the normal Eclipse Platform's bundles (which also all show a binary diff).

- Turning off a comparator in the Eclipse Platform's build scripts can only 
  impact the bits that Eclipse Platform delivers to simrel and EPP ... but 
  once the bits are there, Eclipse Platform SDK should build from the same 
  bits, or am I missing something ?

I'm not sure how to proceed here. Eclipse Platform has delivered the RC2 bits. But /releases/staging/ is still frozen to RC1 bits ? Can we even do a test build as of today ? Would it make sense to compare the Platform's bits on 
/releases/staging against the Platform's bits on the repository that the SDK is built from ?
Comment 6 David Williams CLA 2013-05-27 20:19:47 EDT
(In reply to comment #5)
> (In reply to comment #4)

> 
> - Turning off a comparator in the Eclipse Platform's build scripts can only 
>   impact the bits that Eclipse Platform delivers to simrel and EPP ... but 
>   once the bits are there, Eclipse Platform SDK should build from the same 
>   bits, or am I missing something ?

The Eclipse Platform SDK as delivered on our DL page, is not built "from the same bits" as delivered to common repo ... at least, not quite literally, when that post comparator step was in place [I mean, in theory, should have been same bits .... but there is a chance it was picking up equivalent, but different bundle. See bug 409072 for details]. That fix was not in "RC2" and is only fixed as of I20130526-2000.   

> I'm not sure how to proceed here. Eclipse Platform has delivered the RC2
> bits. But /releases/staging/ is still frozen to RC1 bits ? 

I have updated staging to RC2 content now.
Comment 7 David Williams CLA 2013-06-02 17:46:20 EDT
For RC2, I took one install, eclipse RC2 and eclipse standard rc2 for linux 64 bit, and did the sort of "diffs" that I think Martin has been doing.

Then did a 'diff -r' on plugins/. (I removed the ones that really are supposed to be different: the egit-related ones in 'standard' and 'eclipse.sdk' in regular eclipse.) 

And, sure enough, saw very many "binary differences" (in both orbit and eclipse jars). 

I then wrote a small script to unzip each jar (and removed the jar) and then when I did a 'diff -r' there were no differences at all. 

So, not sure how to explain "binary differences" ... but there seem to be only two possibilities. One is that there might always be binary differences depending on what jar/zip program is used (just due to slight differences in exactly how they zip things up, how much they compress, etc.) but the more likely factor in this case is how pack200 and unpack200 work ... after "unpack200", the jar might always show "binary differences" ... again, just due to the exact way different versions of unpack200 and gzip work, etc.? 

I think our eclipse "packages" are produced directly from the jars produced by Tycho (though, they would have been through pack200 conditioning and signing of course ... and I would assume "we" are using the pack200 that comes with Java 7 (/shared/common/jdk1.7_11 since using that to drive our main build) though ... not positive ... would have to check Tycho code to see how they pick which to use) ... and the b3 aggregator actually "unpacks" the pack.gz files into jars (rather than copy the original jars) and there I think we are using Java 6 to "run that process" (again, would have to check to know for sure). I'm guessing that EPP uses those jar files directly (since on same file system) rather than fetching and unpacking the pack.gz files.

So, good news is all is functionally identical ... bad news is, still don't know exact source of "binary differences" but hopefully my comments give you some details to look at, if you were so inclined. 

I think the only "bad thing" about binary differences is that it makes it harder for release engineers to confirm "content is the same" ... and they'd have to use the work-around of unzipping to confirm.
Comment 8 Martin Oberhuber CLA 2013-06-04 13:34:37 EDT
It is really interesting ...

I took another look at these two, which should match:
  - eclipse-standard-kepler-RC2-win32.zip
  - eclipse-SDK-4.3RC2a-win32.zip

In the "plugins/" folder, most of the "*.source" jars do match binary, except for those coming from Orbit. Very few jars with *.class file contents do match (org.apache.lucene.core, org.apache.lucene.analysis, org.eclipse.jdt.debug.ui) but most do not match. Doc bundles do not match.

I had a deeper look at a small and simple one from orbit, which does NOT match: 
  - javax.el_2.2.0.v201303151357.jar
The file sizes, time stamps and ZIP CRC's all do match exactly (checked with 7-zip).

The only hypothesis that I can offer at this point is, that the two JAR's have zipped up the contents in different ORDER or that the ZIP file format allows for some random padding somewhere. Both cases would imply that somebody Unjars / Rejars those bundles during the various releng stages ... which should not happen IMO when a Jar is already packed and signed !

I think we can rule out any effect of pack200 when the ZIP CRC's are identical. The binary diff must occur in the ZIP file format, not anywhere below.
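
The "different ORDER" hypothesis can be checked without a ZIP GUI. The following is a minimal sketch, assuming Python's standard zipfile module is an acceptable stand-in for the 7-zip inspection; `entry_table` and `describe_difference` are hypothetical helper names.

```python
import zipfile


def entry_table(jar_path):
    """Return (name, CRC-32, size, compressed size) per entry, in listing order."""
    with zipfile.ZipFile(jar_path) as jar:
        return [(i.filename, i.CRC, i.file_size, i.compress_size)
                for i in jar.infolist()]


def describe_difference(jar_a, jar_b):
    """Classify how the entry listings of two JARs relate to each other."""
    a, b = entry_table(jar_a), entry_table(jar_b)
    if a == b:
        return "same entries, same order"
    if sorted(a) == sorted(b):
        return "same entries, different order"
    return "entries differ"
```

If two binary-different JARs come back as "same entries, same order", neither the payload nor the entry order explains the diff, and the remaining suspects are the other header fields of the ZIP format (timestamps, flags, extra fields).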

Does that help anybody ?
Comment 9 Martin Oberhuber CLA 2013-06-04 13:58:00 EDT
Looking at the binary diff of "javax.el_2.2.0.v201303151357.jar" reveals that the difference is in the 2-byte field at offset 6 of the Local File Header of each entry. The first 9 bytes go like this for the two candidates respectively:

   P  K  03  04  14  00  08  00  08    (epp-standard-*)
   P  K  03  04  14  00  08  08  08    (eclipse-SDK-*)

Bytes 7,8 are the so-called "General Purpose bit flag", stored little-endian (0x0008 vs. 0x0808). The difference is in bit 11, which is SET for eclipse SDK and UNSET for EPP. Section 4.4.4 here gives the details:
   http://www.pkware.com/documents/casestudies/APPNOTE.TXT

   Bit 11: Language encoding flag (EFS).  If this bit is set,
           the filename and comment fields for this file
           MUST be encoded using UTF-8. (see APPENDIX D)

I don't think that it is relevant for the behavior of any programs whether that bit is set or unset, since I assume that Java's ZipInputStream and similar have always used UTF-8, regardless of whether the bit is set or not.
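
For anyone who wants to repeat the check on other bundles without a hex editor, here is a minimal Python sketch that reads the general purpose bit flag straight from each Local File Header. It relies on `ZipInfo.header_offset` from the standard zipfile module; the function name `local_header_flags` and the constant `EFS_BIT` are illustrative, not part of any Eclipse tooling.

```python
import struct
import zipfile

EFS_BIT = 1 << 11  # "Language encoding flag (EFS)", APPNOTE.TXT section 4.4.4


def local_header_flags(jar_path):
    """Map entry name to the general purpose bit flag in its local file header."""
    flags = {}
    with zipfile.ZipFile(jar_path) as jar, open(jar_path, "rb") as raw:
        for info in jar.infolist():
            raw.seek(info.header_offset)
            # Layout: 4-byte signature, 2-byte version needed, 2-byte flag field.
            signature, _version, flag = struct.unpack("<4sHH", raw.read(8))
            if signature != b"PK\x03\x04":
                raise ValueError(f"no local file header at offset {info.header_offset}")
            flags[info.filename] = flag
    return flags
```

With the values from the dump above, `0x0808 & EFS_BIT` is non-zero (eclipse-SDK) while `0x0008 & EFS_BIT` is zero (epp-standard). Running the function on both JARs and comparing the two dictionaries shows whether that bit is the only flag that differs.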

But as Dave says, having the binary diff there makes it a bit inconvenient for release engineers; and, knowing that somebody apparently unpacks/repacks the Jar's means that processing power is wasted here somewhere. So rather than trying to use identical "jar" programs everywhere, it would be more interesting to understand why and where the jar's get unpacked / repacked.
Comment 10 Martin Oberhuber CLA 2013-06-04 14:19:56 EDT
Last experiment:

I installed eclipse-platform-4.3RC2a-win32.zip and then using Help > Install New... got JDT from the build input, which the Eclipse Platform had delivered for Simrel RC2a: 
http://download.eclipse.org/eclipse/updates/4.3milestones/S-4.3RC2a-201305262000/

The resulting JDT plugins were DIFFERENT to eclipse-SDK but IDENTICAL to the EPP eclipse-standard package.

I think this proves that Eclipse Platform/Releng submits something DIFFERENT to the simrel than it packs into its own SDK. Some unjar / rejar operation is performed ... maybe due to pulling the ".pack.gz" instead of the ".jar" when installing with update manager (I assume that the b3 aggregator does the same).

I'm reassigning the bug to Eclipse Platform / Releng since I think that's where things can be changed ... perhaps by assembling the Platform SDK from .pack.gz just like a normal install would do.
Comment 11 Martin Oberhuber CLA 2013-06-04 14:22:35 EDT
Just one theory here based on what Dave said:

If Jar's are originally produced with Java7, the EFS bit might be set to indicate UTF-8 filenames. But if then Java6 is used to perform pack/unpack/repack from .pack.gz, the Java6 tools might unset the EFS bit since they didn't know it at that time.

Over to Dave at this point ...
Comment 12 Martin Oberhuber CLA 2013-06-04 14:24:27 EDT
Reducing severity from "major" to "normal" since I understand that with Kepler, most initial downloads will happen from the EPP packages. Based on my investigation, it looks like all packages will be identical and only the "original eclipse SDK" from the Platform team will be different. 

Since very few users will get access to that with Kepler, it's most likely not that much of an issue (and more a cosmetic thing for release engineers, how they assemble the SDK).
Comment 13 David Williams CLA 2013-06-04 16:38:36 EDT
I think "cross-project" is better. Partially because I don't think it's the result of anything we in platform do or can change. And partially to make everyone aware. 

I did explore the idea of using Java 7 to run the aggregator, which is what takes the pack.gz files and converts them to jars for common repo ... but, that didn't work, because there are still some projects with nested pack.gz files, and Java 7 doesn't handle them. 

I know that the b3 aggregator pretty much uses "just" p2 API to do its work. 

Less sure what Tycho uses, but if it's more than simply everyone using the same VM, I'd say it's in the places that pack and unpack the jars. 

But, as you say, it may simply be due to the difference between Java 6 and Java 7 and how they handle "zip files". And since there is no "functional difference", the important thing at this point is to make sure everyone is aware that "binary difference" does not necessarily mean "a real difference". 
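
A small experiment can illustrate that last point. The sketch below does not reproduce what the Java 6 or Java 7 tools do; it simply builds a tiny archive in memory with Python, flips the EFS bit (bit 11) in every local and central header, and shows that the bytes change while the extracted contents do not. The helper names are made up, and the patching code assumes a small archive with no archive comment and no ZIP64 records.

```python
import io
import struct
import zipfile

EFS_BIT = 1 << 11  # language encoding flag (general purpose bit 11)


def set_efs_bit(zip_bytes):
    """Return a copy of a small ZIP with bit 11 set in all entry headers.

    Assumes no archive comment and no ZIP64, so the end of central
    directory record is exactly the last 22 bytes.
    """
    data = bytearray(zip_bytes)
    eocd = len(data) - 22
    if data[eocd:eocd + 4] != b"PK\x05\x06":
        raise ValueError("unsupported archive layout")
    entries, = struct.unpack_from("<H", data, eocd + 10)
    pos, = struct.unpack_from("<I", data, eocd + 16)  # central directory start
    for _ in range(entries):
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", data, pos + 28)
        local, = struct.unpack_from("<I", data, pos + 42)  # local header offset
        # Flag field: offset 8 in a central header, offset 6 in a local header.
        for flag_at in (pos + 8, local + 6):
            flag, = struct.unpack_from("<H", data, flag_at)
            struct.pack_into("<H", data, flag_at, flag | EFS_BIT)
        pos += 46 + name_len + extra_len + comment_len
    return bytes(data)


def contents(zip_bytes):
    """Map entry name to uncompressed bytes."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    archive.writestr("javax/el/Example.class", b"\xca\xfe\xba\xbe")
original = buffer.getvalue()
patched = set_efs_bit(original)

print(original == patched)                      # binary comparison
print(contents(original) == contents(patched))  # extracted comparison
```

The two archives have the same length and the same entries, yet a plain binary diff flags them as different, which is the situation release engineers run into with the SDK and EPP JARs.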

Thanks for the investigation and links to references.
Comment 14 David Williams CLA 2013-06-04 16:40:57 EDT
assigning to self, just to avoid as many notifications being sent ... but, I'm not working on "solving" this, so if anyone else has any ideas, feel free to say.
Comment 15 David Williams CLA 2013-06-04 16:48:00 EDT
webmasters tip of the day: you can also use "nobody@eclipse.org" :)
Comment 16 David Williams CLA 2013-06-12 08:33:37 EDT
Changing title to emphasize that all the differences that could be accounted for have been fixed: all that remains is the Language encoding flag (EFS) Martin found in comment 9 which is due to a difference of behavior in zip input stream in Java 6 vs. Java 7. (A "fix" to properly handle multi-language file names). 

To cross-reference, this "change in behavior" of zipped streams, is the same issue at the root of bug 361628.
Comment 17 David Williams CLA 2013-06-17 14:51:59 EDT
For what it's worth, I found "bigger" differences in one of our Eclipse bundles that is built by Tycho, and opened bug 410948 to discuss the differences found. 

The javax.el bundle that Martin analyzed is from PDE build in Orbit, and is essentially untouched by the Tycho (and Eclipse) build, so there I think the "language bit" is due to the different version of Java used ... but ... the "Tycho produced" jar seems to have more differences, so thought it deserved separate discussion. Note, the differences are still only in the "zip metadata" ... if unzipped, the results are indeed identical ... but as Martin said, ideally we would have binary identical jars too.
Comment 18 Eclipse Genie CLA 2015-06-08 01:22:41 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 19 David Williams CLA 2015-06-08 08:26:07 EDT
Marking as "worksforme" to signify the diffs found are "not a bug" ... but limitations of "diff" and of the variety of compression tools and formats.