Bug 545159 - Lot of time spent in TargetPlatformBundlePublisher computing checksums
Summary: Lot of time spent in TargetPlatformBundlePublisher computing checksums
Status: NEW
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Tycho (show other bugs)
Version: unspecified   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on: 545145
Blocks:
  Show dependency tree
 
Reported: 2019-03-07 04:48 EST by Sebastian Ratz CLA
Modified: 2021-04-28 16:51 EDT (History)
3 users (show)

See Also:


Attachments
NetBeans profile traces (84.64 KB, application/x-zip-compressed)
2019-03-07 08:07 EST, Sebastian Ratz CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Sebastian Ratz CLA 2019-03-07 04:48:42 EST
Since Bug 423715, p2 uses SHA-256 checksums in addition to MD5. They are much slower than MD5.

This affects Tycho 1.2.0+ and the "Computing target platform for MavenProject" step in a Tycho build is now much slower than it was before.

In our large repository (~300 bundles) this adds about 30 minutes of build time just for the dependency resolution phase before the actual build even starts. 

In TargetPlatformBundlePublisher we should disable checksum generation.

From what I can tell, TargetPlatformBundlePublisher uses the full-blown p2 publisher just to get basic p2 properties (ID+version) of a maven artifact. Calculating checksums seems unnecessary here.
Comment 1 Eclipse Genie CLA 2019-03-07 04:50:10 EST
New Gerrit change created: https://git.eclipse.org/r/138253
Comment 2 Mykola Nikishov CLA 2019-03-07 06:26:49 EST
> From what I can tell, TargetPlatformBundlePublisher uses the full-blown
> p2 publisher just to get basic p2 properties (ID+version) of a maven
> artifact. Calculating checksums seems unnecessary here.
My concern is the attemptToPublishBundle's comment:

> This is required because the resulting metadata can be included in p2
> repositories built by Tycho, and hence may be propagated into the p2
> universe. Therefore the metadata generated by this method shall
> fulfill the basic assumption of p2 that ID+version uniquely
> identifies a unit/artifact.
From my reading, this would generate metadata (target/p2*.xml files)
without any checksums at all. And, then, these metadata will be
propagated into the p2 universe...
Comment 3 Mickael Istria CLA 2019-03-07 07:03:32 EST
I'm not sure about the diagnostic. 30 additional minutes for 300 modules seems crazy just to compute a few additional checksums; and I'm afraid removing a checksum generation (and then downstream checks) can more easily lead to generation of corrupted content.
Do you think you could create a simple "reproducer" and profile it so we can verify whether it's SHA-256 taking effectively much longer than MD5, or see whether it's another issue that might be hidden?
Comment 4 Sebastian Ratz CLA 2019-03-07 07:04:07 EST
Neither target/p2artifacts.xml nor target/p2content.xml files are written in the "Computing target platform for MavenProject:" phase of the build.

They are only written when the actual build starts.

And even without my patch neither of those files in my build result contained a checksum.

So I am not sure attemptToPublishBundle's comment is even correct.
Comment 5 Sebastian Ratz CLA 2019-03-07 08:06:45 EST
I cannot really share a real reproducer as this is our internal ABAP Development Tools repository.

I am already struggling with debugging/profiling things myself.

My setup is currently:
- An Eclipse IDE with Tycho sources included so I can attach via remote-debugger to my maven execution
- A NeatBeans installation so I can profile the maven execution

I have attached the Netbeans traces for Tycho 1.0.0, Tycho 1.4.0 (with 2019-03 p2), and Tycho 1.4.0 (with 2019-03 p2 and checksums disabled).
Comment 6 Sebastian Ratz CLA 2019-03-07 08:07:23 EST
Created attachment 277795 [details]
NetBeans profile traces
Comment 7 Mykola Nikishov CLA 2019-03-07 08:31:09 EST
> I'm not sure about the diagnostic. 30 additional minutes for 300
> modules seems crazy just to compute a few additional checksums

Checksum generation reads file twice (for MD5 and SHA-256) and this alone may increase build times dramatically if reading from the filesystem is slow (i.e., real-time antivirus checks or in virtualized environment). https://bugs.eclipse.org/bugs/show_bug.cgi?id=545145#c8

I've re-opened bug #545145 against p2.
Comment 8 Mykola Nikishov CLA 2019-03-07 10:19:57 EST
(In reply to Sebastian Ratz from comment #0)
> Since Bug 423715, p2 uses SHA-256 checksums in addition to MD5. They are
> much slower than MD5.
There is a regression caused by https://git.eclipse.org/r/125062:
- MD5 calculated twice
- as you've discovered, reading the file every time in ChecksumProducer.produce(File, String) just adds insult to injury.

I'm preparing fix and fill file bug against p2.
Comment 9 Sebastian Ratz CLA 2019-03-08 08:13:45 EST
The 30 minutes were not correct, my local maven repository must have been out-of-date and the extremely slow p2 of tycho 1.3.0 was used, sorry about that.

The real additional time spent I am seeing now is quite consistent with the analasis:
1 x MD5
vs.
2 x MD5 + 1 x SHA-256
with SHA-256 being quite a bit slower than MD5.

As for disabling checksum generation:
Do you think it would be reasonable to introduce a switch that could toggle this?
That way, e.g. Gerrit test builds could be *much* faster, as no artifacts are deployed.

Also: I am trying to understand where the checksums generated by TargetPlatformBundlePublisher actually end up. As written in comment #4 I don't see those checksums anywhere in the written p2*.xml files. Am I missing something?
Comment 10 Mickael Istria CLA 2019-03-08 10:22:31 EST
(In reply to Sebastian Ratz from comment #9)
> The 30 minutes were not correct, my local maven repository must have been
> out-of-date and the extremely slow p2 of tycho 1.3.0 was used, sorry about
> that.

Ok, good to know.

> The real additional time spent I am seeing now is quite consistent with the
> analasis:
> 1 x MD5
> vs.
> 2 x MD5 + 1 x SHA-256
> with SHA-256 being quite a bit slower than MD5.

Do you have an estimation of how much the extra 1*MD5+1*SHA256 actually costs in your build then? This number will greatly affect the priority.
Comment 11 Sebastian Ratz CLA 2019-03-12 08:43:02 EDT
I haven't really gone into details about the specific time spent.

Instead, I am still trying to understand the purpose of the checksums calculated by TargetPlatformBundlePublisher.

As a follow-up to comment #9 I have applied my patch to skip checksum calculation and then ran a full build.

The result -- our updatesite -- still contained valid md5 and sha256 checksums i the final artifacts.xml!

So I would appreciate it if somebody could shed some light on *why* we need the checksum calculated for the dependency resolution phase at all.
Comment 12 Felix Otto CLA 2019-04-03 12:54:34 EDT
I have tested the build with and without the generation of the checksum. Even having the checksum calculation switched off, all generated update sites (artifacts.xml) still contained checksums (md5 and sha256).

Applying the patch proposed by Sebastian, makes the plain build for our whole repository (~600 modules, clean done before, without tests, up to phase verify) two times faster (23 minutes vs. 11 minutes).
Comment 13 Mykola Nikishov CLA 2019-04-04 08:07:35 EDT
(In reply to Sebastian Ratz from comment #9)
> Also: I am trying to understand where the checksums generated by
> TargetPlatformBundlePublisher actually end up. As written in comment #4 I
> don't see those checksums anywhere in the written p2*.xml files.
I still have concerns about attemptToPublishBundle's javadoc (comment #2) and the purpose of p2*.xml files and what are they used for in the wild.

> Am I missing something?
There is also bug 437896 (and bug 461851 against p2) to address the same issue.

According to bug 357513, checksums are explicitly recreated for packaging type eclipse-repository and that is why all generated p2 repositories contain artifact checksums (md5 and sha256). This is my main concern regarding artifact checksums.

I think it is safe to apply this change - it's scope is limited to Tycho, solves the problem for Tycho and has no impact on other p2 clients.
Comment 14 Mickael Istria CLA 2019-04-04 08:23:05 EDT
The checksums are properties of installation units. One "rule" for best results with p2 is that the installation unit metadata gets created only once, with the desired final immutable set of properties from the very beginning, and that those properties shouldn't be altered nor regenerated at any time but should instead be propagated to consumer (like p2 repository or p2 director).

Whatever the current state of Tycho is, those checksums should be generated in the p2artifacts.xml for each module. If it's not then it's a bug and should be fixed. Some flag could control whether those should be skipped and let people who have 0 interest in checksums skip them, but I don't think it's a realistic use-case.
The eclipse-repository packaging shouldn't re-generate the checksums, it should just copy them from the initial installation units when assembling the metadata repository.
This approach should generate checksums only once, when generating p2 metadata for each module.

For inside Tycho, if we want to minimize the cost of computing checksums, we may want to introduce a way to disable checksum generation during some steps of the resolution. I don't know which steps are the most expensive here. Provided that the checksum are well propagated from module to eclipse-repositoty, we could disable checksum verification (not generation) all along, except when building the p2 repository or products (so we guarantee that content that was built by the module is actually what gets into repo).
Comment 15 Sebastian Ratz CLA 2019-04-25 07:20:16 EDT
We have some new insights:

After some more digging into this we noticed that TargetPlatformBundlePublisher is actually only relevant for pom dependencies declared as <dependency> in a pom.xml (called from org.eclipse.tycho.p2.resolve.PomDependencyProcessor).

In our repository we had a large number of such pom artifacts declared as dependency in our parent .pom, even though they are only required by a very small number of bundles.

But since the pom dependencies were considered in our build, they were also considered for the ~800 other bundles that do not even require them. Each time TargetPlatformBundlePublisher was ran on them (and wasting time uselessly).

By moving these pom dependencies to the few bundles that actually do require them, we were now able to avoid this performance bottleneck.

So for a pure manifest-based setup this MD5 calculation does not actually seem to be an issue.
Comment 16 Mickael Istria CLA 2021-04-08 18:10:14 EDT
Eclipse Tycho is moving away from this bugs.eclipse.org issue tracker to https://github.com/eclipse/tycho/issues/ instead. If this issue is relevant to you, your action is required.
0. Verify this issue is still happening with latest Tycho 2.4.0-SNAPSHOT
  if issue has disappeared, please change status of this issue to "CLOSED WORKFORME" with some details about your testing environment and how you did verify the issue; and you're done
  if issue is still present when latest release:
* Create a new issue at https://github.com/eclipse/tycho/issues/
  ** Use as title in GitHub the title of this Bugzilla ticket (may include the bug number or not, at your own convenience)
  ** In the GitHub description, start with a link to this bugzilla ticket
  ** Optionally add new content to the description if it can helps towards resolution
  ** Submit GitHub issue
* Update bugzilla ticket
  ** Add to "See also" property (up right column) the link to the newly created GitHub issue
  ** Add a comment "Migrated to <link-to-newly-created-GitHub-issue>"
  ** Set status as CLOSED MOVED
  ** Submit

All issues that remain open will be automatically closed next week or so. Then the Bugzilla component for Tycho will be archived and made read-only.