Bug 568446 - parallelize dependency resolving
Summary: parallelize dependency resolving
Status: RESOLVED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Tycho
Version: unspecified
Hardware: All All
Importance: P3 enhancement
Target Milestone: ---
Assignee: Julian Honnen CLA
QA Contact:
URL:
Whiteboard:
Keywords: noteworthy, performance
Depends on:
Blocks:
 
Reported: 2020-11-02 08:23 EST by Julian Honnen CLA
Modified: 2021-04-28 16:51 EDT
CC List: 3 users

See Also:


Attachments
CPU Sample of build (439.55 KB, application/octet-stream)
2020-11-02 08:23 EST, Julian Honnen CLA
CPU sample with parallel resolveProject loop (387.32 KB, application/octet-stream)
2020-11-02 08:24 EST, Julian Honnen CLA

Description Julian Honnen CLA 2020-11-02 08:23:53 EST
Created attachment 284632
CPU Sample of build

We're currently transitioning our project from PDE Build to Tycho and are pleasantly surprised by the performance of parallel builds.
The initial dependency resolution step is not parallelized, though, and consumes a large chunk of the total build time.

On my machine, when building the project (~1800 modules + target platform) with mvn package -T 32, DefaultTychoResolver::resolveProject takes 360s (1/3 of the build). See the attached CPU sample.

I had a cursory look at the code, and it seems that resolveProject runs independently for each project, which makes the loop in TychoMavenLifecycleParticipant::afterProjectsRead trivial to parallelize (I'll submit a patch).
Running resolveProject with 32 threads as well brings it down to 50s.
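
For illustration only, here is a minimal sketch of how a per-project loop of this kind could be spread over a fixed thread pool. It is not the submitted patch and does not use Tycho's API: the class ParallelResolveSketch, the method resolveAll and the resolver function are hypothetical names, and the sketch assumes that resolving one project does not depend on the result of another.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: names here are illustrative, not Tycho classes.
final class ParallelResolveSketch {

    // Runs resolver once per project on a fixed-size pool and collects
    // the results in the original project order.
    static <P, R> Map<P, R> resolveAll(List<P> projects, Function<P, R> resolver, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (P project : projects) {
                Callable<R> task = () -> resolver.apply(project);
                futures.add(pool.submit(task));
            }
            Map<P, R> results = new LinkedHashMap<>();
            for (int i = 0; i < projects.size(); i++) {
                // get() blocks until the task is done and rethrows its failure.
                results.put(projects.get(i), futures.get(i).get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real per-project resolution work.
        Map<String, String> resolved =
                resolveAll(List.of("bundle.a", "bundle.b", "bundle.c"), p -> "resolved:" + p, 32);
        System.out.println(resolved);
    }
}
```

Collecting the futures in submission order keeps the results deterministic, and a failure in any single project surfaces as an exception instead of being lost in a worker thread.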
Comment 1 Julian Honnen CLA 2020-11-02 08:24:55 EST
Created attachment 284633
CPU sample with parallel resolveProject loop
Comment 2 Eclipse Genie CLA 2020-11-02 09:24:05 EST
New Gerrit change created: https://git.eclipse.org/r/c/tycho/org.eclipse.tycho/+/171626
Comment 3 Eclipse Genie CLA 2020-11-03 02:31:58 EST
New Gerrit change created: https://git.eclipse.org/r/c/tycho/org.eclipse.tycho/+/171655
Comment 4 Mickael Istria CLA 2020-11-03 05:30:58 EST
@Julian: is the time actually spent at resolving bundles, or at downloading dependencies? Is your patch specifically intended to parallelize download of dependencies in afterProjectsRead?
I'm working on a way to delay the download of dependencies (bug 567760) so that they're not fetched in this step (unless some packaging type like eclipse-plugin explicitly requests them early). The code actually has many locations where it expects dependencies to be already fetched, and those are likely to turn into fetch requests. So if the main goal is to make dependency download parallelizable, it looks like a similar change would have to happen at several locations.
Comment 5 Julian Honnen CLA 2020-11-03 06:37:33 EST
(In reply to Mickael Istria from comment #4)
> @Julian: is the time actually spent at resolving bundles, or at downloading
> dependencies?
I don't see any download in the sample. I assume all dependencies are already downloaded, as it's not the first build.
The build is using a p2 <repository>, not a target definition.

The majority of the time is spent in p2 resolution and another big chunk in resolveClasspath.
Comment 6 Mickael Istria CLA 2020-11-03 06:39:43 EST
(In reply to Julian Honnen from comment #5)
> I don't see any download in the sample. I assume all dependencies are
> already downloaded, as it's not the first build.
> The build is using a p2 <repository>, not a target definition.
> The majority of the time is spent in p2 resolution and another big chunk in
> resolveClasspath.

OK good, so those 2 areas of work are not conflicting.
Comment 8 Mickael Istria CLA 2020-11-04 06:00:19 EST
We usually try to keep "noteworthy" tickets open until the notes are added to the N&N. Can you please add a comment about it to https://wiki.eclipse.org/Tycho/Release_Notes/2.2#New_and_Noteworthy ?
Comment 9 Julian Honnen CLA 2020-11-04 06:28:15 EST
Already done, it's awaiting moderation.
Comment 10 Mickael Istria CLA 2020-11-04 07:24:57 EST
(In reply to Julian Honnen from comment #9)
> Already done, it's awaiting moderation.

OK. Unfortunately, the wiki doesn't notify about changes to moderate. I did merge your change so your further edits to the wiki should happen automatically without moderation now.
Comment 11 Alessio Di Sandro CLA 2020-11-06 13:16:24 EST
I'm getting a ConcurrentModificationException in my build process after this change.
Check it here: https://travis-ci.com/github/adisandro/MMINT/builds/198626260
Comment 12 Mickael Istria CLA 2020-11-06 13:37:03 EST
(In reply to Alessio Di Sandro from comment #11)
> I'm getting a ConcurrentModificationException in my build process after this
> change.
> Check it here: https://travis-ci.com/github/adisandro/MMINT/builds/198626260

Would you be able to submit a patch to avoid that?
Comment 13 Julian Honnen CLA 2020-11-09 04:15:29 EST
The build uses a global LocalArtifactRepository (and LocalMetadataRepository), which are read and written during resolution when downloading artifacts
--> P2ResolverImpl::toResolutionResult

I'll push a patch.
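
As a generic illustration of this class of problem (not the content of the actual patch), the sketch below guards a shared, non-thread-safe collection with a lock and hands out copies for iteration, so that concurrent writers cannot trigger a ConcurrentModificationException in readers. SharedRepositorySketch and its methods are hypothetical stand-ins, not Tycho or p2 classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a repository object shared by all resolver threads.
final class SharedRepositorySketch {
    private final List<String> artifacts = new ArrayList<>();
    private final Object lock = new Object();

    // Writes go through the lock.
    void add(String artifactKey) {
        synchronized (lock) {
            artifacts.add(artifactKey);
        }
    }

    // Readers iterate over a copy taken under the lock, never the live list.
    List<String> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(artifacts);
        }
    }

    // Starts several threads that write and read concurrently, then
    // returns the number of stored entries once all have finished.
    static int runDemo(int threads, int perThread) {
        SharedRepositorySketch repo = new SharedRepositorySketch();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    repo.add("artifact-" + id + "-" + i);
                    for (String key : repo.snapshot()) {
                        key.length(); // placeholder for real read access
                    }
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return repo.snapshot().size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(8, 100));
    }
}
```

Iterating the live list directly while other threads call add is the pattern that typically produces the exception. Copying under the lock trades some memory for simplicity; a concurrent collection would be another option.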
Comment 14 Eclipse Genie CLA 2020-11-09 05:13:01 EST
New Gerrit change created: https://git.eclipse.org/r/c/tycho/org.eclipse.tycho/+/171981