Bug 235923 - Resolving dependencies is very slow even if everything is available locally
Summary: Resolving dependencies is very slow even if everything is available locally
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2
Version: 3.4
Hardware: PC Linux-GTK
Importance: P3 normal
Target Milestone: 3.5 M5
Assignee: P2 Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Duplicates: 248514 249753
Depends on: 232792
Blocks: 517025
Reported: 2008-06-05 16:11 EDT by Martin Oberhuber CLA
Modified: 2017-05-20 23:41 EDT

See Also:


Attachments
Thread dump of hanging application (23.84 KB, text/plain)
2008-06-05 19:16 EDT, Martin Oberhuber CLA
Screenshot showing license screen with duplicate feature (22.48 KB, image/gif)
2008-06-05 19:21 EDT, Martin Oberhuber CLA
.log of my workspace after installing (18.82 KB, text/plain)
2008-06-05 19:31 EDT, Martin Oberhuber CLA

Description Martin Oberhuber CLA 2008-06-05 16:11:31 EDT
Build ID: eclipse-platform-3.4RC3-linux-gtk (I20080523-0100)
Host: RHEL4u2

I'm sorry if this is a duplicate of some other performance / p2 caching related bug that I've already filed... I believe there was something ongoing with resources, translations and caching... but I forgot... anyways, here we go:

Download CDT-5.0RC3 archived update site from http://download.eclipse.org/tools/cdt/builds/5.0.0/I.I200805300802/cdt-master-5.0.0-I200805300802.zip

1. Extract and Launch Eclipse-Platform-3.4RC3
2. Help > Software Updates... : Available Software : Add Site...
   Archive: (localFolder)/cdt-master-5.0.0-I200805300802.zip
3. From "jar: CDT Main Features", pick "C/C++ Development Tools"
   From "jar: CDT Optional Features", pick 
        "CDT GNU Toolchain Build Support",
        "CDT GNU Toolchain Debug Support"
   Press Install...

--> Resolving Dependencies... Dialog comes up (cannot be sent to background)
    and stays up for 2:45 minutes. During that time, a progress indicator in
    the Eclipse window in the background occasionally shows things like
    "http:downl...jar", which seems to indicate that it's downloading some stuff.

Having to wait so long even if ALL the software and dependencies that I need have been downloaded already is not acceptable. I'm enjoying a super fast internet connection, but sitting in Europe, access to Eclipse.org servers does take some time due to latencies.

Installing from an archived, or locally mirrored update site should not require any kind of network connections, not for any kind of caching or whatever.
Comment 1 Pascal Rapicault CLA 2008-06-05 16:49:34 EDT
Could you please check how many repositories are enabled in your Available Software dialog, and what their URLs are?
Comment 2 Martin Oberhuber CLA 2008-06-05 18:06:53 EDT
There are 4 of them:
* "Ganymede Update Site" (default one)
* "http://download.eclipse.org/eclipse/updates/3.4" (default one)
* "jar:file:/folk/mober/Downloads/cdt-master-5.0.0.I200805300802.zip"
* "http://download.eclipse.org/tools/cdt/releases/ganymede/"

The last one got added automagically while P2 was parsing my archived site.

After having installed the stuff (and having waited), when I expand the Ganymede node, I see a "Pending..." subnode for a long time, so it looks like that one has not been processed at all. The 2nd one (eclipse/updates/3.4) expands immediately but doesn't show any children; the two CDT ones also expand immediately, so it looks like they have been cached.

I think that when all information about CDT is in my archived update site, P2 shouldn't go and contact the CDT update site on the net.
Comment 3 Martin Oberhuber CLA 2008-06-05 18:12:15 EDT
Another observation is that when I use a locally rsync'd mirror of the Target Management Update Site from
  http://download.eclipse.org/dsdp/tm/signedUpdates

I do not see the slow "Resolving Dependencies" behavior. It expands in the UM UI immediately, and shows the "Resolving" dialog for only a few seconds.

My TM Update Site does include P2 metadata (artifacts.xml, content.xml), whereas the CDT archived site only contains site.xml -- perhaps that's the reason for the slowness: does P2 try to contact the remote site in order to find precomputed P2 metadata?

I think it shouldn't do so, and compute the metadata itself from the contents in the archived (or local) site only. But then, of course we could also request CDT (and all other Ganymede projects) to add the P2 metadata to their sites / archived site downloads.
Comment 4 Martin Oberhuber CLA 2008-06-05 18:45:05 EDT
I think I need to take back the last comment: On further investigation on a different installation, I saw that installing from my locally mirrored "Target Management" site still retrieves 
  http://download.eclipse.org/dsdp/tm/updates/3.0/content.jar
although that very file is available in my local variant of the site (which came from .../signedUpdates, but my point is that I do have a content.jar describing the site locally).

I also noticed that it tries to get .../releases/ganymede/content.jar, and that seems to account for most of the processing time. I'm only installing from my local TM site here, and all dependencies are local. It shouldn't need to contact the Ganymede site for any reason.
Comment 5 Martin Oberhuber CLA 2008-06-05 19:16:29 EDT
Created attachment 103858
Thread dump of hanging application

For my "Installing TM" example, it has now been hanging completely for half an hour or so. Thread dump attached. Here is what I did in this case:

1. Install eclipse-SDK-3.4RC3
2. Install CDT-5.0RC3 by means of commandline standaloneUpdate
3. Help > Software Updates... point to local mirror of TM site
   http://download.eclipse.org/dsdp/tm/signedUpdates
4. Select RSE-SDK only 
5. Install

First it looked like it contacted the original TM site on the Web, then the Ganymede site, but for the last half hour or so it has looked totally stuck.

I must mention that I've made tons of different Eclipse-3.4rc3 and other installations in different folders with the same user id, and I've had the impression that with P2 they somehow influence each other (could it be that there is some configuration area under $HOME?)

It looks like I've messed up things over time. What do I need to clean out to restart from scratch?
Comment 6 Martin Oberhuber CLA 2008-06-05 19:21:55 EDT
Created attachment 103859
Screenshot showing license screen with duplicate feature

Bummer. I had just about given up, when P2 came up with attached screenshot.

What's really odd here is that it presents two copies of the identical item (once with the readable name and once with the JAR). How cool is this?

The screenshot reminds me that I had lied a little: In the Updates... dialog from the TM Local site, I had only selected the "RemoteCDT" feature (and not RSE-SDK) and nothing else, expecting that P2 would need to resolve dependencies because RemoteCDT requires RSE-Core.

Again, apart from potentially other issues that I've uncovered here, my main point is that I've got everything local so P2 must not spend ages searching the Internet.

Apart from that, the most important nugget of information for me is how to clean out my install / shared configuration area on Linux.
Comment 7 Martin Oberhuber CLA 2008-06-05 19:31:51 EDT
Created attachment 103864
.log of my workspace after installing

Ha, this time I've also got a log. God alone knows where the ".../europa/" update site reference comes from. I cannot find it in the CDT features that I installed. P2 doesn't dream that up, does it?
Comment 8 John Arthorne CLA 2008-06-05 21:12:06 EDT
No, it's not dreamed up. There must be a repository or feature somewhere that refers to the europa site. This could be a source of some of the slowness, since it is a legacy update site with no additional metadata. For me it takes about 30 seconds to load this site for the first time.
Comment 9 Martin Oberhuber CLA 2008-06-05 21:17:14 EDT
John - is there any configuration area outside my workspace where P2 stores data that might be shared between installs, and that I'd better clean? I'm working on Linux (RHEL4).
Comment 10 Martin Oberhuber CLA 2008-06-05 21:29:19 EDT
FYI, after a hint on the cdt-dev list I tried with a newer CDT driver (0602 build instead of the 0530 build that is RC3). It seems like that one's working better -- I'm not getting the "europa" errors any more. But I couldn't find a CDT bug report talking about removal of a europa-related update site...
Comment 11 John Arthorne CLA 2008-06-05 21:40:14 EDT
> John - is there any configuration area outside my workspace where P2 stores
> data that might be shared between installs, and that I'd better clean?

No, all p2 data is in the agent data folder. In a standalone Eclipse install, this is the "eclipse/p2/" folder. The configuration area is also used for some caches (eclipse/configuration). These caches can be cleaned up with the -clean command line argument.

Note you can also control which remote sites are accessed from the "Manage Sites" dialog. Unchecked sites will not be contacted (or if they were, that would be a bug).
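As a rough illustration of the intended rule (this is not p2 code), the set of sites that may be contacted could be modeled as a simple filter over an enabled flag. The `Site` record and `sites_to_contact` function are hypothetical names, and the enabled states below are assumed for the example; the locations are the ones listed in comment 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical record for one entry in the sites dialog."""
    location: str
    enabled: bool  # True if the site is checked in the dialog


def sites_to_contact(sites):
    """Return only the locations of sites the user left checked."""
    return [site.location for site in sites if site.enabled]


# Assumed configuration: only the local archive is checked.
sites = [
    Site("http://download.eclipse.org/eclipse/updates/3.4", False),
    Site("jar:file:/folk/mober/Downloads/cdt-master-5.0.0.I200805300802.zip", True),
    Site("http://download.eclipse.org/tools/cdt/releases/ganymede/", False),
]

print(sites_to_contact(sites))
```

Under this model, any network access to an unchecked location would indicate a bug rather than expected behavior.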
Comment 12 Martin Oberhuber CLA 2008-06-05 21:55:33 EDT
According to progress, they definitely were contacted. I'm sure I had unchecked all sites except the local mirrors (archive or expanded).
Comment 13 Pascal Rapicault CLA 2008-06-05 22:13:21 EDT
There are two items in the list because in reality 2 things are being installed:
 - the feature jar, which is under one license
 - the IU group derived from the feature, which also has another license.

Note that in the future, we will be showing the license of plug-ins and are hoping to work with the foundation to redefine what is the appropriate content for the license of a group.

Also we would like to get some ways to identify licenses in a unique way to avoid showing too many licenses like we do today and have a button to always accept a given license (see bug 192678 and bug 212218).

As for the europa reference, I'm assuming that it is coming from one of the CDT features. To avoid this kind of inconvenience I have asked for metadata to be generated even for these old update sites (bug 235955).
Comment 14 Pascal Rapicault CLA 2008-06-05 22:39:12 EDT
> I'm only installing from my local TM site here, and all dependencies are local. It shouldn't care for contacting the Ganymede site for whatever reason.
   This is the kind of difference that killed UM, and made the user's life miserable as well as the packaging author's :-)
   For example, what would happen if the local TM site was missing some of its dependencies (willingly or not)?
   p2 does not have a notion of provenance. IUs are not bound to sites as features were. Once you have selected the IU in the UI (note that you could be in a mode where you just see the categories across sites), it is just an IU, and the resolver will try to satisfy its dependencies by consulting all the repositories, since another repository could always contain something better that we want to install. For example, imagine Site 1 has EMF 1.0, Site 2 has EMF 1.1, and both are suitable: if we were to stop at Site 1 because it worked, the user might get prompted for an update right afterwards.

   Note that under the covers the set of repos consulted can be controlled; the difficulty is then to come up with a user interaction model that does not put the question about the mirror in the user's face all the time. I can think of ways, but anyway it is probably not for 1.0.
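The EMF example can be sketched in a few lines. This is only a toy model under stated assumptions, not the p2 resolver: repositories are plain dictionaries, versions are dotted integers, and `parse_version` and `best_candidate` are hypothetical helper names.

```python
def parse_version(text):
    """Turn a dotted version such as "1.1" into a comparable tuple (1, 1)."""
    return tuple(int(part) for part in text.split("."))


def best_candidate(iu_id, repositories):
    """Pick the highest version of iu_id across all consulted repositories.

    repositories maps a repository name to {iu_id: [version, ...]}.
    Returns (repository_name, version) or None if no repository has the IU.
    """
    candidates = [
        (parse_version(version), name)
        for name, contents in repositories.items()
        for version in contents.get(iu_id, [])
    ]
    if not candidates:
        return None
    version, name = max(candidates)
    return name, ".".join(str(part) for part in version)


repositories = {
    "Site 1": {"EMF": ["1.0"]},
    "Site 2": {"EMF": ["1.1"]},
}

# Consulting both sites finds the newer EMF on Site 2.
print(best_candidate("EMF", repositories))
# Consulting only Site 1 settles for EMF 1.0.
print(best_candidate("EMF", {"Site 1": repositories["Site 1"]}))
```

The second call shows the trade-off discussed here: restricting the consulted set is faster, but the result may be superseded as soon as the other site is consulted.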
Comment 15 Martin Oberhuber CLA 2008-06-06 06:58:57 EDT
(In reply to comment #14)
>    This is the kind of difference that killed UM, made the user's life
> miserable as well as the one from the packaging author :-)
>    For example, what would happen if the local TM site was missing some of its
> dependencies (willingly or not).

I see your point, and I appreciate the cool advance in technology that P2 is.

But in my case, I have explicitly selected to search my "archived site" repository only. I have explicitly DESELECTED all other known repositories. P2 REALLY shouldn't contact any repositories that I have deselected.

Even if it thinks that it might find something better than I have selected.

Having to wait 2:45 minutes for an install from local that used to take in the range of seconds is a performance regression that's not acceptable for me. And mind you, this is just with 4 repositories (2 of them local!). What if I have a huge install like Ganymede with 20 repositories, all of them remote? How long will I need to wait then, until I can install from my local archived site?

I don't have the time to test this scenario now, but if I imagine this taking in the range of 20 minutes then this issue would be a blocker for our product, IMHO.
Comment 16 Martin Oberhuber CLA 2008-06-06 07:00:33 EDT
PS reading your note about "question of mirror" again, I think the key point of this issue is perhaps that P2 should not ever try to find any mirrors for stuff that's local (local site or archived site).

Even if IUs don't care about provenance: Local sites *are* different than remote network sites because for local sites, mirroring makes absolutely no sense.
Comment 17 John Arthorne CLA 2008-06-06 09:27:47 EDT
Confirmed by looking at the code that mirroring is disabled when running with a local (file:) repository. Code is in SimpleArtifactRepository#getMirror.  Just to clarify, by "local" do you mean a repository whose location uses a file: URL, or is it an HTTP server on your LAN?
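To make the distinction concrete, here is a minimal sketch of classifying a repository location by URL scheme. It is not the logic in SimpleArtifactRepository#getMirror; `is_file_based` is a hypothetical helper, and treating a `jar:` wrapper around a `file:` URL as local is an assumption made for this example.

```python
from urllib.parse import urlsplit


def is_file_based(location):
    """Return True if the location ultimately points at a file: URL."""
    # Strip any "jar:" wrapper, as in "jar:file:/path/to/site.zip".
    while location.lower().startswith("jar:"):
        location = location[len("jar:"):]
    return urlsplit(location).scheme.lower() == "file"


print(is_file_based("jar:file:/folk/mober/Downloads/cdt-master-5.0.0.I200805300802.zip"))
print(is_file_based("http://download.eclipse.org/tools/cdt/releases/ganymede/"))
```

An HTTP server on a LAN would be classified as remote by this check, which is why the question of what "local" means matters.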
Comment 18 Martin Oberhuber CLA 2008-06-06 09:30:15 EDT
file:// URL, entered by means of the dialog (Add Site..., then press the "Local" or "Archive" button). John, if you like, I could set up a Webex or VNC so you can watch me reproduce the issue. 

Though I think that with my steps it should be easy to reproduce for anyone.
Comment 19 Martin Oberhuber CLA 2008-06-06 09:56:30 EDT
What bugs me is that it seems to be reading http:...ganymede... although that Repository definitely wasn't selected and I don't want to get anything from it. Could it be that P2 tries to verify / update some cache?
Comment 20 Susan McCourt CLA 2008-06-06 15:55:22 EDT
I can't speak for why a remote repo would be contacted if you had disabled it in your list.  But I wanted to point you to a general discussion of the repo model as it relates to update manager in case you have thoughts to contribute: bug #234213

Comment 21 Michael Scharf CLA 2008-06-06 20:36:24 EDT
The hanging modal context stack trace looks very similar to stack trace of bug 235140...
Comment 22 John Arthorne CLA 2008-09-25 13:44:08 EDT
*** Bug 248514 has been marked as a duplicate of this bug. ***
Comment 23 John Arthorne CLA 2008-10-10 16:59:32 EDT
*** Bug 249753 has been marked as a duplicate of this bug. ***
Comment 24 Martin Oberhuber CLA 2009-02-10 16:53:38 EST
Just having tried with Eclipse 3.5m5, I believe that the dependency computation performance issues mentioned in this bug are actually fixed; but now, there are problems even bringing up the initial "Install New Software" dialog if the connection is slow.

I have filed bug 264427 for the new issues; this current one might be fixed.
Comment 25 John Arthorne CLA 2009-02-11 22:13:18 EST
Yes, the work of loading the remote repositories has been moved to run before the dialog opens. This way once the dialog is open everything is relatively fast. I'll close this as fixed since you've opened a new one for the same delay that is now in a different place <g>.