Bug 232413 - [ui] Long delay opening updates dialog
Summary: [ui] Long delay opening updates dialog
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2 (show other bugs)
Version: 3.4   Edit
Hardware: PC Windows XP
: P2 critical (vote)
Target Milestone: 3.4 RC2   Edit
Assignee: John Arthorne CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
: 232578 232794 232923 233606 233613 234577 234859 235278 (view as bug list)
Depends on:
Blocks:
 
Reported: 2008-05-15 17:13 EDT by John Arthorne CLA
Modified: 2009-02-10 16:55 EST (History)
14 users (show)

See Also:
dj.houghton: review+


Attachments
Stack trace (8.29 KB, text/plain)
2008-05-15 17:13 EDT, John Arthorne CLA
no flags Details
Logfile showing proxy messages (115.85 KB, text/x-log)
2008-05-19 14:59 EDT, Martin Oberhuber CLA
no flags Details
proposed changes for background fetch of IU info (9.36 KB, patch)
2008-05-20 13:06 EDT, Susan McCourt CLA
no flags Details | Diff
correct patch (9.31 KB, patch)
2008-05-20 13:12 EDT, Susan McCourt CLA
no flags Details | Diff
replacement patch (9.49 KB, patch)
2008-05-20 20:11 EDT, Susan McCourt CLA
no flags Details | Diff
latest and greatest, really... (12.38 KB, patch)
2008-05-21 19:01 EDT, Susan McCourt CLA
no flags Details | Diff
Small update to previous patch (12.39 KB, text/plain)
2008-05-22 11:22 EDT, John Arthorne CLA
no flags Details
Cache localization fragments in a soft ref (5.41 KB, patch)
2008-05-22 14:45 EDT, Dave Stevenson CLA
dstevens: review?
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description John Arthorne CLA 2008-05-15 17:13:10 EDT
Created attachment 100552 [details]
Stack trace

Build id: I20080515-1030

I clicked Help > Software Updates, and then there was a delay of about 10 seconds where the UI was frozen and the dialog did not open.  From a stack trace I can see it is querying repositories to determine localized properties.
Comment 1 Susan McCourt CLA 2008-05-15 18:26:37 EDT
This is a specific example of my general worry yesterday - the fact that a load repo can be triggered almost anywhere now in the UI thread when an IU property is accessed.  

What's happening here is that the deferred fetch gets triggered, and apparently has already returned IU elements.  But the comparator is trying to sort the installed items and it is the first guy asking that a repo get loaded, and it is running (as it should) in the UI thread.  This is happening before the labels are fetched, so I don't think we could grab the data from SWT (ie, if the labels had already been gotten from IUPropertyUtils, we wouldn't be loading this repo now, would we?)

I'm really not sure what to do about this, I'm wondering if we should make IUPropertyUtils do a query only on already loaded repos, and provide placeholder info in the meantime.  After the repos load and the various queries that are needed are completed, the items would be refreshed, or else we could put a listener on the load and refresh them manually.
Comment 2 Pascal Rapicault CLA 2008-05-15 21:54:55 EDT
Several ideas:
- One quick workaround would consists in enabling the translation of the feature only based on a preference. This would allow us to always query for the default strings which would always return immediately. 
Of course in case where the user runs in the non-default language then the problem still persists.
- Open the dialog and fill it in the background.
- Install the translation IU in the system. this could still cause lookups depending on which keys are available.
Comment 3 Susan McCourt CLA 2008-05-15 21:59:02 EDT
>- Open the dialog and fill it in the background.
We are already doing this.  The problem is in the sorting which is done in the UI (and must be as it is the UI objects that do it).  The viewer's sorter is retrieving the property name from the UI thread. 
Comment 4 Susan McCourt CLA 2008-05-16 14:02:51 EDT
marking RC2 as this is a regression of all the work done to ensure that no repository loads happen in the UI thread.

This one is tricky.  In the past, only the available view would trigger a load.  It uses a deferred content mechanism to get its contents.  Since getting the contents are what causes the repo load, this happens in the background as it should.

Now we have an issue where the installed view needs to load.  I stated previously that we should be okay because that fetch is done in the background, too.  But the problem is that getting the contents of the profile rightfully has no need to trigger a load, because no properties are accessed while fetching.

So the first time someone needs the name, we pay the price, and of course this is either going to be sorting, filtering, or fetching the label, all things that must happen in the UI thread.

I'm thinking I could hack around this by having the IU collector query the iu's name when it is first fetched.  The name is going to be needed in the long run, so at least this would ensure that the query and necessary loading happens in the background.

Comment 5 John Arthorne CLA 2008-05-16 15:37:50 EDT
*** Bug 232578 has been marked as a duplicate of this bug. ***
Comment 6 Pascal Rapicault CLA 2008-05-16 15:46:02 EDT
Rather than doing crazy changes in the UI I want to cache the result of a more general query in a SoftReference in the IUPropertyUtils.
We will still take the hit of the repo load, but we will at least not do the quadratic search.
Comment 7 Susan McCourt CLA 2008-05-16 20:12:46 EDT
The problem is that the load can be extremely time-consuming.  It's not a big change to query the name when we fetch an IU.  I think this is critical...better to show "pending..." right away than have an hourglass and no explanation.  On my connection (lots of site, slow connection, bad sites that can timeout) it can take minutes.  

On the bright side, the rest of the pauses in the available features dialog should go away now since we are loading every single repo on dialog open.
Comment 8 John Arthorne CLA 2008-05-16 23:06:07 EDT
I agree with Susan. We need to get the dialog visible before we start hitting remote repositories. This is quite brutal on a slow connection, and it can really just look like it's hung because CPU use is very low. Optimizing the queries once the dialog is open is also a good thing to explore.
Comment 9 Pascal Rapicault CLA 2008-05-17 15:24:55 EDT
Another way to prevent connection to the outside word could be to limit the lookup of the translations to repositories that have a local cache since for this the precision is not paramount.
Comment 10 Pascal Rapicault CLA 2008-05-19 10:26:22 EDT
*** Bug 232794 has been marked as a duplicate of this bug. ***
Comment 11 Martin Oberhuber CLA 2008-05-19 10:51:47 EDT
Build ID: I20080516-1333 (3.4rc1)
For me, this is a blocker, blocking any further testing of P2 since in my Dev Environment I have registered 10 update sites, most of which are quite slow and remote (sitting in Europe, even download.eclipse.org is relatively slow).

This results in a > 5 minutes TOTAL BLOCK of Eclipse whenever I choose Help > Software Updates.. accidentally.
Comment 12 Susan McCourt CLA 2008-05-19 12:36:08 EDT
I understand your pain, Martin.  This is first on my list for RC2.  I didn't think of a low risk fix until the RC1 final build was well on its way.
Comment 13 Susan McCourt CLA 2008-05-19 14:16:20 EDT
I just reread this part:

>For me, this is a blocker, blocking any further testing of P2 since in my Dev
>Environment I have registered 10 update sites, most of which are quite slow and
>remote (sitting in Europe, even download.eclipse.org is relatively slow).

>This results in a > 5 minutes TOTAL BLOCK of Eclipse whenever I choose Help >
>Software Updates.. accidentally.

Martin, I'm surprised you are seeing this every time you open the dialog.  The information is cached.  My experience is that the block is painful, but having endured it, the dialog opens quickly on subsequent opening.  It's not blocking my testing (and I'm on a slow connection so I know how bad this problem can be).

The cache is a soft reference, so you could be getting burned by a gc of the cache.  Maybe I'm not impacted since I'm opening this dialog constantly while testing.   Increasing your memory in the vm args might help?

Another trick would be to include the p2 admin UI in your configuration and open a repo view and leave it lying around.  That will hold onto the cache.
Comment 14 Martin Oberhuber CLA 2008-05-19 14:45:18 EDT
I've had my Eclipse running all day today, I've been opening Help > Software Updates... some 3 or more times today and delays did vary -- first time was approx 8-10 minutes; 2nd time was faster but certainly > 2-3 minutes; 3rd time was approx 5-6 minutes; 4th time now was about 8 minutes again.

My eclipse.exe is 361700K mem usage according to task manager, and my vmargs are 
   -Xmx512m -XX:MaxPermSize=128m
That should suffice?

I had first thought that it should be cached, but it didn't seem so for me. Perhaps checking the cache update state is > 2 minutes already for me? Just searching the TM Update site (bug 232819) takes about 1 minute for me.

I agree that I _could_ be testing some P2 stuff if I really wanted, by getting rid of some extensions. But then that's my normal dev environment, and we should start testing P2 in real life, and I really cannot do that with the Performance I currently see because it blocks my productivity. 
Comment 15 Martin Oberhuber CLA 2008-05-19 14:59:41 EDT
Created attachment 100955 [details]
Logfile showing proxy messages

I just noticed lots of the following messages in my .log file, which might be related to P2 UM being so slow -- full .log is attached. These messages seem to be logged exactly at the times when I open Help > Software Updates...:

!ENTRY org.eclipse.core.net 1 0 2008-05-19 20:15:45.092
!MESSAGE System property http.proxyHost has been set to proxy.eclipse.org by an external source. This value will be overwritten using the values from the preferences

!ENTRY org.eclipse.core.net 1 0 2008-05-19 20:15:45.092
!MESSAGE System property http.proxyPort has been set to 80 by an external source. This value will be overwritten using the values from the preferences
Comment 16 Pascal Rapicault CLA 2008-05-19 15:17:50 EDT
As a workaround, removing the repos from the list would obviously help with the opening time, however then you would have to add them back.... Adding them back could be done by taking a snapshot of all the repos using the import/export functionality in the "managed sites".
Comment 17 John Arthorne CLA 2008-05-20 09:24:33 EDT
*** Bug 232923 has been marked as a duplicate of this bug. ***
Comment 18 Martin Aeschlimann CLA 2008-05-20 09:51:40 EDT
From bug 232413:

This blocked the UI thread for several minutes. All done in the UI thread, no
UI refreshing, no progress monitor, no cancel button.

Comment 19 John Arthorne CLA 2008-05-20 10:31:25 EDT
Re comment #18. Yes, we know it's really bad. The regression was introduced last week, and we didn't realize quite how bad it was until the final RC1 build was underway. This week we will be taking this out of the UI thread, and also optimizing the code to reduce the delay.
Comment 20 Susan McCourt CLA 2008-05-20 13:06:40 EDT
Created attachment 101098 [details]
proposed changes for background fetch of IU info
Comment 21 Susan McCourt CLA 2008-05-20 13:11:01 EDT
>I'm thinking I could hack around this by having the IU collector query the iu's
>name when it is first fetched.  The name is going to be needed in the long run,
>so at least this would ensure that the query and necessary loading happens in
>the background.

This was not the one-liner I had hoped.  Our DeferredQueryContentProvider makes two assumptions that block us:
- the viewer is a tree (so it can use the platform deferred tree content manager).  Currently the installed content is in a table
- the expensive part of getting children is getting the queryable, such as the repo itself, not the work that happens in the query itself.  Now we are proposing to do something really expensive in the query.

This patch provides a very specific fix by starting a fetch job when getting profile content and replacing the placeholder with the real children when finished.  It's more code than I had hoped.   This patch needs further testing before review, I'm not proposing it's ready as is, I'm just trying to quickly report the nature of the change that would be required.

Comment 22 Susan McCourt CLA 2008-05-20 13:12:49 EDT
Created attachment 101099 [details]
correct patch

this is the right patch, the other one was an older copy
Comment 23 Susan McCourt CLA 2008-05-20 13:20:18 EDT
Reassigning to John, as there are multiple fixes being proposed.  I will test with the provided patch today and ping here when i think the UI part is ready for review.
Comment 24 Susan McCourt CLA 2008-05-20 13:44:08 EDT
Also wanted to mention that I cancel the fetch job when the user closes the dialog, but I can see it running in the progress view long after.  This is because of bug 232440.  
Comment 25 Susan McCourt CLA 2008-05-20 17:50:06 EDT
Running this patch today, I've seen a couple more things to address:
- the update button stays invalid until you select something because it initially validates on the "pending" placeholder.  
- the fetch job should show local progress as we do in the available list when filtering/loading.

Comment 26 Susan McCourt CLA 2008-05-20 20:11:35 EDT
Created attachment 101171 [details]
replacement patch

This patch addresses this comment:
>- the update button stays invalid until you select something because it
>initially validates on the "pending" placeholder.    

I added code to explicitly set an empty selection on the viewer when the fetch is done, so that listeners will be notified of a selection change.

>-the fetch job should show local progress as we do in the available list when
>filtering/loading.
I elected not to do this.  I tried it but it was very heavyweight, and then I realized we don't show local progress when we are "Pending..." in the available view.  We only do this when filtering forces a load of all repos, and the user is waiting to continue.

I ran with this code all today and haven't run into any other issues.
It is ready for review.

Obviously this patch doesn't really solve the problem because the user is still waiting forever to see anything.  But at least they cancel...
Comment 27 John Arthorne CLA 2008-05-21 14:40:21 EDT
It worries me to add this hack to the fairly generic InstalledIUCollector. Are you sure there aren't other places where this collector is used that might introduce this expensive code hit in new places? For example it looks like AutomaticUpdateScheduler uses it.
Comment 28 Susan McCourt CLA 2008-05-21 17:42:05 EDT
You are right, John, this same query is done to get the installed IU's when looking for updates.  One caller is in earlyStartup(), but the other is the preference page when the user changes the update schedule.  This could indeed cause a reload because the soft references could be gone.  Need to find a way to handle this, right now all clients use the same collector.
Comment 29 Susan McCourt CLA 2008-05-21 19:01:19 EDT
Created attachment 101405 [details]
latest and greatest, really...

this patch adds a variable to InstalledIUCollector that drives whether a fetch of the IU property is done.  This defaults to false and is set to true by ProfileElement, when ProfileElement has been instructed to do so.  InstalledIUGroup is the only client that instructs the ProfileElement to do this.

Verified by putting a breakpoint in the name retrieval and running automatic updates, changing the update schedule, browsing automatically found updates, and browsing updates using the "Update..." button.  The only place this behavior is triggered is when opening the dialog.
Comment 30 John Arthorne CLA 2008-05-22 11:22:23 EDT
Created attachment 101533 [details]
Small update to previous patch

This update fixes a bug in InstalledIUCollector introduced in the previous patch.
Comment 31 John Arthorne CLA 2008-05-22 12:18:37 EDT
+1 on latest patch
Comment 32 Susan McCourt CLA 2008-05-22 13:06:09 EDT
+1 from me on this patch.

Tested again with breakpoint on the name fetch.
Same results:  Did not fetch the name when running automatic updates, changing the update schedule, browsing automatically found updates, and browsing updates using the "Update..." button.  Only when populating the installed list.  Changing the update schedule is the one that is bad because it happens in the UI thread when changing prefs.  Good here.

To verify that the correct elements are collected when *not* fetching the name, I saw the SDK update proposal in the automatically found updates, so it did correctly query the roots and find updates.  Verified that the updates found using "Update..." (which uses the install list already retrieved by the InstalledIUGroup) and the updates found automatically are the same, which should be the case if you haven't installed anything or changed repos since the check happened.

I also scheduled a daily update for 10am and when it fired, it had the correct updates and did not hit the repos on the name query.

Comment 33 John Arthorne CLA 2008-05-22 14:33:42 EDT
The patch has been released for the 1pm build. Marking fixed. I have opened bug 233523 to discuss further optimization of the query to make this faster.
Comment 34 Dave Stevenson CLA 2008-05-22 14:45:12 EDT
Created attachment 101588 [details]
Cache localization fragments in a soft ref

This patch implements Pascal's suggestion in comment #6: "I want to cache the result of a more general query in a SoftReference in the IUPropertyUtils"
Comment 35 John Arthorne CLA 2008-05-22 23:26:38 EDT
*** Bug 233606 has been marked as a duplicate of this bug. ***
Comment 36 Susan McCourt CLA 2008-05-23 11:15:48 EDT
*** Bug 233613 has been marked as a duplicate of this bug. ***
Comment 37 John Arthorne CLA 2008-05-29 09:35:56 EDT
*** Bug 234577 has been marked as a duplicate of this bug. ***
Comment 38 Susan McCourt CLA 2008-05-30 13:05:59 EDT
*** Bug 234859 has been marked as a duplicate of this bug. ***
Comment 39 John Arthorne CLA 2008-06-03 11:38:11 EDT
*** Bug 235278 has been marked as a duplicate of this bug. ***
Comment 40 Martin Oberhuber CLA 2009-02-10 16:55:48 EST
I still have problems initially opening the "Install New Software..." dialog when the Internet connection is slow or unreliable, see bug 264427. The UI thread remains usable, but this doesn't help me since I'm unable to bring up the "Install New Software" dialog for installing from a locally mirrored or archived update site...