Bug 211441 - [performance] Allow to disable automatic queries synchronization on some specific queries
Status: RESOLVED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Mylyn
Version: unspecified
Hardware: PC Windows XP
Importance: P2 enhancement
Target Milestone: 3.2
Assignee: Steffen Pingel CLA
Keywords: noteworthy
Duplicates: 227093 247893 263127
Blocks: 239667
 
Reported: 2007-11-29 07:13 EST by Michael Scharf CLA
Modified: 2011-06-29 07:05 EDT

Attachments
mylyn/context/zip (10.03 KB, application/octet-stream)
2009-06-01 04:46 EDT, Steffen Pingel CLA

Description Michael Scharf CLA 2007-11-29 07:13:48 EST
I have a few 'heavy' Bugzilla queries. I look at them from time to time. Mylyn synchronizes those heavy queries quite often. This causes unnecessary load on the Bugzilla server. I would prefer to synchronize some queries on demand. There seems to be no obvious way to exclude some queries from automatic synchronization.
Comment 1 Mik Kersten CLA 2007-12-17 23:10:21 EST
Michael: note that the performance of query synchronization is dominated more by the number of tasks in your Task List that have changed on the server, and less by the size of the queries.  This is because we synchronize all changed tasks.  As such, I'm not sure that this control would provide you with the effect that you're after.  Is the problem you're seeing that the bugs.eclipse.org synchronization takes too long?  We have other ideas for optimizing that.
Comment 2 Eugene Kuleshov CLA 2007-12-17 23:51:30 EST
Somewhat related: if disabling automatic updates for some queries is considered, all the tasks from these queries could also be excluded from the synchronization of changed tasks.
Comment 3 Steffen Pingel CLA 2008-01-10 17:10:57 EST
While the number of query results being synchronized is not as much of an issue with Bugzilla, it can consume significant bandwidth for other repositories such as JIRA, which transfers all details for each queried bug. The CPU and memory overhead imposed on the repository by a complicated query can also be significant.

Shawn and I had a quick discussion about a simple UI: a new sub-menu could be added to the context menu of a query with the following items:

 Synchronization Priority ->
  Default (every time)
  Low (once a day)
  -
  Do not Synchronize
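
A minimal sketch of how the three menu entries above could drive a scheduler's decision. This is an illustration only, not Mylyn code: the names `SyncPriority` and `is_due` are hypothetical, and "once a day" is assumed to mean that at least 24 hours have passed since the query was last synchronized.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class SyncPriority(Enum):
    """Hypothetical per-query setting mirroring the proposed sub-menu."""
    DEFAULT = "default"          # synchronize on every scheduled run
    LOW = "low"                  # synchronize at most once a day
    DO_NOT_SYNCHRONIZE = "none"  # never synchronize automatically


def is_due(priority: SyncPriority, last_sync: Optional[datetime], now: datetime) -> bool:
    """Decide whether a scheduled run should include a query."""
    if priority is SyncPriority.DO_NOT_SYNCHRONIZE:
        return False
    if priority is SyncPriority.DEFAULT or last_sync is None:
        return True
    # LOW: skip the query until a full day has passed since its last sync.
    return now - last_sync >= timedelta(days=1)
```

An explicit, user-triggered synchronization would bypass `is_due` entirely, so a query marked "Do not Synchronize" could still be refreshed on demand.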
Comment 4 Eugene Kuleshov CLA 2008-01-10 18:08:57 EST
As a user I would prefer to have control over the synchronization intervals, and a submenu with hardcoded intervals doesn't provide enough flexibility. Team VCS synchronization has a special UI for configuring scheduled synchronization, so maybe it would be better to have something similar in the query configuration UI.
Comment 5 Mik Kersten CLA 2008-01-11 16:51:55 EST
(In reply to comment #3)
> Shawn and I had a quick discussion about a simple UI: a new sub-menu could be
> added to the context menu of a query with the following items:

Did you mean this to be a per-repository setting?  Shouldn't our synchronization time be dominated by the tasks that are synchronized, not by the queries?
Comment 6 Steffen Pingel CLA 2008-01-11 17:59:33 EST
The idea was to set it per query. 
Comment 7 Mik Kersten CLA 2008-01-11 20:20:06 EST
(In reply to comment #6)
> The idea was to set it per query.

Afaik, for Bugzilla the synchronization time is more likely to be dominated by the incremental synchronization of all tasks than by the synchronization of all queries.  As such I'm not sure that this would address Michael's problem, since the 'heaviness' is not likely to come from the queries themselves.

Before exposing more implementation details of synchronization to users I think that we need to identify the bottlenecks and see what we can do to remove them.  I created bug 215100 for that discussion.
Comment 8 Michael Scharf CLA 2008-01-25 12:51:25 EST
Another side effect of the automatic synchronization is that I get all those popups that notify me that an entry has changed in my 'heavy' query. I am only interested in immediate notifications of changes in my more specific queries. I want to synchronize the 'heavy' query on demand...

But maybe this is a very specific requirement and it might be more confusing than helpful.....
Comment 9 Steffen Pingel CLA 2008-01-25 14:11:41 EST
That sounds close to another use-case: for planning I occasionally run a bunch of searches (e.g. open Mylyn 2.3 bugs / open JIRA bugs / open API bugs). The search dialog only remembers the last query, requiring me to re-enter each search. To avoid that I have added most queries to my task list, but that causes extraneous synchronization and notifications.

We have been bouncing around the idea of resurrecting an old Mylyn Bugzilla feature that allowed saving searches and had a UI to quickly run them. Would that also help with your use-case?
Comment 10 Eugene Kuleshov CLA 2008-01-25 14:48:11 EST
I would also prefer a static set of queries in a task list (or in the worst case some other view) rather than a search view. It would also be handy to be able to clone the Task List view (bug 151432), use different presentations and "go into", and then quickly jump between particular views, which would be more flexible than a single working set.
Comment 11 Mik Kersten CLA 2008-02-20 00:52:17 EST
Lowering priority since we have implemented considerable query synchronization improvements that should address some of the performance problems.
Comment 12 Mik Kersten CLA 2008-04-17 19:15:50 EDT
*** Bug 227093 has been marked as a duplicate of this bug. ***
Comment 13 Vladimir Lifar CLA 2008-04-18 06:11:03 EDT
(In reply to comment #11)
> Lowering priority since we have implemented considerable query synchronization
> improvements that should address some of the performance problems.

Are these improvements released in 2.3?
AFAICS the problem still exists in 2.3. I have a lot of "low-priority" queries, and they require too much time to update.
Comment 14 Robert Elves CLA 2008-06-14 00:55:14 EDT
Need to defer: http://wiki.eclipse.org/index.php/Mylyn/3.0_Plan#Deferred_Items
Comment 15 Mik Kersten CLA 2008-06-16 13:00:30 EDT
Unless we have clear indication that this will improve real performance, not just perceived performance, I don't think we should have it scheduled.
Comment 16 Steffen Pingel CLA 2009-02-04 14:02:59 EST
*** Bug 247893 has been marked as a duplicate of this bug. ***
Comment 17 Steffen Pingel CLA 2009-02-04 14:10:24 EST
*** Bug 263127 has been marked as a duplicate of this bug. ***
Comment 18 Kevin Benton CLA 2009-02-04 15:39:42 EST
(In reply to comment #15)
> Unless we have clear indication that this will improve real performance, not
> just perceived performance, I don't think we should have it scheduled.
> 

Certain queries do cause high load on Bugzilla servers, such as those that use some of the advanced query capabilities requiring fields that are not indexed together.  In those cases, it's not up to the user to fix the indexing (and it may not be practical to do so).  I often see issues with queries that use advanced search options like "changed in the past n days" and/or look for text in the comments.

These types of queries can really hammer a Bugzilla server if the data set is reasonably large (e.g. 50K bugs).

BEO has a number of these types of queries that become problematic, as I've seen.  At times, Mylyn would ask BEO for too much data, causing my IP to become blocked due to the sheer volume of data being transmitted back to me.
Comment 19 Jörg Thönnes CLA 2009-02-05 06:13:56 EST
I am using Mylyn quite heavily with a lot of queries in different repositories grouped by working sets:

*  10 working sets
*  12 repositories:
**  Local
**  Bugzilla: Eclipse, Tasktop
**  JIRA: Codehaus, QuickFIX/J
**  Trac: Local test, Mylyn test, company Admin+Development
**  Web: GlassFish + GlassFish Plugins (IssueZilla), legacy company tracker
*  about 50 queries, most of them (35) for my local company Trac

Scheduled updates therefore put a heavy burden on my Eclipse VM so that I reduced the rate to once
every hour.

But this does not fit my needs:

*  Schedule rate should be configurable
**  per repository
**  per query (inherit from repository)
**  per working set (overrule queries/repositories in working set)
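
The precedence described in this list could be sketched as follows. This is a hypothetical illustration rather than anything in Mylyn: `effective_interval` and its parameters are invented names, intervals are assumed to be minutes, and `None` is assumed to mean "not configured at this level".

```python
from typing import Optional


def effective_interval(
    global_default: int,
    repository: Optional[int] = None,
    query: Optional[int] = None,
    working_set: Optional[int] = None,
) -> int:
    """Return the synchronization interval in minutes for one query."""
    # A working set overrules the queries and repositories it contains.
    if working_set is not None:
        return working_set
    # A query's own setting wins over its repository's setting.
    if query is not None:
        return query
    # Otherwise the query inherits from its repository, then the global default.
    if repository is not None:
        return repository
    return global_default
```

For example, `effective_interval(60, repository=10)` yields 10 for every query in that repository until a query or working set sets its own value.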

The best would be to be able to schedule this like UNIX cron jobs. Then I could e.g. say

*  Mylyn queries at every full hour in the afternoon
*  Subversive queries 5 minutes past noon
*  Local company stuff: never at full hour, but every 10 minutes

Together with this, the notifications should also be configurable, e.g. enabled per repository, query, or working set.

Focusing on a working set could also prevent some scheduled updates or suppress notifications.

I am sure this task should be broken down into sub-tasks...

As a first step I would like to see a configuration per repository.

A UNIX cron job style configuration would also be very flexible, but it could serve as an implementation idea
rather than being presented to the average user.
Comment 20 Jörg Thönnes CLA 2009-02-05 06:16:29 EST
Voted on this issue since I regard it as having a major impact on Mylyn's usability.

Would prefer a first implementation, e.g. per-repository synchronize jobs which are separately cancellable
and configurable as a sub-task.
Comment 21 Mik Kersten CLA 2009-02-06 10:52:44 EST
Kevin, Jörg: We plan to take a pass on query performance for the Eclipse Galileo-based Mylyn 3.2.  However, note from my comment#1 that our sync time is generally dominated by the number of changed tasks that have to be retrieved, and not by the time to run the queries themselves.  If we were able to read a feed of changed tasks, it would speed up dramatically and this problem might not manifest itself in a visible way.

That said, I like your suggestion of putting the configuration on the repository settings.  That's a lot less complicated than on the queries.  Also, it enables users and potentially templates to specify slow intervals for the large OSS repositories (e.g., Eclipse, Mozilla), and a fast interval for smaller company-internal repositories (e.g., 10min).  

Rob, Steffen: thoughts?
Comment 22 Jörg Thönnes CLA 2009-02-06 11:07:34 EST
(In reply to comment #21)
> Galileo-based Mylyn 3.2.  However, note from my comment#1 that our sync time is
> generally dominated by the number of changed tasks that have to be retrieved,
> and not by the time to run the queries themselves.  If we were able to read a
> feed of changed tasks, it would speed up dramatically and this problem might not
> manifest itself in a visible way.

In our case, the number is not so high, but the server is slow :-(
 
> That said, I like your suggestion of putting the configuration on the repository
> settings.  That's a lot less complicated than on the queries.  Also, it enables
> users and potentially templates to specify slow intervals for the large OSS
> repositories (e.g., Eclipse, Mozilla), and a fast interval for smaller
> company-internal repositories (e.g., 10min).

This sounds good as a start.

Further, it would be a good idea to have working sets to modify this behaviour:
All repos/queries in a working set are queried more often than those in the others.

Are you able to provide a first solution quickly?

If there is a background job per repository, I could cancel it by hand if it takes too long. So splitting the
synchronize job into several ones per repo would also help in the very first place.
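
A rough sketch of that split, under the assumption that queries are known as (repository, query) pairs. `RepositorySyncJob` and `jobs_per_repository` are hypothetical names, not Mylyn or Eclipse APIs, and cancellation here only takes effect between queries.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


class RepositorySyncJob:
    """Hypothetical job that synchronizes the queries of one repository."""

    def __init__(self, repository: str, queries: List[str]) -> None:
        self.repository = repository
        self.queries = queries
        self.cancelled = False

    def cancel(self) -> None:
        # Checked before each query starts; a running query is not interrupted.
        self.cancelled = True

    def run(self, sync_query: Callable[[str, str], None]) -> List[str]:
        """Synchronize queries in order and return the ones that completed."""
        done: List[str] = []
        for query in self.queries:
            if self.cancelled:
                break
            sync_query(self.repository, query)
            done.append(query)
        return done


def jobs_per_repository(queries: Iterable[Tuple[str, str]]) -> Dict[str, RepositorySyncJob]:
    """Group (repository, query) pairs into one job per repository."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for repository, query in queries:
        grouped[repository].append(query)
    return {repo: RepositorySyncJob(repo, names) for repo, names in grouped.items()}
```

Cancelling the job for a slow repository would leave the jobs for the other repositories untouched.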

Thanks, Jörg
Comment 23 Mik Kersten CLA 2009-02-13 18:31:11 EST
Jörg: You are starting to convince me.  While I still think that we should avoid allowing queries to have their own synchronization schedule, it might not be that bad to allow synchronization to be disabled per-query, as long as we have some kind of visual indicator that it's disabled (e.g., an overlay on the query node, and a tooltip that mentions auto synch is disabled).  Would you be interested in providing a patch for this?
Comment 24 Jörg Thönnes CLA 2009-02-14 02:30:11 EST
Mik, I was also thinking about your remarks that the UI shall be kept simple. So I wonder whether
the scheduling functionality could be made API in order to develop extra plugins which supply more
complex functionality, while having a simple default implementation?
Comment 25 Mik Kersten CLA 2009-02-17 14:14:47 EST
That could make sense.  Tentatively adding to 3.2 plan.
Comment 26 Mik Kersten CLA 2009-04-13 19:00:18 EDT
Proposed UI design:
* Add a check menu item to the query's popup menu which says "Synchronize Automatically", below the "Synchronize" menu item
* Decorate those queries with a different repository icon (I'll provide that, related to "disconnected" repositories)
Comment 27 Steffen Pingel CLA 2009-06-01 04:46:40 EDT
I have added a "Synchronize Automatically" check box to the task list menu. The check box is enabled by default. If unchecked, a query is not updated unless it is forcefully synchronized or a synchronization of all queries is forced.

The icon currently uses the gray style if a query is excluded from automatic synchronization but that may need improvement. 
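
The selection rule described above can be sketched like this. It is a simplified illustration, not the committed Mylyn change: `Query`, `auto_sync`, and `queries_to_update` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Query:
    name: str
    auto_sync: bool = True  # "Synchronize Automatically" is enabled by default


def queries_to_update(
    queries: List[Query],
    forced: bool,
    selection: Optional[List[Query]] = None,
) -> List[Query]:
    """Pick the queries one synchronization run should refresh."""
    # A run covers either an explicit selection or every query in the list.
    candidates = queries if selection is None else selection
    if forced:
        # A user-forced run ignores the check box, for one query or for all.
        return list(candidates)
    # A scheduled run skips queries that opted out of automatic synchronization.
    return [q for q in candidates if q.auto_sync]
```

With one query unchecked, a scheduled run returns only the remaining queries, while a forced run over all queries still includes it.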

Comment 28 Steffen Pingel CLA 2009-06-01 04:46:49 EDT
Created attachment 137816
mylyn/context/zip