72358 – [Viewers] Support SWT.VIRTUAL style

Bug 72358 - [Viewers] Support SWT.VIRTUAL style

Summary: [Viewers] Support SWT.VIRTUAL style

Status:	VERIFIED FIXED

Alias:	None

Product:	Platform
Classification:	Eclipse Project
Component:	UI (show other bugs)
Version:	3.0
Hardware:	PC Windows 2000

Importance:	P2 normal with 2 votes (vote)
Target Milestone:	3.1 M3
Assignee:	Tod Creasey
QA Contact:

URL:
Whiteboard:
Keywords:

Duplicates (1):	57795 (view as bug list)
Depends on:	76391
Blocks:	57795
	Show dependency tree

Reported:	2004-08-20 11:42 EDT by Nick Edgar
Modified:	2004-12-01 05:07 EST (History)
CC List:	16 users (show)

See Also:

Attachments
Patch for work so far (2.25 KB, patch) 2004-08-26 11:57 EDT, Tod Creasey	no flags	Details \| Diff
Patch for o.e.ui.ide (10.09 KB, patch) 2004-10-04 21:44 EDT, Stefan Xenos	no flags	Details \| Diff
o.e.jface patch (92.04 KB, patch) 2004-10-04 22:15 EDT, Stefan Xenos	no flags	Details \| Diff
patch for o.e.ui.tests (42.28 KB, patch) 2004-10-04 22:27 EDT, Stefan Xenos	no flags	Details \| Diff
Updated table viewers and test suites (9.36 KB, application/octet-stream) 2004-10-15 10:17 EDT, Tod Creasey	no flags	Details
A version of the TableViewer to support virtual (27.22 KB, text/plain) 2004-10-15 16:38 EDT, Tod Creasey	no flags	Details
Show Obsolete (5) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Nick Edgar

2004-08-20 11:42:29 EDT

3.0

JFace viewers (namely the TableViewer and CheckBoxTableViewer) should support
SWT.VIRTUAL style.  When this style is specified, SWT.SetData callbacks are
fired to fill the items on demand.  If SWT.VIRTUAL is specified on the table
passed to its constructor, the viewer should avoid populating the table eagerly,
but add an SWT.SetData event handler to do the work as needed.

This should work with the existing content provider API, but this requires
determining all model elements first, even if the SWT items are not eagerly
updated.  The existing content provider API can be used to determine the number
of model elements, and configure the table appropriately (see
Table.setItemCount(int)). 
Should also extend content providers to allow the number of elements to be
requested independently of the model elements themselves.

Comment 1 Nick Edgar

2004-08-20 13:44:37 EDT

Candidate uses:
- problems view
- search results (when in table mode)

Comment 2 Tod Creasey

2004-08-25 16:32:10 EDT

Adding Stefan as this will likely affect some problems view work he is looking 
into.

Comment 3 Tod Creasey

2004-08-26 11:57:56 EDT

Created attachment 14195 [details]
Patch for work so far

Here is a quick patch for what I have done so far. I am going to hold off
releasing anything as we are hoping to revisit the Viewer story more completely
in 3.1 as there are quite a number of things we wish to support more
generically.

Comment 4 Tod Creasey

2004-09-27 16:07:46 EDT

*** Bug 57795 has been marked as a duplicate of this bug. ***

Comment 5 Stefan Xenos

2004-10-02 00:00:55 EDT

Tod: I've been working on a patch that moves the sorting and filtering into a
background thread -- it also speeds up insertions drastically.

I'm still trying to get removals working fast enough, but sometime next week
I'll try to attach a patch here that implements the algorithm to fix this one,
bug 62002, bug 21284, and bug 28789...

Comment 6 Ed Burnette

2004-10-02 00:57:49 EDT

I've had use cases where I don't really know the number of elements in the 
table or tree or text file until I go out and read them, which I don't want to 
do because it could take an arbitrarily long amount of time. I'm hoping this 
kind of case will be addressed by the new API. And yes, that makes 
scrolling... interesting. In other programs we've used either a fixed size 
thumb or a thumb that gets smaller and smaller as you scroll down (to some 
minimum size). And we left some slack at the end so there was room to click to 
page down under the thumb.

Comment 7 Konstantin Scheglov

2004-10-04 00:45:22 EDT

  In Delphi from database tables, when it does not know exact number on records
it just shows thumb in two states: top (when we are on first record), center
(when we are on some other record). When you do page up, it scrolls on one page
up, but stays thumb stays in center (if begin or record set is not finished).
  Once dataset is fully fetched, it starts to show also bottom state and any
intermediate states.

Comment 8 Stefan Xenos

2004-10-04 21:44:36 EDT

Created attachment 14991 [details]
Patch for o.e.ui.ide

Adds a concurrent content provider for the ConcreteMarkers used in the Problems
view (for testing)

Comment 9 Stefan Xenos

2004-10-04 22:15:25 EDT

Created attachment 14992 [details]
o.e.jface patch

Rough patch for JFace -- adds the core classes needed for incremental
background sorting and SWT.VIRTUAL support. 

Does not include any changes to TableViewer, but includes all the algorithms
and data structures for the concurrency bits. To use this stuff in TableViewer,
do the following:

1. Create a ConcurrentTableManager as a member variable. (It encapsulates all
the threading issues and nifty data structures that make this work).

2. Implement IConcurrentTable in TableViewer or an anonymous inner class, and
use it to get callbacks from ConcurrentTableManager (all the callbacks will
happen in the UI thread).

3. New content providers will need to implement IConcurrentContentProvider.
This provides change notifications and uses a "requestUpdate" method which is
essentially an asynchronous getElements. Content providers can read their input
in a background thread and report whatever is known so far using a sequence of
change notifications.

4. A standard IConcurrentContentProvider can be implemented to wrap old-style
structured content providers -- it won't provide any change notifications, but
TableViewer can delegate its add/remove/update methods to the versions on
ConcurrentTableManager, so should be backwards-compatible. 

5. TableViewer won't need its map of elements onto TableItems -- the new
callbacks will include the TableItem index when updating an item, so no map is
needed... but it should keep a map of elements onto the last known result of
all the label/colour/font/whateverProviders -- otherwise there will be
increased overhead due to label providers (the new algorithm avoids slow
insertions/removals by overwriting the visible TableItems rather than inserting
or removing items).

This is enough to implement and test the changes to TableViewer, but there is
still lots to do:
- Lots of javadoc issues, unused code to clean up, etc.
- No background filtering yet (currently, filtering needs to be implemented in
the content provider -- we can probably worry about this in version 2. :-)  ).
- Scroll position needs some finessing. Currently, it remains on the same line
(by index) when items are added/removed, but it should probably follow whatever
item is at the top of the set of visible items.

BTW, I've done some initial benchmarks and this is FAST. Check out the next
attachment.

Comment 10 Stefan Xenos

2004-10-04 22:27:32 EDT

Created attachment 14993 [details]
patch for o.e.ui.tests

Adds tests and examples. In particular:

- TestConcurrentTable is an example TableViewer-like class that demonstrates
how to talk to the concurrency stuff. (It only has one column which displays
the objects' toString() values)
- ListContentProvider is an example IConcurrentContentProvider that is
essentially a list-with-change-notifications
- ConcurrentTableTestView is a view containing a concurrent table. It allows
large numbers of random elements to be inserted/deleted interactively (to
remove the 400-element limit, you need to comment out the line that says
"table.setLimit(400)")
- TestMarkerView is a view that displays all markers in the workbench using a
concurrent table. It is somewhat faster than the problems view and can display
new markers as they are discovered by the build rather than waiting.

Comment 11 Stefan Xenos

2004-10-04 22:38:33 EDT

Tod: I think this includes everything we've been discussing... but updating
TableViewer itself is still going to be a scary task. Good luck. :-)

Please let me know if I've overlooked anything... if everything's okay, I'll
continue testing and cleaning up what I've attached here.

Comment 12 Stefan Xenos

2004-10-04 23:33:06 EDT

Regarding comment 6 and comment 7:

This algorithm should be able to handle large tables much better than the
current TableViewer. The content provider can search concurrently, and report
records to the TableViewer as it finds them. Insertions will be very fast. 

Performance-wise, it should work quite well... but the scroll bar won't behave
as you describe. As elements are inserted below the currently-visible range, the
scrollbar will tend to move up.

Note: these algorithms assume that the viewer is doing some sort of sorting. If
the viewer was unsorted (ie: if the content provider is determining the order),
then a faster algorithm would still be possible.

Comment 13 Ed Burnette

2004-10-05 13:13:25 EDT

It looks like you're assuming a model where you start off displaying the top 
part of a table, and there is a background thread that is streaming data into 
the table as hard as it can. That's not necessarily going to be the case. I'd 
rather not read data that is not going to be needed. You can stream a screen 
full or two in, but don't try to slurp up the whole data set.

Also it looks like you're trying to eventually have the whole data set in 
memory. ConcurrentTableUpdator.setTotalItems() does a "new Object[newTotal]" 
and a "System.arraycopy" to reallocate the knownObjects array. Two problems 
with this. First, suppose setTotalItems is called a bunch of times, each time 
with a total one more than the last time. You've got an O(n^2) algorithm then. 
Second problem, suppose you have a billion possible items in the table, and 
you've got a really fast data source, and the background thread is feeding you 
new rows like crazy. You're going to blow out your client memory limit very 
quickly.

I think of virtual tables as showing you a window on your data, not as a 
device for lazy loading. There's a subtle difference.

Before anybody says, showing a billion rows in a table is a bad user 
interface, I agree. Here's a use case. Say you have a database application 
that allows the user to enter a query and see the results. The user, perhaps 
unintentionally, enters a query that returns a billion rows. You don't want 
the system to lock up or beat up the database trying to return all those rows. 
Instead, you want to show the first so many of them and quiesce until he 
scrolls down to fetch more. By not eating the CPU, Memory, or Network, the 
system remains responsive in the typical case that the user will want to 
change the query and resubmit.

BTW, Updator should be renamed Updater for consistency with spelling elsewhere.

Comment 14 Stefan Xenos

2004-10-05 14:41:13 EDT

Ed, thanks for taking the time to examine the code. If you're interested in more
details about the algorithm, see bug 62002.

The code does not assume that you are looking at the top elements (it will
always zero in on whatever range you are currently viewing). However, you are
correct in that it assumes you have a data source that is reading in items and
dumping them into the viewer as fast as it can. 

It would be possible to only read a subset of the data from the data provider if
the viewer was not sorting the input it got from the content provider. In that
case, the Nth item from the content provider is also the Nth item in the viewer,
and the viewer can ask the content provider specificially for the items it
needs... but that is not the purpose of this algorithm. This algorithm is
specifically for the case where the viewer is sorting or filtering the input
coming from the content provider. In this case, there is no relationship betwen
the order of visible items and the order of items provided by the content provider.

For the case where you have an unsorted viewer with no filter, what you really
need is a way to replace ConcurrentTableManager itself (this is the object that
gets asked for items in a certain range and can choose not to compute items
outside that range). It would make sense to write a common base class or
interface for ConcurrentTableManager that clients can implement if they want to
do this type of thing. However, it would probably make sense to implement this
in a completely new TableViewer (since the current TableViewer, in general, uses
sorting and filtering and would have a different kind of content provider).

Basically, what I'm suggesting is this (excuse the stupid class names):

- NewTableViewer takes its input from a SubsetContentProvider.
SubsetContentProvider can return items in a particular range, and can be
subclassed to handle situations where the content provider doesn't need to
compute the entire set of elements in order to know the elements in the given
range (as with your database example).

- The current ConcurrentTableManager subclasses SubsetContentProvider. It talks
with an IConcurrentContentProvider, which returns unordered (and unfiltered)
items, sorts them, and filters them. Clients will subclass
IConcurrentContentProvider in all the situations where they would currently
subclass StructuredContentProvider (but the new version will give them
background sorting and filtering for free).

- The current TableViewer will use a NewTableviewer and a ConcurrentTableManager
internally to do pretty much what it does now (only much faster).

I *think* this should handle both your use-cases and mine quite efficiently.
SubsetContentProvider (the thing you'd subclass to implement your content
provider) would implement the methods setRangeOfInterest(int, int) and
refresh(). It would send its output asynchronously to something very much like
ConcurrentTableUpdator...

However, all this doesn't need to happen at once. We could use the new classes
in their current form (with package visibility) to make the current TableViewer
much faster... then we could expose new API allowing ConcurrentTableManager to
be replaced by a new content provider interface.

Does this make everyone happy? :-)

Comment 15 Ed Burnette

2004-10-05 15:02:20 EDT

Well I don't want to argue too much since you have code in hand. But I don't 
think a subsetting data provider is incompatible with sorting and filtering, 
especially with filtering. Clearly subclasses are the solution as you 
suggested but I don't want to have a special version of TableViewer, 
TreeViewer, and TableTreeViewer just for this. The strategy pattern of content 
providers should be sufficient to handle all these use cases. One content 
provider eagerly fetches all the data up front. One content provider gives you 
just enough data to get you started then eagerly fetches the rest in the 
background. And one content provider gives you just enough data to get you 
started, maybe a little read-ahead, and quiesces. The view is updated with 
whatever model data the content provider provides, modulo any filters and 
sorting that the view applies.

Comment 16 Stefan Xenos

2004-10-05 15:05:08 EDT

Re: comment 13

You can actually handle the "billion rows" case with the current algorithm using
the element limit in ConcurrentTableManager. You set the element limit to...
say... 100, and leave your content provider running in the background with low
priority querying the database for all billion rows. The viewer will apply its
sorting and only display the top 100 rows to the user (allowing everything else
to be garbage collected as it arrives).

The user can then change and resubmit their query, which will cancel the running
query start computing the new elements. The viewer will only hold onto 100
elements for any length of time, eliminating the danger of OOMEs (and making the
table easier to navigate).

Comment 17 Ed Burnette

2004-10-05 15:24:03 EDT

So I could stream a lot of data from the data source only so it could be 
discarded? Hmmm... nah. That's kind of like HTTP streaming of media files, 
except if I'm viewing the start of a movie it's likely I'm going to watch the 
movie all the way through. But, if I'm viewing part of a huge data set it's UN-
likely I'm going to scroll through all those rows.

Maybe your ConcurrentContentProvider would be more aptly named 
StreamingContentProvider.

Comment 18 Stefan Xenos

2004-10-05 15:42:23 EDT

Re: comment 15

I'm not trying to argue: I just want to be certain we understand your use cases
and have a good plan to solve them.

If that's all you're looking for, then you can already handle these use cases
with the algorithm I've attached here. The one you can't do is the content
provider that provides enough data to get started with read-ahead, sleeps, and
is then __woken up__ later when the viewer needs more information. 

The question is: how does the viewer know that it needs more information? As
long as it isn't sorting or filtering, it can simply ask the content provider
directly for the items at particular row numbers (as suggested in bug 21284)...
but if it is permuting (AKA sorting) the items in the viewer then what question
can it ask the content provider in order to request more elements, and when
should it ask that question? It can't identify elements by their sorted index
because the content provider has no knowledge of the sort order, and identifying
items by any other means wouldn't allow the viewer to request items currently in
the visible window.

I believe comment 12 would solve this use-case, but it would require two
different content provider interfaces (for ordered and unordered input). You
seem to be suggesting that one content provider could do both... if so, what
would it look like?

Comment 19 Stefan Xenos

2004-10-05 16:03:15 EDT

Re: comment 17

If you're sorting the data set, you can't know what the "first" 100 items are
unless you've seen the entire data set. The fact that you will discard and
garbage collect most of the rows doesn't mean you don't need to see them in the
first place.

Please clarify: are you saying you know of an algorithm that will allow us to
identify the items that will be visible in a particular range after sorting is
applied and request them from the content provider before we know the full set
of items? (If so, please specify details) Or are you saying that sorting isn't
interesting for your viewer and you want to take advantage of this in order to
only request items from the content provider as they become visible?

Comment 20 Ed Burnette

2004-10-05 16:58:21 EDT

Bug 21284 sounds like what I had in mind, thanks for the pointer. Structured 
viewers would query the content provider for ranges of elements and children, 
keyed by the unsorted, unfiltered row/child number. If there was a sort 
active, the viewer would have to keep querying until it had the whole data set 
or the action was aborted. If there was not a sort active then it could quit 
calling the content provider as soon as it had enough data to display (via 
LabelProvider and setData) all the currently visible elements. If the model 
changed, the content provider would call the viewer like it does now to make 
it update items that may or may not be currently known.

Maybe that's what you're doing but I'm having a hard time looking at the code 
and relating it to the existing TableViewer and IStructuredContentProvider. 
See also org.eclipse.target.internal.ui.DeferredContentProvider.

Just thinking "out loud" here but for more efficiency I've often thought that 
we needed something like an "ordered content provider" that pushes the sorting 
functionality from the viewer down into the content provider. Same goes for 
filtering I guess. For example, if the content is coming from a database, it's 
a simple matter to arrange for the data to come back in a certain order or 
with certain rows removed. Another example, if you're viewing a list of all 
compiler warnings from all projects in a workspace and you're filtering on a 
single project there's no point of even touching those other projects to 
enumerate the warnings and then throw them away. If you had one of these 
ordered content providers then you could query it by sorted/filtered row/child 
numbers which would be easier on the TableViewer.

Comment 21 Tod Creasey

2004-10-05 16:59:06 EDT

Note that Stefans work is more related to Bug 60117 although it may also be 
helpful in the virtual case.

Comment 22 Stefan Xenos

2004-10-05 17:23:26 EDT

I'm going to leave this in its current state for a bit: I need to catch up on
other PRs.

Comment 23 Tod Creasey

2004-10-15 10:17:38 EDT

Created attachment 15187 [details]
Updated table viewers and test suites

Here is a zip file with an updated TableViewer to handle Stefans required
changes and an updated version of ConcurrentTableTestView that uses it.

The main issue I have with this is the need for listening to the scroll bars -
we should decide on what the TableViewer needs to do to support this if it is
required - especially in the case where there are going to be updates because
of the SWT.VIRTUAL case.


I am going to look into the virtual case with this in mind and continue from
here.

Comment 24 Tod Creasey

2004-10-15 10:22:09 EDT

Stefan (and everyone else) I am going to reapply these patches and move the 
concurrency discussion to Bug 60177 as these turn out to be fairly seperate 
problems and I want to keep discussion focussed here on the virtual issues 
which are somewhat different from Stefans work.

Comment 25 Tod Creasey

2004-10-15 10:22:50 EDT

Sorry - Bug 60117

Comment 26 Tod Creasey

2004-10-15 16:38:49 EDT

Created attachment 15205 [details]
A version of the TableViewer to support virtual

I am obsoleting these other patches as they are now in Bug 60117. I am
attaching a work in progress version of TableViewer that supports SWT.VIRTUAL
but I will not be releasing it until Bug 76391 is dealt with and I have been
able to test this suffeciently.

Comment 27 Stefan Xenos

2004-10-15 16:45:05 EDT

Sounds good, Tod. I should have put these patches in bug 60117 to start with.

BTW, why do you need to fix bug 76391 to get SWT.VIRTUAL working? The SetData
callback shouldn't be necessary since you can use the current scroll position to
determine (much more efficiently) what needs to be populated.

Comment 28 Tod Creasey

2004-10-18 07:46:03 EDT

The lookup is O(1) for the remaining entries so whether I do it myself or 
after the callback there should be no difference in effeciency.

The issue with doing it myself is that all of the TableViewer code currently 
relies on there being a non null data (the Object we are representing) in all 
of the entries. It is a much lower risk (and more effecient) fix for me if I 
don't have to add null checking to all of the API that uses the elementMap.

Comment 29 Tod Creasey

2004-10-18 11:45:25 EDT

I have this ready now - it turns out I won't need any changes from SWT as it 
really only affected the new test suites I wrote.

In the interest of safety I am not going to release this until tomorrow 
(October 19) so that it does not affect the integration build.

Comment 30 Stefan Xenos

2004-10-18 19:36:11 EDT

Please explain: how can you implement SetData in O(1) time when indexof(...)
runs in O(n)?

Comment 31 Ohmimi Monar

2004-10-19 06:50:13 EDT

TableViewer.java line 718
>items[i].setImage(new Image[table.getItemCount()]);//Clear all images
seems to be a typo. I think
>items[i].setImage(new Image[table.getColumnCount()]);//Clear all images
makes more sense.

Comment 32 Tod Creasey

2004-10-19 08:05:20 EDT

It is a hashtable lookup. There would have to be a lot of collision in the 
hashcodes for O(n)

Comment 33 Tod Creasey

2004-10-19 09:03:32 EDT

I have released the virtual support with some new test suites into HEAD.

Comment 34 Tod Creasey

2004-11-03 11:33:29 EST

Verified in 20041102

Comment 35 Genady Beryozkin

2004-11-25 14:17:24 EST

Tod - is it possible to patch Eclipse 3.0.1 with the new TableViewer code?
Does this feature depends on 3.1 functionality of other classes / plugins?

Comment 36 Tod Creasey

2004-11-25 14:47:07 EST

I would be reluctant to as I would like it to get more use first and have some 
feedback from SWT (who are doing this now). Usually we only patch a release 
with important fixes to defects rather than features.