Bug 531048 - [GTK3] Large problems drawing ownerdraw tables
Summary: [GTK3] Large problems drawing ownerdraw tables
Status: REOPENED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: SWT (show other bugs)
Version: 4.8   Edit
Hardware: PC Linux
: P2 major (vote)
Target Milestone: ---   Edit
Assignee: Platform-SWT-Inbox CLA
QA Contact:
URL:
Whiteboard: stalebug RHT
Keywords: triaged
: 509503 (view as bug list)
Depends on:
Blocks: 531051 FixSwtSnippetsOnGtk3
  Show dependency tree
 
Reported: 2018-02-12 09:11 EST by Thomas Singer CLA
Modified: 2022-01-04 08:56 EST (History)
9 users (show)

See Also:


Attachments
Sample code to reproduce (3.26 KB, text/plain)
2018-02-12 09:12 EST, Thomas Singer CLA
no flags Details
screenshot on Fedora 27 (13.41 KB, image/png)
2018-02-12 09:12 EST, Thomas Singer CLA
no flags Details
screenshot on Fedora 27 moving mouse upwards over the table (15.24 KB, image/png)
2018-02-12 09:13 EST, Thomas Singer CLA
no flags Details
screenshot on Fedora 27 moving mouse downwards over the table (15.24 KB, image/png)
2018-02-12 09:13 EST, Thomas Singer CLA
no flags Details
screenshot on Fedora 27 with SWT_GTK3=0 (10.11 KB, image/png)
2018-02-12 09:15 EST, Thomas Singer CLA
no flags Details
Gtk3 (broken EraseItem) vs Gtk2 (working) (237.41 KB, image/png)
2018-02-22 14:53 EST, Leo Ufimtsev CLA
no flags Details
Bug snippet reproducing the issue in Tree (4.33 KB, text/x-java)
2019-03-13 15:51 EDT, Eric Williams CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Singer CLA 2018-02-12 09:11:24 EST
Please launch the attached snippet - I'm using Fedora 27. Parts of the cells are not drawn at all, other paints seem to overlap.
Comment 1 Thomas Singer CLA 2018-02-12 09:12:16 EST
Created attachment 272645 [details]
Sample code to reproduce
Comment 2 Thomas Singer CLA 2018-02-12 09:12:53 EST
Created attachment 272647 [details]
screenshot on Fedora 27
Comment 3 Thomas Singer CLA 2018-02-12 09:13:28 EST
Created attachment 272648 [details]
screenshot on Fedora 27 moving mouse upwards over the table
Comment 4 Thomas Singer CLA 2018-02-12 09:13:50 EST
Created attachment 272649 [details]
screenshot on Fedora 27 moving mouse downwards over the table
Comment 5 Thomas Singer CLA 2018-02-12 09:15:19 EST
Created attachment 272650 [details]
screenshot on Fedora 27 with SWT_GTK3=0
Comment 6 Leo Ufimtsev CLA 2018-02-12 09:27:21 EST
Thank you for bug submission and the snippet.

Reproduced with Gtk3.22/Fedora 27 on X11 & Wayland.
Doesn't reproduce on Gtk2.

We should investigate. I increased priority.
Comment 7 Leo Ufimtsev CLA 2018-02-15 13:28:50 EST
Issue occurs with all gtk3 versions. Gtk3.04 - 3.22.

Investigating...
Comment 8 Leo Ufimtsev CLA 2018-02-22 14:53:25 EST
Created attachment 272838 [details]
Gtk3 (broken EraseItem) vs Gtk2 (working)

Snippet273 shows an issue with SWT.EraseItem, which looks to be similar to this issue.
I.e, on gtk3 the background isn't drawn properly with respect to temperatures.

~Will continue to investigate.
Comment 9 Leo Ufimtsev CLA 2018-02-26 16:04:16 EST
Commenting out:
gtk_tree_view_column_set_visible(...false)
fixed the issue.

Also seems to fix this guy:
399522 – [GTK3] frequent get_column_number critical error spews
https://bugs.eclipse.org/bugs/show_bug.cgi?id=399522

I need to figure out why we're hiding/showing columns and why it breaks custom drawings.
Comment 10 Leo Ufimtsev CLA 2018-02-27 17:18:06 EST
This seems to be a regression after bugfix:
7752af94649ff3781401576ef108513585b255fb 
(commited in 2005, patch not in gerrit due to age).

As part of these two:
Bug 73812 – TableColumn.getWidth() returns 0 on gtk
Bug 51079 – Resizing of TableColumns on Linux displays extra horizontal toolbar

To be more precise:

Table.java:createItem(..)
if (columnCount == 0) {
..
++GTK.gtk_tree_view_column_set_visible (column.handle, false);
--GTK.gtk_tree_view_column_set_fixed_width(column.handle, 1);
}

There is a strange condition, when the first column is created, it's set to be invisible. (Kinda wonder why?).
Invisible columns seems to break custom painting on Gtk3.
setting column width to be at least 1 makes the problem go away (or not hiding columns).

~Will continue to investigate.
Comment 11 Leo Ufimtsev CLA 2018-03-07 17:23:02 EST
(In reply to Thomas Singer from comment #0)
> Please launch the attached snippet - I'm using Fedora 27. Parts of the cells
> are not drawn at all, other paints seem to overlap.

Btw, Thomas, do you see this behavior anywhere in the IDE?
Comment 12 Thomas Singer CLA 2018-03-08 04:31:44 EST
(In reply to Leo Ufimtsev from comment #11)
> (In reply to Thomas Singer from comment #0)
> > Please launch the attached snippet - I'm using Fedora 27. Parts of the cells
> > are not drawn at all, other paints seem to overlap.
> 
> Btw, Thomas, do you see this behavior anywhere in the IDE?

If you mean with "IDE" Eclipse, then no, I'm not using Eclipse. If I remember correctly, I've noticed this bug when trying to reproduce a different one with a snippet.
Comment 13 Leo Ufimtsev CLA 2018-03-09 17:26:36 EST
Ok, I think I figured out the cause:

If TreeColumn is invisible while Gtk3 is building it's cache (between 5-10th time g_main_context_iteration(..) is ran on my system), then a lot of Tree-initialization code is skipped because it's not ran for invisible controls which causes glitchy behavior by the drawing renders. 

The idea in Gtk3 is not to do work for invisible items, which is breaking some SWT code because it was built when there was no cache and invisible controls were fully rendered/computed even when they were not visible.

Fixing this is not quite straight forward. We either have to delay hiding (gtkColumnSetVisible(false)) of table/tree  until after caching is done, or not hide them.
We don't have a way to tell when caching is done. Hmm. Need to investigate further.
Comment 14 Thomas Singer CLA 2018-03-10 02:08:36 EST
Is there a way to tell GTK3 to not cache?
Comment 15 Leo Ufimtsev CLA 2018-03-12 09:16:02 EDT
(In reply to Thomas Singer from comment #14)
> Is there a way to tell GTK3 to not cache?

We asked gtk folks, they said no. Afaik there is also no signal like "finished caching" that we can use to finish our initialization.

Hmmmm. Will think about it.
Comment 16 Leo Ufimtsev CLA 2018-03-12 15:12:26 EDT
I setup a fedora 17 with Gtk3.4 (caching was introduced in Gtk3.6). The issue indeed doesn't occur.

I'll try to find the offending commit and ask gtk3 developers if we can force flush cache, turn it off or know when it's finished.
Comment 17 Leo Ufimtsev CLA 2018-03-19 15:43:17 EDT
The fix involved waiting for Gtk3 to compute it's cache before making additional tableColumns invisible.

In general, it seems that we should avoid keeping widgets invisible until gtk has build it's size cache / lazy initialized it's structures. The solution seems to entail piggy backing onto events that are only fired once this is done. (in this case draw signal). 
In general, if something seems to 'fix-itself' when you re-size the widget, chances are it's due to this caching business.

Marking as resolved, thank you for the bug submission.
Comment 18 Eric Williams CLA 2018-03-21 10:02:44 EDT
*** Bug 509503 has been marked as a duplicate of this bug. ***
Comment 19 Thomas Wolf CLA 2018-03-26 09:21:01 EDT
I suspect the "fix" for this issue ( https://git.eclipse.org/r/#/c/119588/ ) might be the cause for bug 532831. EGit's "Rebase Interactive" view is missing its fixed-width columns.
Comment 20 Leo Ufimtsev CLA 2018-03-27 10:17:51 EDT
(In reply to Thomas Wolf from comment #19)
> I suspect the "fix" for this issue ( https://git.eclipse.org/r/#/c/119588/ )
> might be the cause for bug 532831. EGit's "Rebase Interactive" view is
> missing its fixed-width columns.

~Investigating.
Comment 21 Eclipse Genie CLA 2018-04-02 11:08:19 EDT
New Gerrit change created: https://git.eclipse.org/r/120582
Comment 22 Leo Ufimtsev CLA 2018-04-02 11:13:08 EDT
(In reply to Eclipse Genie from comment #21)
> New Gerrit change created: https://git.eclipse.org/r/120582

I'll temporarily revert the patch until I've had time to come up with a better patch.

Atm I need to investigate a webkit crash:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=494158

Will come back to this issue later.
Comment 24 Leo Ufimtsev CLA 2018-06-29 16:14:13 EDT
Didn't manage to get around to improving the fix on this issue. I'm transferring into consulting, returning back to main queue.
Comment 25 Eclipse Genie CLA 2019-02-26 11:54:42 EST
New Gerrit change created: https://git.eclipse.org/r/137655
Comment 27 Eric Williams CLA 2019-03-11 14:45:13 EDT
(In reply to Eclipse Genie from comment #26)
> Gerrit change https://git.eclipse.org/r/137655 was merged to [master].
> Commit:
> http://git.eclipse.org/c/platform/eclipse.platform.swt.git/commit/
> ?id=092d68d1c71497fe32622e65742008ee50a1bac2

I believe we finally have a fix for this one...in master now.
Comment 28 Andrey Loskutov CLA 2019-03-13 05:00:57 EDT
(In reply to Eric Williams from comment #27)
> (In reply to Eclipse Genie from comment #26)
> > Gerrit change https://git.eclipse.org/r/137655 was merged to [master].
> > Commit:
> > http://git.eclipse.org/c/platform/eclipse.platform.swt.git/commit/
> > ?id=092d68d1c71497fe32622e65742008ee50a1bac2
> 
> I believe we finally have a fix for this one...in master now.

Unfortunately this causes a regression in Platform UI tests:

https://download.eclipse.org/eclipse/downloads/drops4/I20190312-1800/testresults/html/org.eclipse.ui.tests_ep412I-unit-cen64-gtk3-java8_linux.gtk.x86_64_8.0.html

expected:<1> but was:<2> 

junit.framework.AssertionFailedError: expected:<1> but was:<2>
at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.Assert.failNotEquals(Assert.java:329)
at junit.framework.Assert.assertEquals(Assert.java:78)
at junit.framework.Assert.assertEquals(Assert.java:234)
at junit.framework.Assert.assertEquals(Assert.java:241)
at junit.framework.TestCase.assertEquals(TestCase.java:409)
at org.eclipse.jface.tests.viewers.SimpleVirtualLazyTreeViewerTest.processEventsUntilElementUpdated(SimpleVirtualLazyTreeViewerTest.java:233)
at org.eclipse.jface.tests.viewers.SimpleVirtualLazyTreeViewerTest.testRemoveAt(SimpleVirtualLazyTreeViewerTest.java:212)
Comment 29 Eric Williams CLA 2019-03-13 08:56:25 EDT
(In reply to Andrey Loskutov from comment #28)
> (In reply to Eric Williams from comment #27)
> > (In reply to Eclipse Genie from comment #26)
> > > Gerrit change https://git.eclipse.org/r/137655 was merged to [master].
> > > Commit:
> > > http://git.eclipse.org/c/platform/eclipse.platform.swt.git/commit/
> > > ?id=092d68d1c71497fe32622e65742008ee50a1bac2
> > 
> > I believe we finally have a fix for this one...in master now.
> 
> Unfortunately this causes a regression in Platform UI tests:
> 
> https://download.eclipse.org/eclipse/downloads/drops4/I20190312-1800/
> testresults/html/org.eclipse.ui.tests_ep412I-unit-cen64-gtk3-java8_linux.gtk.
> x86_64_8.0.html
> 
> expected:<1> but was:<2> 
> 
> junit.framework.AssertionFailedError: expected:<1> but was:<2>
> at junit.framework.Assert.fail(Assert.java:57)
> at junit.framework.Assert.failNotEquals(Assert.java:329)
> at junit.framework.Assert.assertEquals(Assert.java:78)
> at junit.framework.Assert.assertEquals(Assert.java:234)
> at junit.framework.Assert.assertEquals(Assert.java:241)
> at junit.framework.TestCase.assertEquals(TestCase.java:409)
> at
> org.eclipse.jface.tests.viewers.SimpleVirtualLazyTreeViewerTest.
> processEventsUntilElementUpdated(SimpleVirtualLazyTreeViewerTest.java:233)
> at
> org.eclipse.jface.tests.viewers.SimpleVirtualLazyTreeViewerTest.
> testRemoveAt(SimpleVirtualLazyTreeViewerTest.java:212)

Yes I am aware and working on it. I think there is something wrong with this test, as it fails every time (without my patch) on Wayland. This is one those test cases where it drains the event queue in a loop and checks if some callback count has been reached. It will likely require some adjustment.
Comment 30 Andrey Loskutov CLA 2019-03-13 09:05:11 EDT
(In reply to Eric Williams from comment #29)
> Yes I am aware and working on it. I think there is something wrong with this
> test, as it fails every time (without my patch) on Wayland. 

On X11 it works fine.

> This is one
> those test cases where it drains the event queue in a loop and checks if
> some callback count has been reached. It will likely require some adjustment.

It receives one call more now, which could be a potential performance issue.
Comment 31 Eric Williams CLA 2019-03-13 09:10:19 EDT
(In reply to Andrey Loskutov from comment #30)
> It receives one call more now, which could be a potential performance issue.

I haven't determined yet if it's one extra call, or if there is some call coming too late. I'll keep investigating.
Comment 32 Eric Williams CLA 2019-03-13 15:51:00 EDT
Created attachment 277856 [details]
Bug snippet reproducing the issue in Tree

Found out what the issue is. Some background:

Calling gtk_style_context_invalidate() clears some internal GTK caches. This SWT bug was introduced in GTK sometime between GTK3.16 and 3.18, when they changed the way GtkStyleContext invalidation changes. The fix is call gtk_style_context_invalidate() to force GTK to clear its cache upon the tree model changing.

In SimpleVirtualLazyTreeViewerTest.testRemoveAt(), we spin the event loop to drain any remaining events, then do a check on the number of calls to update an element in the Tree. If the number of calls is 1, we break out from the event queue draining and do an assert to double check the number of update element calls.

The problem:

GTK cache clears happen only on GTK idle, and by this I mean only when there are no events to process. This becomes an issue because once the event queue draining stops in the tests, GTK invalidates the cache. The cache invalidation can lead to an additional setData() event, which increments the updated element call count in the test suite. This means the condition to break out of the event draining won't be tripped, and the display loop will spin until the timeout period is reached. At that time, the event draining stops and an assert is called. The assert will fail since the update element call count will be 2.

The solution:

I believe a good solution is to update the test suite to accommodate an extra setData event. Instead of checking if the call count == 1, we can do call count <= 2. These changes would affect Linux only. IMO checking callback counts and spinning the event loop is already inaccurate enough, that checking for a single extra event due to a delayed cache clear is reasonable.

Performance:

I was a little concerned about performance as well so I gave YourKit a spin and profiled the bug snippet attached (both a Tree and Table version). The total performance difference between pre-patch and post-patch is negligible -- less than 1% or 2ms. 

I also kept an eye on the display loop runtime and the Tree/Table.setItemCount() runtime. Tree.setItemCount() performance dereased by ~6% -- an extra ~50ms on average. Table.setItemCount() performance improved by ~5% as a result of this patch. 

Display loop performance also improved in Tree by ~25%, and decreased in Table by ~10%. Of course in absolute terms the largest time difference (for all of these methods) is no more than 60ms, so that is important to consider when looking at the percentage difference.


I'll prepare a Gerrit with the test suite changes.
Comment 33 Eclipse Genie CLA 2019-03-13 15:59:12 EDT
New Gerrit change created: https://git.eclipse.org/r/138700
Comment 35 Eric Williams CLA 2019-03-13 17:02:16 EDT
(In reply to Eclipse Genie from comment #34)
> Gerrit change https://git.eclipse.org/r/138700 was merged to [master].
> Commit:
> http://git.eclipse.org/c/platform/eclipse.platform.ui.git/commit/
> ?id=9ad1aa6c99ec92d5e68888068adc85196ef7d21b

Thanks for the merge Andrey!
Comment 36 Eric Williams CLA 2019-04-09 09:43:45 EDT
Verified in I20190409-0600.
Comment 37 Jens Lideström CLA 2019-09-17 16:21:05 EDT
(In reply to Eric Williams from comment #32)
> In SimpleVirtualLazyTreeViewerTest.testRemoveAt(), we spin the event loop to
> drain any remaining events, then do a check on the number of calls to update
> an element in the Tree. If the number of calls is 1, we break out from the
> event queue draining and do an assert to double check the number of update
> element calls.

This test intermittently fails in Jenkins builds for the GTK platform. For example in these builds [1]. It fails at the following statement:

if (eventLoopAdjustmentBug531048) {
	assertTrue(updateElementCallCount <= 2);
} ...

Could this be a related problem to what Eric is describing? Is someone that was involved in this bug well suited to investigate the current problems with SimpleVirtualLazyTreeViewerTest?

PS: Also some other tests have been failing, such as these:

SimpleVirtualLazyTreeViewerTest#testNoSelectionRefresh
VirtualLazyTreeViewerTest#testExpandToLevel

[1]: 
https://download.eclipse.org/eclipse/downloads/drops4/I20190912-1800/testresults/html/org.eclipse.jface.tests_ep414I-unit-cen64-gtk3-java8_linux.gtk.x86_64_8.0.html#org.eclipse.jface.tests
https://download.eclipse.org/eclipse/downloads/drops4/I20190914-1800/testresults/html/org.eclipse.jface.tests_ep414I-unit-cen64-gtk3-java8_linux.gtk.x86_64_8.0.html#org.eclipse.jface.tests
https://download.eclipse.org/eclipse/downloads/drops4/I20190915-1800/testresults/html/org.eclipse.jface.tests_ep414I-unit-cen64-gtk3-java11_linux.gtk.x86_64_11.html#org.eclipse.jface.tests
https://download.eclipse.org/eclipse/downloads/drops4/I20190916-1045/testresults/html/org.eclipse.jface.tests_ep413I-unit-cen64-gtk3-java8_linux.gtk.x86_64_8.0.html#org.eclipse.jface.tests
Comment 38 Jens Lideström CLA 2019-09-17 16:22:28 EDT
Someone opened up this old ticket as a reaction to the failing test cases: Bug 412569
Comment 39 Eric Williams CLA 2019-09-17 16:26:26 EDT
(In reply to Jens Lideström from comment #37)
> (In reply to Eric Williams from comment #32)
> > In SimpleVirtualLazyTreeViewerTest.testRemoveAt(), we spin the event loop to
> > drain any remaining events, then do a check on the number of calls to update
> > an element in the Tree. If the number of calls is 1, we break out from the
> > event queue draining and do an assert to double check the number of update
> > element calls.
> 
> This test intermittently fails in Jenkins builds for the GTK platform. For
> example in these builds [1]. It fails at the following statement:
> 
> if (eventLoopAdjustmentBug531048) {
> 	assertTrue(updateElementCallCount <= 2);
> } ...
> 
> Could this be a related problem to what Eric is describing? Is someone that
> was involved in this bug well suited to investigate the current problems
> with SimpleVirtualLazyTreeViewerTest?
> 
> PS: Also some other tests have been failing, such as these:
> 
> SimpleVirtualLazyTreeViewerTest#testNoSelectionRefresh
> VirtualLazyTreeViewerTest#testExpandToLevel
> 
> [1]: 
> https://download.eclipse.org/eclipse/downloads/drops4/I20190912-1800/
> testresults/html/org.eclipse.jface.tests_ep414I-unit-cen64-gtk3-java8_linux.
> gtk.x86_64_8.0.html#org.eclipse.jface.tests
> https://download.eclipse.org/eclipse/downloads/drops4/I20190914-1800/
> testresults/html/org.eclipse.jface.tests_ep414I-unit-cen64-gtk3-java8_linux.
> gtk.x86_64_8.0.html#org.eclipse.jface.tests
> https://download.eclipse.org/eclipse/downloads/drops4/I20190915-1800/
> testresults/html/org.eclipse.jface.tests_ep414I-unit-cen64-gtk3-java11_linux.
> gtk.x86_64_11.html#org.eclipse.jface.tests
> https://download.eclipse.org/eclipse/downloads/drops4/I20190916-1045/
> testresults/html/org.eclipse.jface.tests_ep413I-unit-cen64-gtk3-java8_linux.
> gtk.x86_64_8.0.html#org.eclipse.jface.tests


Most of these Tree/Table based tests are waiting for some number of events to be processed, and then a check is done. Usually on GTK the solution is to increase the timeout/event queue time, so that the events have more time to process. GTK likes to cache things internally so that's why sometimes this is needed.

I'm not a JFace expert, but looking at the tests that are failing I'd say a similar adjustment needs to be in those cases. Of course it's not easy to nail down since the failures don't always happen locally, or even reliably on the build machines.
Comment 40 Eric Williams CLA 2019-11-13 16:12:50 EST
Reopened because of the regression caused in bug 552830.
Comment 41 Eric Williams CLA 2019-12-16 14:37:17 EST
I didn't manage to work on this issue, tossing it back into the pool.
Comment 42 Eclipse Genie CLA 2021-12-23 15:03:55 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug.

If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 43 Thomas Singer CLA 2022-01-03 04:53:04 EST
Please reopen. This is still reproducible.
Comment 44 Eric Williams CLA 2022-01-04 08:56:33 EST
(In reply to Thomas Singer from comment #43)
> Please reopen. This is still reproducible.

Reopening.