Bug 542909 - Livelock rewording while building
Summary: Livelock rewording while building
Status: NEW
Alias: None
Product: TMF
Classification: Modeling
Component: Xtext
Version: unspecified
Hardware: PC Windows 10
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
Reported: 2018-12-19 07:10 EST by Ed Willink CLA
Modified: 2018-12-20 06:53 EST


Attachments
Thread dump of livelock (58.78 KB, text/plain)
2018-12-19 07:10 EST, Ed Willink CLA

Description Ed Willink CLA 2018-12-19 07:10:38 EST
Created attachment 276960
Thread dump of livelock

4.10RC1. 

While a build was in progress, which takes a very long time these days, I reworded a commit in a project dependent on the building project. Eclipse stalls.

Progress View shows building stuck at about 60%, and Commit (Waiting).

Thread dump attached.

Attempting to restart Eclipse stalls with the Progress Dialog active.

Cancelling cancels the reword, but not the build. Have to resort to the OS Task Manager.
Comment 1 Thomas Wolf CLA 2018-12-19 14:20:40 EST
Probably related to changes either in Platform Runtime relating to the workspace locking, or maybe the Xtext builder has a problem.

That EGit's "Reword" operation was waiting is 100% correct. It wasn't a UI lockout either since you were able to cancel the reword.

So why report this against EGit? Report it against Xtext or against Platform Runtime.
Comment 2 Ed Willink CLA 2018-12-19 16:05:54 EST
(In reply to Thomas Wolf from comment #1)
> So why report this against EGit? Report it against Xtext or against Platform
> Runtime.

Since it was an EGIT action that started the UI lockout.

The Eclipse recommendation is to report against a best guess and let the recipients use their probably better insight to reassign appropriately.
Comment 3 Thomas Wolf CLA 2018-12-20 02:27:47 EST
Moved to Xtext. I remember now a couple of months ago at a customer site I saw stuck Xtext builds rather frequently.
Comment 4 Christian Dietrich CLA 2018-12-20 02:31:47 EST
????
Comment 5 Christian Dietrich CLA 2018-12-20 02:32:40 EST
How was that thread dump created? I don't see what is being waited for.
Comment 6 Thomas Wolf CLA 2018-12-20 03:48:35 EST
(In reply to Christian Dietrich from comment #4)
> ????

Exactly :-) I don't know what Ed was seeing. I can only give anecdotal evidence of what I saw at that customer: the build was _not_ blocked on some lock but was looping and constantly re-building the same file. (Not the _same_ file in different builds; different builds would get stuck on different files.) At the time I didn't pay much attention as this was a workbench with a rather special setup and some 50+ highly interdependent complex xtext DSLs, so I naturally assumed it was due to that special setup. It also didn't happen always; maybe once a week.
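
The difference between a build that is blocked on a lock and one that is looping can be checked from inside a JVM with the standard java.lang.management API. The following is only a minimal sketch, not something used in this bug: the class name ThreadStateSummary and its methods are made up for illustration. A true lock cycle is reported by the JVM's deadlock detection, whereas a looping thread just shows up as RUNNABLE and is not flagged.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper: summarises what a thread dump says about blocking. */
final class ThreadStateSummary {

    /** Counts the live threads per state, taken from an in-process thread dump. */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: do not collect locked monitors or synchronizers.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** True only if the JVM itself reports a cycle of threads waiting on each other. */
    static boolean hasDeadlock() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()          // monitors and ownable synchronizers
                : mx.findMonitorDeadlockedThreads();  // object monitors only
        return ids != null && ids.length > 0;
    }
}
```

If hasDeadlock() returns false while the build thread stays RUNNABLE across several dumps taken a few seconds apart, that points to a loop rather than a lock.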
Comment 7 Thomas Wolf CLA 2018-12-20 03:54:49 EST
Possibly related: https://github.com/eclipse/xtext-eclipse/issues/348
Comment 8 Ed Willink CLA 2018-12-20 04:30:57 EST
(I also see regular Xtext builder issues, some of which appear to correspond to long open bugs.)

The thread dump was taken by starting, then using, jvisualvm while I was stuck with a wait cursor on the GIT reword dialog. Unusually this succeeded; usually jvisualvm hangs while trying to connect.

At the time of the dump Eclipse had become unusable, forcing termination by the OS. Perhaps the build was hung all by itself. My suspicion is that the livelock was triggered by the extra workload caused by GIT's gratuitous use of Rebase Interactive to thrash the workspace for commands such as reword that change nothing in the user's workspace.

(In reply to Thomas Wolf from comment #7)
> Possibly related: https://github.com/eclipse/xtext-eclipse/issues/348

In my experience there is a massive failure to aggregate actions.

a) GIT fails to lock out builds during a multi-project check-out; each project check-out starts a new build. Each step of a rebase starts a new build.

b) Numerous tools fail to aggregate marker changes, so they incur a workspace operation per attribute-per-marker rather than one per total change.

c) Worse, numerous tools contribute non-changes as changes, triggering build cycles for tools such as Xtext that depend on 'everything'.

Consequently every minor change causes a full rebuild that often doesn't cancel before the next minor change hits.
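
To make points (b) and (c) concrete, here is a minimal single-threaded sketch in plain Java. It is a toy model, not the Eclipse resources API: ToyWorkspace, setAttribute and runBatch are hypothetical names. It shows the two behaviours being asked for: attribute writes that change nothing are dropped, and writes made inside a batch produce one build trigger when the batch ends instead of one per write.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Consumer;

/** Hypothetical stand-in for a workspace; not thread-safe, illustration only. */
final class ToyWorkspace {
    private final Map<String, String> attributes = new HashMap<>();
    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> buildTrigger;
    private int batchDepth = 0;

    ToyWorkspace(Consumer<List<String>> buildTrigger) {
        this.buildTrigger = buildTrigger;
    }

    /** Sets one attribute; outside a batch, a real change triggers a build at once. */
    void setAttribute(String key, String value) {
        String old = attributes.put(key, value);
        if (Objects.equals(old, value)) {
            return; // a non-change is not reported, so it cannot start a build
        }
        pending.add(key);
        if (batchDepth == 0) {
            flush();
        }
    }

    /** Runs the action as one batch: a single trigger when the outermost batch ends. */
    void runBatch(Runnable action) {
        batchDepth++;
        try {
            action.run();
        } finally {
            batchDepth--;
            if (batchDepth == 0) {
                flush();
            }
        }
    }

    /** Hands all accumulated changes to the build trigger as one delta. */
    private void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<String> delta = new ArrayList<>(pending);
        pending.clear();
        buildTrigger.accept(delta);
    }
}
```

Setting three marker attributes one by one fires the trigger three times in this model; wrapping the same three calls in runBatch fires it once.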

I often wonder whether I should go back to Eclipse 3.x just to see whether my memory of how much better it was is completely wrong. But if it's GIT/Xtext that are the cause, then 3.x would be irrelevant.
Comment 9 Thomas Wolf CLA 2018-12-20 04:49:20 EST
Also possibly related https://github.com/eclipse/xtext-eclipse/issues/648
Comment 10 Thomas Wolf CLA 2018-12-20 05:08:46 EST
(In reply to Ed Willink from comment #8)
> a) GIT fails to lock out builds during a multi-project check-out;

What is a "multi-project check-out"? EGit does run checkouts as an IWorkspaceRunnable with a scheduling rule encompassing all Eclipse projects from the repositories on which a checkout is being done, and with IWorkspace.AVOID_UPDATE set.

> each project check-out starts a new build.

Again, what is a "project check-out"? What is a "project"? An Eclipse project or a git repository? With EGit, you check out commits from the repository, which affects all projects from that repository.

> Each step of a rebase starts a new build.

You should get builds only when the rebase stops, either on a conflict or when it's done. Rebasing also runs inside such an IWorkspaceRunnable.
Comment 11 Ed Willink CLA 2018-12-20 05:50:10 EST
(In reply to Thomas Wolf from comment #10)
> (In reply to Ed Willink from comment #8)
> > a) GIT fails to lock out builds during a multi-project check-out;
> 
> What is a "multi-project check-out"? 

Perhaps I should write multi-plugin. 

> EGit does run checkouts as an
> IWorkspaceRunnable with a scheduling rule encompassing all Eclipse projects
> from the repositories on which a checkout is being done, and with
> IWorkspace.AVOID_UPDATE set.

OK, then something circumvents. From observing in the progress view, I see builds starting long before a checkout completes.

(Surely the scheduling rule should encompass all dependent projects too. e.g. if I check out EMF, UML2 builds should be inhibited until EMF is stable.)
> 
> > each project check-out starts a new build.
> 
> Again, what is a "project check-out"? What is a "project"? An Eclipse
> project or a git repository? With EGit, you check out commits from the
> repository, which affects all projects from that repository.

A plugin/bundle as created by New->Project, so a "project". When I check out from the GIT history to move from one multi-bundle state to another, I expect all files to be accurate on disk and in any local caches before any build starts. I see builds starting long before they are stable, so that Xtext in particular thrashes and, in some cases for significant changes, fails because a build occurred 'before' a dependency was refreshed. It is therefore sometimes necessary, particularly with Xtend, to refresh/clean after a GIT checkout to make spurious errors go away. Just occasionally it is even necessary to restart.

> > Each step of a rebase starts a new build.
> 
> You should get builds only when the rebase stops, either on a conflict or
> when it's done. Rebasing also runs inside such an IWorkspaceRunnable.

That would be wonderful, but clearly it is not happening for me. Perhaps Xtext is observing a file change and triggering a build while the rebase is in progress.
Comment 12 Thomas Wolf CLA 2018-12-20 06:53:14 EST
(In reply to Ed Willink from comment #11)
> (Surely the scheduling rule should encompass all dependent projects too.
> e.g. if I check out EMF, UML2 builds should be inhibited until EMF is
> stable.)
Not EGit, though. That's the builder's responsibility. Only the builder knows what it considers "dependent projects". So a UML2 build should run with a scheduling rule encompassing the EMF projects. An EGit checkout of the EMF bundles then gets a scheduling rule conflict and waits until the build is done. And if an EGit checkout is in progress, the build will wait until the checkout is done.
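
The rule-conflict argument above can be sketched in plain Java. This is a simplified model and not the Eclipse jobs API: ProjectSetRule is a hypothetical class that treats a rule as a flat set of project names, whereas real scheduling rules implement ISchedulingRule and are typically hierarchical resources. It only illustrates why the builder's rule has to include the projects it depends on.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of a scheduling rule that covers a set of projects. */
final class ProjectSetRule {
    private final Set<String> projects;

    ProjectSetRule(String... projects) {
        this.projects = new HashSet<>(Arrays.asList(projects));
    }

    /** Two rules conflict when they share at least one project. */
    boolean isConflicting(ProjectSetRule other) {
        return !Collections.disjoint(projects, other.projects);
    }
}
```

In this model a UML2 build with the rule ProjectSetRule("UML2", "EMF") conflicts with a checkout holding ProjectSetRule("EMF"), so one has to wait for the other. A build that only claims ProjectSetRule("UML2") does not conflict, and would be free to run while the EMF checkout is still in progress.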