Bug 307098 - [Build] IncrementalProjectBuilder#getDelta should be useful with relaxed scheduling rules
Summary: [Build] IncrementalProjectBuilder#getDelta should be useful with relaxed sche...
Status: CLOSED WONTFIX
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources (show other bugs)
Version: 3.6   Edit
Hardware: PC All
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: Platform-Resources-Inbox CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords: helpwanted
Depends on: 198591 307282 307391
Blocks: 311205 289986
  Show dependency tree
 
Reported: 2010-03-25 12:09 EDT by James Blackburn CLA
Modified: 2019-11-27 10:24 EST (History)
29 users (show)

See Also:


Attachments
test 1 (22.86 KB, patch)
2010-03-29 12:22 EDT, James Blackburn CLA
no flags Details | Diff
patch 1 - Request for comment (21.26 KB, patch)
2010-03-29 12:47 EDT, James Blackburn CLA
no flags Details | Diff
patch 1 rebased to HEAD (20.33 KB, patch)
2010-04-22 05:24 EDT, Martin Oberhuber CLA
no flags Details | Diff
tests rebased to HEAD (8.07 KB, patch)
2010-04-23 06:20 EDT, James Blackburn CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description James Blackburn CLA 2010-03-25 12:09:02 EDT
+++ This bug was initially created as a clone of Bug #198591 +++

As per bug 198591 comment 40.  IncrementalProjectBuilder#getDelta can't be trusted with relaxed scheduling rules.  Any resource changes made while the build was taking place (and some builds can take a very long time) will not be seen by the next build when it does #getDelta.

There are a couple possible solutions:
  1) use a different workspace tree, or track the changes made by other operations separately (likely hard and unreliable).
  2) Redefine #getDelta, for reduced-sched-rule builders only, to be the delta since the *start* of the previous build.

Option 2 should be fairly straightforward and still provide a useful result.  Builders can distinguish between source and derived files. The builder only cares that it rebuilds all source files that have changed since the start of the last build -- if it ends up unnecessarily rebuilding one extra source file, that's a lesser price that forgetting to rebuild a file changed as part of the last build...
Comment 1 Martin Oberhuber CLA 2010-03-25 13:12:03 EDT
Hang on a little.

Why do you need correct deltas in the first place? Why isn't it acceptable to have EITHER correct deltas OR relaxed scheduling rules? In my mind, if you need correct deltas just don't relax the scheduling rules. I believe that we even made the Javadoc spec of Builder#getSchedulingRule() accordingly.
Comment 2 James Blackburn CLA 2010-03-25 13:16:24 EDT
(In reply to comment #1)
> Why do you need correct deltas in the first place?

In CDT ManagedBuild the delta is used to calculate:
  - Internal Builder: which resources need compiling
  - Makefile Builder: whether Makfiles need regenerating

For external Makfiles you're right, we don't care.  But for managed projects, which still take a non-trivial amount of time to build, we'd like deltas so that the build machinery builds the right stuff.

CDT's internal builder has got pretty close to incremental building; with a bit of love we could use both relaxed scheduling rules + trustworthy deltas to allow switching AUTO_BUILD on.
Comment 3 Martin Oberhuber CLA 2010-03-25 13:27:39 EDT
Fine! So just don't relax the scheduling rule for managed build.

I'm inclined to mark this WONTFIX. Or am I missing something?
Comment 4 Martin Oberhuber CLA 2010-03-25 13:29:02 EDT
(In reply to comment #2)
> with a bit of love we could use both relaxed scheduling rules + trustworthy
> deltas to allow switching AUTO_BUILD on.

Or, mark this as an enhancement request. But IMO it's not a bug. For now, relaxed scheduling rules work as per the spec.
Comment 5 James Blackburn CLA 2010-03-25 13:32:44 EDT
(In reply to comment #3)
> Fine! So just don't relax the scheduling rule for managed build.
> 
> I'm inclined to mark this WONTFIX. Or am I missing something?

But that doesn't satisfy the constraints outline in the second paragraph of comment 2.  Fundamentally C compilers (+ other tools) take a non-trivial time to run.  We don't want resources locked while they're running, so for a first class experience we need relaxed scheduling rules and ideally deltas so our build system can work correctly.

Does that make sense?
Comment 6 James Blackburn CLA 2010-03-25 13:33:01 EDT
It is marked as an enhancement.
Comment 7 John Arthorne CLA 2010-03-25 13:56:52 EDT
I think you're headed into difficult territory here. Say the delta returned the changes since the start of the last build, and your builder is able to figure out what parts of the delta were caused by itself. The next problem is that changes made concurrently with your builder won't cause another autobuild to run, so the user will end up in a state where there are unbuilt changes that autobuild is not picking up. 

Your builder would also need to handle the fact that the delta might not match the current project state, due to changes made in other threads after the call to getDelta(). So for example the delta might say foo.c was added, but when you try to read foo.c it has already been deleted. I have a hard time seeing how you can make a reliable incremental/auto builder under such conditions.
Comment 8 James Blackburn CLA 2010-03-25 14:11:34 EDT
Good point John.  

The problems you've outlined exists now using relaxed scheduling rules. It's just been pushed to the user who must now ensure the build-ery is consistent before a build.

What we're trying to do isn't conceptually hard: we're just transforming all sources along the tool-chain until we end up with a final Artifact.  The Java guys can do this incrementally without blocking users.  Even if we could get autobuild working, it might be too heavy-weight for users, but a good start would be allowing file editing while the managed build was running.

We have projects using CDT managedbuild now. The projects are quite small and modular, with clean build taking (just) 10mins.  In an ideal world we'd be able to use both Managedbuild (which we are) and have the performance that makefile builds now have with the relaxed rules.

In CDT we could fake up something today using resource change listeners, but if getting a sensible delta from core.resources isn't too hard that would seem the way to go.
Comment 9 James Blackburn CLA 2010-03-25 14:21:03 EDT
(In reply to comment #7)
> So for example the delta might say foo.c was added, but when you
> try to read foo.c it has already been deleted. I have a hard time seeing how
> you can make a reliable incremental/auto builder under such conditions.

BTW, AFAICS this is the major(/only?) pain point.  It's only adding and deleting resources that cause structural changes in the Makefiles which require them to be regenerated.  I'm pretty sure you're right:  if the user deletes a file after the build has started, that build will fail.  What's important is that on the next build, the makfiles are regenerated and don't still contain foo.c.

We certainly can't guarantee a perfect build everytime. But the assumption is that adding and removing resources is a sufficiently rare event, and it'll be even rarer that such resources are added / removed during the build that this may be a non-issue for users, compared with the pain of always having the workspace locked.
Comment 10 Martin Oberhuber CLA 2010-03-25 19:45:11 EDT
(In reply to comment #6)
> It is marked as an enhancement.

Then it shouldn't block other bugs, should it?
Comment 11 James Blackburn CLA 2010-03-25 19:52:06 EDT
Good point, yes.  It does block good CDT integration, but it's been this way for years so its still an enhancement. Cloning bugs with depends / blocks seems to do bad things :)

As an aside I've spent the day working on a patch (which I believe is clean) and addresses the flurry of bugs I filed today.  Just writing some tests for all these issues and will submit it all for you guys for feedback and review.
Comment 12 Martin Oberhuber CLA 2010-03-25 23:51:38 EDT
Well that's good news, then! - I didn't want to be overly religious with the "blocks" thing, surely one enhancement can block other enhancements to come. I was just confused when I saw this one "blocking" other issues which I had thought were already completed, such as bug 296800 and bug 305858. The "blocks" gave me the impression that you thought our solution to these items were incomplete or broken, which I don't think it is.
Comment 13 James Blackburn CLA 2010-03-29 12:22:47 EDT
Created attachment 163301 [details]
test 1

Test for the recent relaxed scheduling rules bugs filed.

Non-test addition changes:

builders testsuite:

1) DeltaVerifierBuilder#build(...) does:
		if (deltaWasEmpty) {
			//full build or empty delta
			verifier.reset();
the result is that subsequent #isDeltaValid returns true, even if the test was expecting changes! Fixing this doesn't break any tests (and correctly makes my delta test fail on HEAD).

2) TestBuilder: add hook class BuilderRuleCallback which can be hooked into by tests for getRule & build.

RelaxedSchedRuleBuilderTest added with:

- testBasicRelaxedSchedulingRules() - tests a simple null sched rule builder runs with a null rule on the project. Passes.

- testRelaxedDelta() - tests that, for builders with relaxed scheduling rule, #getDelta() contains changes made since the start of the last build

- testTwoBuildersSequentially() - run two builds sequentially, the second build should be blocked while the first is running (rather than returning OK_STATUS).

- testTwoBuildersRunInOneBuild() - tests that null and non-null scheduling rules work together.  Moreover tests that each builder runs with just the scheduling rule it requested, so null + Project doesn't become Workspace.
Comment 14 James Blackburn CLA 2010-03-29 12:47:45 EDT
Created attachment 163302 [details]
patch 1 - Request for comment

This is a patch which aims to fix the issues filed, I'd appreciate any and all comment!

The way this is intended to work is this:

1) A new buildLock has been added to buildManager. This lock ensures only one builder runs at any given time, and subsequent builds are queued behind the first. This lock, unlike the workspace lock, is held for the entire duration of the build.

Methods #prepareBuild(int trigger, monitor) and #endBuild() added to buildManager. They should be called to acquire the build lock before the PRE_BUILD notification, and released at the end of the build.

#prepareBuild must be called with the workspace lock held, and the WorkspaceRoot scheduling rule (as is currently required for PRE_BUILD notification). If it is unable to immediately acquire the build lock, it releases the workspace lock and yields the scheduling rule until it either
 - can acquire the build lock
or
 - is the blocked build is cancelled

2) The ISchedulingRules required for the build can now be acquired in #basicBuild before the builder is called. This means we only need to hold the relevant rules for the given builder. (The patch doesn't yet include the relaxing change to AutoBuild & Workspace).

Note this means that the build scheduling rule isn't held for the entire run of the build. The result is that, if some builders in the workspace have relaxed sched rules, resources may be modified by other jobs in the system.  I believe this is OK as getDelta is stored per builder and is now well-defined in both cases.

3) The delta returned by #getDelta is:
  - The delta since end of last build, if the builder uses the WorkspaceRoot rule
  - The delta since beginning of last build if the builder uses a relaxed scheduling rule.

This should make AUTO_BUILD useful with relaxed sched rules: if there haven't been any user changes since the last build, the builder can return without modifying resources.

All core.resources tests continue to pass, although there are a couple runtime issues: bug 307282 and bug 307391 which would block this.


The changes themselves are pretty localised, and I hope improve the usefulness of the builders API with relaxed scheduling rules. 

Do let me know what you think!
Comment 15 Martin Oberhuber CLA 2010-04-20 10:01:18 EDT
I've had a first look at the test and the patch, and I see that they include / require the patch from bug 306822, which is adding API to allow builders know context when they are asked for the scheduling rule. It doesn't look like that API is strictly required though - is it?

I'm very interested in making the Builders more robust and versatile, but I'm having bit of a hard time understanding all subtleties of this combined patch - and with M7 next week, it's very late in the game. 

James, do you see a chance separating the work into more digestable pieces? Please don't move forward splitting up just yet, just let me know what you think. 

The biggest issues with relaxed rules that I have seen in practical use so far was that multiple builds running in parallel resulted in odd results, and that we've broken client code which tried to wait for a build to finish with odd mechanisms (such as waiting for the Workspace rule to become available again, which is naturally bound to fail with relaxed rules). Your patch seems to address that by always queuing builds, so that part of the fix would be most interesting... could that be extracted into something smaller?

Another thought (which I know contradicts the previous statement to some extent) is that I'm wondering whether it's a good idea adding an implementation now which ensures that only one builder can run at any given time. I do see your "//TODO relax" comment in AutoBuildJob, but I'm still slightly concerned that if people start relying now on the Framework always doing sequential builds, it might do us a disservice in case we ever want to move towards parallel builds in the future. What do you think?

Other than that, on a superficial review I only see that 
- AutoBuildJob, Project, Workspace are missing Copyright comment notice
rest looks good.
Comment 16 James Blackburn CLA 2010-04-20 10:26:39 EDT
(In reply to comment #15)
> James, do you see a chance separating the work into more digestable pieces?
> Please don't move forward splitting up just yet, just let me know what you
> think. 

Yep, that's possible.  I agree this is late in the day, I was actually hoping to get a proper path and review for early in 3.7. 

It's not yet in my product, due to the runtime issues, I'd definitely like to get it stress tested there first.  There are also improvements to the patch I've got here. In particular:
  - Make acquiring the buildLock more efficient - this can probably be down outside the workspace scheduling rule (at least outside the Project#internalBuild grab of the scheduling rule)
  - Actually respect the order the builders wait to start build. Currently subsequent builds aren't queued, rather a random one is run.

At the moment I'm focusing on bugs CDT's CommonBuilder which are hurting us badly...

I'll respond to the other points separately.
Comment 17 James Blackburn CLA 2010-04-20 16:00:19 EDT
(In reply to comment #15)
> It doesn't look like that
> API is strictly required though - is it?

You're right. This patch includes the 'kitchen sink'. The patch contained everything I needed for fixing up and testing out  CDT...

> Please don't move forward splitting up just yet, just let me know what you
> think. 

Could do, yes, I think it depends on the timescale. There are indeed a few separate fixes in this patch, unfortunately as they're to the same file, it does produce an ordering problem in applying...  I have no problem splitting this up if it's worthwhile.

> and that
> we've broken client code which tried to wait for a build to finish with odd
> mechanisms (such as waiting for the Workspace rule to become available again,
> which is naturally bound to fail with relaxed rules).

I think clients which wait by trying to acquire the WS lock just need to be fixed. 

> Another thought (which I know contradicts the previous statement to some
> extent) is that I'm wondering whether it's a good idea adding an implementation
>...
> builds, it might do us a disservice in case we ever want to move towards
> parallel builds in the future. What do you think?

The lock is just an implementation detail to allow the BuildManager to continue to work correctly and provide a means of queuing the subsequent build requests. There's nothing stopping further improvements to allow it to schedule multiple builders concurrently, or anything else. It would need a fair bit of work though, and the API would no doubt need to be more expressive.
Comment 18 Martin Oberhuber CLA 2010-04-22 05:24:31 EDT
Created attachment 165717 [details]
patch 1 rebased to HEAD

This is the original patch rebased to HEAD, minus the committed API from bug 306822, plus Copyright comment updates. No new IP in here, so I don't obsolete patch 1 for now.
Comment 19 James Blackburn CLA 2010-04-23 06:20:35 EDT
Created attachment 165891 [details]
tests rebased to HEAD
Comment 20 Sebastian Zarnekow CLA 2012-12-04 12:15:48 EST
Is this something that can happend with scheduling rule == currently built project?
Comment 21 James Blackburn CLA 2012-12-04 17:39:51 EST
(In reply to comment #20)
> Is this something that can happend with scheduling rule == currently built
> project?

The answer is: 'it depends'.  If you only care about resources in the currently built project, then it will work as expected (as your thread is the only one which is allowed to change resources in the workspace while the build is in progress).

If you're trying to get changes from other, referenced projects, then you may indeed miss changes that were made while the build was in progress.
Comment 22 Lars Vogel CLA 2019-11-27 07:34:55 EST
This bug hasn't had any activity in quite some time. Maybe the problem got
resolved, was a duplicate of something else, or became less pressing for some
reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it.
The information can be, for example, that the problem still occurs, that you
still want the feature, that more information is needed, or that the bug is
(for whatever reason) no longer relevant.

If the bug is still relevant, please remove the stalebug whiteboard tag.