[imp-dev] Suggestions for improving the scheduling of analyses


From: Stan Sutton <suttons@xxxxxxxxxx>
Date: Wed, 4 Jun 2008 13:22:59 -0400

Hi All,

I'd like to start improving the mechanism by which analyses are scheduled (and run) in IMP. I'm talking about the analysis scheduling/invocation mechanism that is currently part of the UniversalEditor, where "analyzers" (which might do anything) are invoked whenever the source text is (re)parsed.

Currently the analyses are invoked in UniversalEditor.ParserScheduler.notifyAstListeners(..). Each listener (which invokes an analyzer) has a required analysis level that indicates the level of analysis on which its associated analysis depends. Nominally, an analysis is invoked only if the level required by the analysis is less than the current level.

We recognize that this approach, while a step in the right direction, is significantly limited. As noted in comments in IModelListener, which defines the analysis levels:
// BROKEN!!!
// The following has no notion of the scope of analysis. E.g., providing a language
// service may require analysis within a much wider scope than a compilation unit
// (such as whole program analysis). Also, analyses don't really form a linear order.
There are other concerns as well: for example, a given analysis may depend on more than one type of analysis, and the costs and benefits of an analysis are not considered.

Regarding the related issue that analyses may form a partial order rather than a linear one, we could generalize the current mechanism in the following way:

- Instead of having a single level of analysis that has been attained, the scheduler can have a collection of analyses that have been performed.
- As each analysis completes successfully, a record of it is added to the scheduler's set of performed analyses.
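
A minimal sketch of such a record, in Java. Everything here is hypothetical: AnalysisKind and PerformedAnalyses are illustrative names, not existing IMP types, and the reset-on-reparse behavior is an assumption about how stale results would be handled.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical analysis kinds; illustrative only, not IMP's actual levels.
enum AnalysisKind { LEXICAL, SYNTACTIC, NAME, TYPE, CALL_GRAPH }

// Hypothetical scheduler-side record of the analyses completed so far.
class PerformedAnalyses {
    private final EnumSet<AnalysisKind> performed = EnumSet.noneOf(AnalysisKind.class);

    // Assumed to be called when the source is (re)parsed: earlier results are stale.
    void reset() {
        performed.clear();
    }

    // Only successful analyses are recorded.
    void record(AnalysisKind kind, boolean succeeded) {
        if (succeeded) {
            performed.add(kind);
        }
    }

    boolean hasPerformed(AnalysisKind kind) {
        return performed.contains(kind);
    }

    // Defensive copy, so callers cannot modify the scheduler's state.
    Set<AnalysisKind> snapshot() {
        return EnumSet.copyOf(performed);
    }
}
```

Because the record is a set rather than a single level, it can represent states that a linear scale cannot, such as "syntactic analysis and call-graph analysis done, but type analysis not done."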

Regarding the issues that an analysis may depend on multiple analyses of the same unit or on analyses of multiple units:

- In the most general case, an analysis may depend not on some analysis level but on an arbitrary predicate. For instance, in IModelListener, we could replace "AnalysisRequired getAnalysisRequired()" with "boolean evaluatePreconditions()" (evaluatePreconditions() might be based on some set of analyses required or on something else).
- As a compromise (or a step on the road to full generality), we could change "AnalysisRequired getAnalysisRequired()" to "Set<AnalysisRequired> getAnalysesRequired()".
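
The two options can coexist, as in the following sketch. It is not the current IModelListener: AnalysisListener and AnalysisKind are hypothetical names, the "performed" parameter on evaluatePreconditions is an added assumption (the signature suggested above takes no arguments), and default methods require Java 8 or later.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical analysis kinds; illustrative only.
enum AnalysisKind { LEXICAL, SYNTACTIC, NAME, TYPE, CALL_GRAPH }

// Hypothetical listener interface combining both proposals.
interface AnalysisListener {
    // Compromise form: the set of analyses this listener depends on.
    Set<AnalysisKind> getAnalysesRequired();

    // General form: an arbitrary predicate. By default it only checks that
    // every required analysis has been performed; a listener may override it
    // with any other condition.
    default boolean evaluatePreconditions(Set<AnalysisKind> performed) {
        return performed.containsAll(getAnalysesRequired());
    }
}
```

A listener that needs both syntactic and name analysis would return EnumSet.of(AnalysisKind.SYNTACTIC, AnalysisKind.NAME) from getAnalysesRequired(), and the scheduler would invoke it only once evaluatePreconditions(...) returns true for the current set of performed analyses.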

Regarding the issues of costs and benefits:

Simply knowing that you can run an analysis doesn't tell you much about whether it's appropriate to run it. For instance, suppose you have an AST and that enables you to run both a semantic analyzer and a spell-checker for comments. Probably you want to run the semantic analysis first (i.e., it has a higher benefit). But if the spell-checker can run in 1/100th of the time the semantic analyzer takes, then maybe you want to go ahead and run that just to get it out of the way. (I'm not suggesting what the right approach is, merely arguing that these considerations are relevant to deciding what you want to do.)

Further regarding the benefits, I think we can distinguish intrinsic versus extrinsic benefits. Intrinsic would be some level of benefit that you can assign to an analysis based solely on the value of its results, regardless of what other analyses may be available. Extrinsic benefits are those that depend on considerations beyond the analysis itself, such as what other analyses it might enable, the value of those analyses, etc. (which depends on the particular context).
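
One simple way to make the distinction concrete is sketched below. It makes a strong simplifying assumption that is not part of the proposal itself: the extrinsic benefit of an analysis is taken to be the sum of the intrinsic benefits of the analyses it directly enables. BenefitEstimator, the string names, and the integer scale are all hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper; one possible reading of "extrinsic benefit".
class BenefitEstimator {
    // intrinsic: analysis name -> benefit of its own results
    // enables:   analysis name -> analyses that can run once it has completed
    static int extrinsicBenefit(String analysis,
                                Map<String, Integer> intrinsic,
                                Map<String, List<String>> enables) {
        int total = 0;
        List<String> enabled = enables.getOrDefault(analysis, Collections.<String>emptyList());
        for (String name : enabled) {
            // Analyses with no declared intrinsic benefit contribute nothing.
            total += intrinsic.getOrDefault(name, 0);
        }
        return total;
    }
}
```

Under this reading, a parser with little intrinsic value of its own can still rank high because it enables valuable downstream analyses, while a comment spell-checker that enables nothing has only intrinsic benefit.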

Of course, there are a number of issues regarding the possible use of costs and benefits in analysis scheduling: they may be hard to compute precisely, they may be dynamic, their computation represents some additional cost (a kind of meta-analysis), they might be monitored (which would impose some additional cost), and the best way to use cost and benefit values in scheduling isn't clear (and might itself vary by context or in time).

Still, I think there are some simple steps we could take to begin to incorporate some of these elements into our analysis scheduling:

- Forget the dynamic stuff to start with. We don't do anything dynamic now. Perhaps we could put in some placeholders for dynamic evaluation of costs and benefits but not implement them in a complicated way.
- Adopt some simple scales for costs and benefits (kind of like bug severity in Bugzilla). Assume that we can derive some benefit if the person who specifies the cost and benefit levels gets those basically right most of the time.
- Allow users to adjust the costs and benefits for particular analyses via preferences.
- Devise a scheduling algorithm that is parameterizable in terms of the weight it puts on various factors, i.e., costs versus benefits (or costs versus intrinsic benefits versus extrinsic benefits).
- Allow users to adjust the weights of the different factors via preferences.
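
To illustrate the parameterizable-scheduler step, here is a rough sketch that orders runnable analyses by a weighted linear score. The linear formula is just one possible choice, and CandidateAnalysis, WeightedScheduler, and the 1-to-5 scales are hypothetical; in practice the weights would presumably come from preferences.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical description of an analysis whose preconditions are satisfied.
class CandidateAnalysis {
    final String name;
    final int cost;              // coarse static scale, e.g. 1 (cheap) to 5 (expensive)
    final int intrinsicBenefit;  // coarse static scale, e.g. 1 to 5
    final int extrinsicBenefit;  // e.g. value of the analyses this one enables

    CandidateAnalysis(String name, int cost, int intrinsicBenefit, int extrinsicBenefit) {
        this.name = name;
        this.cost = cost;
        this.intrinsicBenefit = intrinsicBenefit;
        this.extrinsicBenefit = extrinsicBenefit;
    }
}

// Hypothetical scheduler whose behavior is controlled by three weights.
class WeightedScheduler {
    private final double costWeight;
    private final double intrinsicWeight;
    private final double extrinsicWeight;

    WeightedScheduler(double costWeight, double intrinsicWeight, double extrinsicWeight) {
        this.costWeight = costWeight;
        this.intrinsicWeight = intrinsicWeight;
        this.extrinsicWeight = extrinsicWeight;
    }

    // Higher score means "run sooner": weighted benefits minus weighted cost.
    double score(CandidateAnalysis a) {
        return intrinsicWeight * a.intrinsicBenefit
             + extrinsicWeight * a.extrinsicBenefit
             - costWeight * a.cost;
    }

    // Returns a new list sorted by descending score; the input is not modified.
    List<CandidateAnalysis> order(List<CandidateAnalysis> runnable) {
        List<CandidateAnalysis> sorted = new ArrayList<>(runnable);
        sorted.sort((a, b) -> Double.compare(score(b), score(a)));
        return sorted;
    }
}
```

With a semantic analyzer at cost 5 and intrinsic benefit 5, and a spell-checker at cost 1 and intrinsic benefit 2, a low cost weight (say 0.1) schedules the semantic analyzer first, while a high cost weight (say 2.0) schedules the spell-checker first. Changing a preference is enough to switch between the two policies.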

If we can address these considerations, i.e., combine some more general treatment of analysis preconditions (possibly in terms of analyses required) and some concern for analysis costs and benefits, then I think we will gain another important element of customizability and will take a significant step toward making IMP IDEs "commercial strength."

Regards,

Stan

Stan Sutton, Ph. D.
IBM T. J. Watson Research Center
19 Skyline Drive, Hawthorne, NY 10532 USA
telephone: 1-914-784-7316, FAX: 1-914-784-7455, T/L 863
e-mail: suttons@xxxxxxxxxx, Stan Sutton/Watson/IBM@IBMUS

Follow-Ups:
- Re: [imp-dev] Suggestions for improving the scheduling of analyses
  - From: Robert M. Fuhrer
- Re: [imp-dev] Suggestions for improving the scheduling of analyses
  - From: Jurgen Vinju
