Bug 520440 - [pivot] Introduce TypedElement.isNonNull/isValid
Summary: [pivot] Introduce TypedElement.isNonNull/isValid
Status: NEW
Alias: None
Product: OCL
Classification: Modeling
Component: Core (show other bugs)
Version: unspecified   Edit
Hardware: PC Windows NT
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: OCL Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
: 303686 552782 (view as bug list)
Depends on: 520508 561146 574166 581471
Blocks: 507628 509616 509842 518573 519861 575412 578060 583242
  Show dependency tree
 
Reported: 2017-08-02 03:40 EDT by Ed Willink CLA
Modified: 2024-05-10 06:43 EDT (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ed Willink CLA 2017-08-02 03:40:55 EDT
Bug 518709 motivated introduction of the Invalidable<Nullable<T>> types to more accurately model type inflation (maybe-null, maybe-invalid) and avoid coding asynchronisms between TypedElement.type and TypedElement.isRequired. The different discovery times of the type and its inflation actually created greater problems, so Bug 518709 is WONTFIX. This bug pursues a better approach.

Underlying problem: partial collection/iteration evaluation is only possible when the residue of the evaluation is provably not-invalid. Therefore a compile-time analysis should determine and persist validity. Since invalid is the result of null-navigation, divide-by-zero or index-out-of-bounds, the analysis needs a limited symbolic capability to reason about true/null/zero/size(). The null reasoning is the same as required for not-null determination, so the FlowAnalysis can be enhanced from guard/implies logic to a partial evaluator and perhaps even lead into Common Subexpression Elimination.

The analysis needs to propagate known-values up the tree but only needs to persist isRequired/isValid. Difficulties with type/isRequired coding asynchronism should go away if the AST builder populates the type and the subsequent analysis infers isRequired/isValid.

Currently type/isRequired evolve. Initially they are determined from perhaps an explicit let-variable declaration and later refined from the initializer analysis. The evolution seems like a lazy coding practice that may even be a bug; if anything uses the unrefined value, its use may need refreshing after refinement; the unrefined value should not be assigned. Persisting the refined value in the AST has the unfortunate consequence that an AS2CS conversion will use the refined type inhibiting a precise CS2AS2CS round-trip.

For consistency and avoidance of double negatives, introduce maybeInvalid and invert isRequired as maybeNull; both with false XMI serialization defaults.
Comment 1 Ed Willink CLA 2017-08-02 13:15:26 EDT
(In reply to Ed Willink from comment #0)
> For consistency and avoidance of double negatives, introduce maybeInvalid
> and invert isRequired as maybeNull; both with false XMI serialization
> defaults.

isRequired/isMany are Eclipse OCL additions. UML has MultiplicityElement::isMultivalued. Renaming isRequired should avoid the ambiguity wrt Collection values; is the Collection non-null or the non-null content non-empty.
Comment 2 Ed Willink CLA 2017-08-03 09:58:13 EDT
Once we stop setting isRequired in the LeftToRight visitor and instead set it in the PostProcessor we hit a problem.

Given "let y : Integer[?] = .... in let x : Integer[1] = y in ..." to what extent is the wrong x multiplicity corrected. Previously the influences get merged. Now the explicit [1] gets lost.

a) declarations could be left as-is rather in the style of Operation (declared type)/OperationCallExp(actual/static type) so that Variable is declared, VariableExp is actual. This could resurrect round-tripping for Bug 520441. AS validation reports source level deficiencies.

b) Variable declarations could be maximally accurate/inferred. Validation precision is awkward.
Comment 3 Ed Willink CLA 2017-08-04 03:10:36 EDT
(In reply to Ed Willink from comment #2)
> Now the explicit [1] gets lost.

Changing Iterator/ResultVariable to the declared types fixes one test failure and we are left with one failure in a diagnostic that now report Set(Class[*|?]) rather than Set(Class). While this is trivial it hints at deep trouble; nested collection types and other non-null subtleties are less accurate in the Left2Right pass and so will require both type and maybeNull fix-up in the PostProcessor pass.

Simplifying the Left2Right pass seems unjustified.

Whether to make declarations just declared is a separate issue. Leave to Bug 520441.

Eventually the CS2AS model should provide single assignment throughout the AS. A further PostProcessor transformation can provide in-place refinement.
Comment 4 Ed Willink CLA 2017-08-11 16:03:45 EDT
SUSPENDED: branch ewillink/520440 has major version API breakage that it is not clear that we need yet. There is a postprocssor but it doesn't do very much apart from recomputing maybeNull consistently.

Todo, uncomment the testFlowAnalysis_SimpleIfNameGuard to force development and use of the postprocessor to propagate non-null better.

Todo non-zero, size()-relatibve integers

Todo refactor isRequired to at least maybeNull, maybeInvalid.
Comment 5 Ed Willink CLA 2020-02-27 11:00:07 EST
*** Bug 552782 has been marked as a duplicate of this bug. ***
Comment 6 Ed Willink CLA 2020-02-27 11:43:57 EST
(In reply to Ed Willink from comment #3)
> Simplifying the Left2Right pass seems unjustified.

A lazy post-Analysis like the FlowAnalysis seems appropriate, but whereas the FlowAnalysis has a Deducer that works down, surely we need a symbolic partial evaluator for the whole expression that propagates contraints such as if-condition-is-false down in an extended evaluation environment so that the symbolic result can propagate up.

(In reply to Ed Willink from comment #4)
> SUSPENDED: branch ewillink/520440 has major version API breakage

The breakage appears to be due to ease of prototyping; EvaluationVisitorExtension is folded into EvaluationVisitor. Surely a lazy post-Analysis could be API compatible - only adding functionality. Notionally just a stronger EnvironmentFactoryInternal.createFlowAnalysis. API-compatibly add EnvironmentFactoryExtension3.createSymbolicEvaluator.
Comment 7 Ed Willink CLA 2020-03-14 07:48:19 EDT
(In reply to Ed Willink from comment #6)
> surely we need a symbolic partial evaluator for the whole expression

This requires substantial duplication of the existing BasicEvaluationVisitor whose fuctionality is in many cases correct. Also duplication of JUnit tests. New API for Operation dispatch/evaluate.

If the existing evaluator which just returns Object in many cases, could return an UnknownValue or ConstrainedValue  much of the BasicEvaluationVisitor might be rescued. But not by derivation since that requires QVTr to rederive unpleasantly.

e.g. consider

NumericMinusOperation.evaluate(@Nullable Object left, @Nullable Object right) {
	RealValue leftNumeric = asRealValue(left);
	RealValue rightNumeric = asRealValue(right);
	return rightNumeric.commutatedNegate(leftNumeric);
}

the 'crash' in asRealValue needs to be tamed so that if symbolic evaluation is required a usage as "if x->size() - 4 = 0 then x->at(x->size() - 3) else null endif" must allow for the condition, the known result to back-deduce that x->size = 4 for the then condition so that the forward evaluation x->size() -3 is 1 and in-bounds for at().

Hm. This seems like a lot of extra functionality.

The back-deduction must be a new API to pass in the known result. Default functionality just gives don't-know.

The enhanced forward evaluation might survive substantially unchanged if there is a symbolic-RealValue. RealValue is an interface to support variant representations so a SymbolicRealValue could be compliant. Even commutatedNegate seems almost designed to allow both arguments to have an opportunity to be symbolic.
Comment 8 Ed Willink CLA 2020-03-26 05:12:26 EDT
(In reply to Ed Willink from comment #7)
> If the existing evaluator which just returns Object in many cases, could
> return an UnknownValue or ConstrainedValue  much of the
> BasicEvaluationVisitor might be rescued.

Unforrtunately many of the implementation such as OclAnyEqualOperation have a sensibly derived retirn type such as Boolean that improves CG code quality. However for symbolic evaluation we must return e.g. a SymbolicBooleanValue which cannot conform to the final Boolean type. Oops need to return Object.

Auto-gen modifed to allow a new INSTANCE2 to direct to an Object return without breaking API on the old INSTANCE that still directs at the dericed API.

INSTANCE2 may prove useful bot we don't need it. The derived return is on a narrow derived evaluate signature, which the evaluator has reached by redirecting from the outer signature that returns Object anyway. Implementations can therefore have the existin g narrow implementation for CG usage and the wide implementation for symbol generality.
Comment 9 Ed Willink CLA 2020-04-09 03:28:32 EDT
(In reply to Ed Willink from comment #8)
> However for symbolic evaluation we must return e.g. a
> SymbolicBooleanValue which cannot conform to the final Boolean type.

This proves to be pervasive; every ValueUtil.asBoolean() needs a rewrite to handle a symbolic Boolean. OCL3 could have a uniform API.
Comment 10 Ed Willink CLA 2020-04-11 01:14:55 EDT
(In reply to Ed Willink from comment #9)
> This proves to be pervasive; every ValueUtil.asBoolean() needs a rewrite to
> handle a symbolic Boolean. OCL3 could have a uniform API.

OCL3 may simplify the AS/CS moetamodel; it does not necessarily change the run-time policy although a major version change provides an opportunity for a clean up. Chnaging the current boxed/unboxed compromises should not be done lightly.

Nearly all of the 22 calls to ValueUtil.asBoolean(), 12 calls to ValueUtil.asIntegerValue()  and 24 calls to ValueUtil.asRealValue() indicate where the logic must be revised to accommodate SymbolicValue. The current API may be a nuisance but it shouldn't be a fundamental impediment to researching the functionality.
Comment 11 Ed Willink CLA 2020-04-11 03:52:09 EDT
(In reply to Ed Willink from comment #10)
> Chnaging the current boxed/unboxed compromises should not be done lightly.

IntegerValue is already a boxed and has minimal differece as a wrapper for int.

Similarly BooleanValue could wrap boolean with minimal cost wrt Boolean. We already value ValueUtil.TRUE/FALSE to provide @NonNull. In principle just change to the equivalent BooleanValue - but the API breakage would be major.

(Futre thoughts:
We are left with no-StringValue, no-ObjectValue, null/NullValue.

StringValue looks like undesirable bloat. One day perhaps a limited re-implementation of String as just a char[] + lazy-hash wrapper might be practical for internal use. 

ObjectValue is currently a very very undesirable bloat, but might come for free with custom slotted EObjects.

NullValue is currently run-time prohibited. Might actually be a good idea so that "null" is Java code gone wrong, NullValue is correct OCL evaluation. With slotted EObjects that 'null' could be NullValue from the outset.

Therefore could get to 100% Value.
)
Comment 12 Ed Willink CLA 2020-05-19 05:58:43 EDT
(In reply to Ed Willink from comment #0)
> For consistency and avoidance of double negatives, introduce maybeInvalid
> and invert isRequired as maybeNull; both with false XMI serialization
> defaults.

We need clear names for at least two different use cases:

a) local predicates e.g. isNull => this instanceof NullLiteralExp
b) symbolic deductions e.g. isNull => deep local isNull

After replacing the FlowAnalysis by the SymbolEvaluationVisitor there is a stackoverflow as isNull usage is ambiguous.

At least stop the symbolic analysis making matters worse by using symbolic{Is/MayBe}{Invalid/Null} as deep boolean predicates and perhaps symbolic{Invalidity/Nullity} if a three-valued return is really needed. Behind the scenes the symbolic analysis is 2-valued e.g. mayBeNull since the third isNull case is accessible as symbolValue instanceof NullLiteral. isInvalid is a bit harder since SymbolicValue inherits InvalidValue - maybe we can avoid this.
Comment 13 Ed Willink CLA 2020-05-20 04:25:28 EDT
(In reply to Ed Willink from comment #12)
> After replacing the FlowAnalysis by the SymbolEvaluationVisitor there is a
> stackoverflow as isNull usage is ambiguous.

This is a consequence of and2/implies2/or2 not being pure short-circuit. There is a bogus peak for a statically determinate second term and it is this that goesrecursive.
Comment 14 Ed Willink CLA 2021-04-18 04:16:32 EDT
Resuming with a SymbolicAnalysis after Xtext distractions.

Validating all ExpressionInOCL gives 70 test failures since tests were designed for test rather than universal values. Therefore disable for test, but cannot disable for UsageTests where GenModel is not readily overrideable. Therefore some bad test models need fining. e.g. Bug 370824. Test expressions can be revisited once the SymbolicAnalysis hnadles deductions/contradictions.
Comment 15 Ed Willink CLA 2021-04-28 03:58:44 EDT
*** Bug 303686 has been marked as a duplicate of this bug. ***
Comment 16 Ed Willink CLA 2021-06-12 03:41:48 EDT
Bug 574166 hghlights some corruption corrolaries of the symbolic validation reifying embedded library operation bodies.
Comment 17 Ed Willink CLA 2021-11-18 14:40:38 EST
Getting there. Now has a mayBeInvalidReason() for use by TypedElementImpl.validateUnconditionallyValid(..). But how invalid is invalid?

For normal well-intentioned, strict OCL, the code should be invalid free, so we just provide a diagnosis where crash hazards exist.

But for ill-intentioned code where invalid is required, how to we help but also not irritate the user? Trivially, it would seem that if the user explicitly uses the invalid literal the defined syntax element should not be a warning. This logically extends to all consequencea of the explicit usage. This gets complicated/irregular and unhelpful.

Perhaps we avoid warnings for invalid 'caught' by oclIsInvalid() and perhaps for a declared 'invalidating' return. Without extra syntax sugar the "if x.oclIsInvalid() then invalid else x endif" idiom should silence warnings for the computation of x.

The primary diagnosis of invalid is a failure of validateUnconditionallyValid. However we may hit other problems too such as an inferred OclInvalid type. The latter should be avoided by explicitly declaring a super type, possibly OclAny for the invalid value.
Comment 18 Ed Willink CLA 2021-11-27 02:47:07 EST
(In reply to Ed Willink from comment #17)
> Getting there.

null/validity and empty hyprttheses in place.

But also need type and more detailed content.

And as indicated by Bug 574431 comment#1 we also need precision hypotheses to authorize the conversion of Sequence{1,9,7} as Sequence(Integer) to Sequence(ecore::EInt).
Comment 19 Ed Willink CLA 2023-02-02 03:59:47 EST
(In reply to Ed Willink from comment #18)
> we also need precision hypotheses

Bug 581471 identifies that overflow should never occur during evaluation; just during save serialization. Detection of possible overflows is no longer something that must be detected for provable correctness, just something that can be advised.