Bug 430136 - [1.8][model] JavaElement.JEM_LAMBDA_METHOD should not use '*' and '>' as memento delimiter
Summary: [1.8][model] JavaElement.JEM_LAMBDA_METHOD should not use '*' and '>' as meme...
Status: RESOLVED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 4.4   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: BETA J8   Edit
Assignee: Jay Arthanareeswaran CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-03-11 15:14 EDT by Markus Keller CLA
Modified: 2014-03-13 16:26 EDT (History)
1 user (show)

See Also:
srikanth_sankaran: review+


Attachments
Proposed fix (5.80 KB, patch)
2014-03-12 06:02 EDT, Jay Arthanareeswaran CLA
no flags Details | Diff
Fix to keep existing element handles stable (5.35 KB, patch)
2014-03-12 14:45 EDT, Markus Keller CLA
no flags Details | Diff
Fix to escape new delimiters (7.73 KB, patch)
2014-03-13 09:49 EDT, Jay Arthanareeswaran CLA
no flags Details | Diff
Updated patch (7.76 KB, patch)
2014-03-13 12:41 EDT, Jay Arthanareeswaran CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Markus Keller CLA 2014-03-11 15:14:52 EDT
JavaElement.JEM_LAMBDA_METHOD should not use '*' and '>' as memento delimiter.

I just saw that the fix for bug 425134 had to make adjustments in MementoTests. That's not good. Existing JavaElement handles should stay as stable as possible, see Javadoc of IJavaElement#getHandleIdentifier(). '*' and '>' were already used by existing mementos, and such stored mementos cannot be parsed any more now.

I see that most "safe" ASCII characters are already used as token delimiters or in token data, but these look free:

&')`

Forbidden character groups:
- identifier characters
- characters used in Signature (at least for parameter type signatures)
- , is IMO risky as well (often used by clients to separate data)

We should also have a test for lambda type and lambda method mementos. I don't think these will be stored often (since they are inherently unstable), but they can be used locally (e.g. to put an element on the clipboard, or for hyperlinks in Javadoc hovers).
Comment 1 Srikanth Sankaran CLA 2014-03-11 15:25:47 EDT
Jay, thanks for following up.
Comment 2 Jay Arthanareeswaran CLA 2014-03-12 06:02:02 EDT
Created attachment 240796 [details]
Proposed fix

Patch with new tests to test memento and creating handles back from the memento.

Srikanth, please review.
Comment 3 Srikanth Sankaran CLA 2014-03-12 06:25:18 EDT
Looks good, thanks Jay.
Comment 4 Jay Arthanareeswaran CLA 2014-03-12 12:30:41 EDT
Thanks Srikanth. But looks like ) is taken too. We allow ) as part of project names :(
Comment 5 Jay Arthanareeswaran CLA 2014-03-12 12:39:48 EDT
(In reply to Jayaprakash Arthanareeswaran from comment #4)
> Thanks Srikanth. But looks like ) is taken too. We allow ) as part of
> project names :(

I guess we will alter the tests and move on. Srikanth, what do you think?
Comment 6 Srikanth Sankaran CLA 2014-03-12 12:41:59 EDT
Which test are you talking about ? Basically as Markus pointed out from
the javadocs, the memento can span sessions and we have limited room in
what we can do.

It is enough if we identity ONE kosher delimiter - we can alter the code so
the new delimiter is really a delimiter sequence -
Comment 7 Jay Arthanareeswaran CLA 2014-03-12 12:53:39 EDT
Only the tests needed to be adjusted with right escapes. I have pushed the fix here:

http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?h=BETA_JAVA8&id=a80fa7c6cb262efaa08135896e90e4ea5ac09d8b
Comment 8 Srikanth Sankaran CLA 2014-03-12 13:13:03 EDT
(In reply to Jayaprakash Arthanareeswaran from comment #7)
> Only the tests needed to be adjusted with right escapes. I have pushed the
> fix here:
> 
> http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/
> ?h=BETA_JAVA8&id=a80fa7c6cb262efaa08135896e90e4ea5ac09d8b

For the record, first two are reverting the changes made earlier which
prompted the opening of this CR - So good.

The last one - what happens if a stored memento contains "=P \\(abc) \\~" ?

Looking at this bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=47815,
I don't know that we (can) have a satisfactory solution impervious to breakage,
given escaping happens only for known delimiters at *generation* time.

Avoiding clashes with what we know are likely encodings due to our own
internal schemes is in our hands but for project names and such encoding 
special characters - things are not in our control at this point and we 
will have to settle for what we have now.
Comment 9 Markus Keller CLA 2014-03-12 14:31:43 EDT
The new memento tests for lambdas should also go into MementoTests, so that all memento tests are at the same place. The sysouts in ResolveTests18#test430136() should be removed.

The situation is still not good, since comment 7 breaks bug 47815. Try creating a Java working set and put an IType from a project called "(hello)" in there. I'm pretty sure this won't survive a migration from 4.3 to BETA_JAVA8. (IIRC, projects are stored as resources in Java working sets, so it may work for the project alone).


I think the only way out is to revert JavaElement#escapeMementoName() to the 4.3 state, and then make MementoTokenizer#nextToken() smarter: It should only consider token delimiters that can actually show up as delimiters after the already seen delimiters.

Maybe we can get away with only escaping the new token delimiters after we've seen a type root (CU or class file), since only JEM_JAVAPROJECT, JEM_PACKAGEFRAGMENTROOT, JEM_COMPILATIONUNIT, and
JEM_CLASSFILE accept non-Java-identifier characters as names.
Comment 10 Srikanth Sankaran CLA 2014-03-12 14:43:19 EDT
(In reply to Markus Keller from comment #9)
> The new memento tests for lambdas should also go into MementoTests, so that
> all memento tests are at the same place. The sysouts in
> ResolveTests18#test430136() should be removed.

This has been removed already here:

http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?h=BETA_JAVA8&id=4ba46e68413e894bf4c2e4db03e6bebaa1c62c18

> The situation is still not good, since comment 7 breaks bug 47815. Try
> creating a Java working set and put an IType from a project called "(hello)"
> in there. I'm pretty sure this won't survive a migration from 4.3 to
> BETA_JAVA8. (IIRC, projects are stored as resources in Java working sets, so
> it may work for the project alone).

Markus, why is this an issue only with the new memento demiliters ? At every
stage a new delimiter got introduced that could have been used in other
contexts the same situation would have prevailed ?

> I think the only way out is to revert JavaElement#escapeMementoName() to the
> 4.3 state, and then make MementoTokenizer#nextToken() smarter: It should
> only consider token delimiters that can actually show up as delimiters after
> the already seen delimiters.

This is too much work.

A simpler solution is to treat one of the existing delimiters as a start
of an delimiter sequence - if it is not immediately followed by another 
delimiter, it defaults to old meaning vs new meaning if a sequence is seen.
Do you agree that scheme would work ?
Comment 11 Markus Keller CLA 2014-03-12 14:45:38 EDT
Created attachment 240827 [details]
Fix to keep existing element handles stable
Comment 12 Srikanth Sankaran CLA 2014-03-12 14:46:56 EDT
(In reply to Markus Keller from comment #9)
> The new memento tests for lambdas should also go into MementoTests, so that
> all memento tests are at the same place. 

While this is an ideal goal - we discussed this and we don't want to enforce
it in this case - the requirements for the test call for various hierarchy
elements that are not visible readily - JDT/Core normal commit rules call
for all tests to be run anyways.
Comment 13 Srikanth Sankaran CLA 2014-03-12 14:47:38 EDT
(In reply to Markus Keller from comment #11)
> Created attachment 240827 [details]
> Fix to keep existing element handles stable

would the proposal in comment#10 be simpler ?
Comment 14 Jay Arthanareeswaran CLA 2014-03-12 14:52:09 EDT
(In reply to Markus Keller from comment #9)
> The situation is still not good, since comment 7 breaks bug 47815. Try
> creating a Java working set and put an IType from a project called "(hello)"
> in there. I'm pretty sure this won't survive a migration from 4.3 to
> BETA_JAVA8. (IIRC, projects are stored as resources in Java working sets, so
> it may work for the project alone).

I can't reproduce this. I am able to add types from a project named "(Test[])" to a java working set. I also tried out the copy+paste testcase from bug 47815 without any problem. But I agree with you on the migration part.
Comment 15 Markus Keller CLA 2014-03-12 15:15:55 EDT
(In reply to Srikanth Sankaran from comment #10)
> Markus, why is this an issue only with the new memento demiliters ? At every
> stage a new delimiter got introduced that could have been used in other
> contexts the same situation would have prevailed ?

The last new delimiter was '}', added in 2007 via bug 79112. That one was also unsafe, but I guess project names contain ')' much more frequently than '}'.

All other handle format changes were done before bug 79112, in complete ignorance of this problem.

> A simpler solution is to treat one of the existing delimiters as a start
> of an delimiter sequence - if it is not immediately followed by another 
> delimiter, it defaults to old meaning vs new meaning if a sequence is seen.
> Do you agree that scheme would work ?

You mean e.g. '=' is considered an escape for new delimiters, so that JEM_LAMBDA_EXPRESSION would be written as '=)'? Yes, that could work and could be an easier solution than comment 11.

A project with name '(hi=)' would then still have handle '=\(hi\=)' because '=' is escaped as usual and ')' has no special meaning since it doesn't follow an unescaped '='.

(I won't spend time to implement/test this.)
Comment 16 Srikanth Sankaran CLA 2014-03-12 15:23:04 EDT
(In reply to Markus Keller from comment #15)

> You mean e.g. '=' is considered an escape for new delimiters, so that
> JEM_LAMBDA_EXPRESSION would be written as '=)'? Yes, that could work and
> could be an easier solution than comment 11.

Jay, could you prototype this please, Thanks in advance.
Comment 17 Srikanth Sankaran CLA 2014-03-13 00:58:18 EDT
(In reply to Markus Keller from comment #15)

> A project with name '(hi=)' would then still have handle '=\(hi\=)' because
> '=' is escaped as usual and ')' has no special meaning since it doesn't
> follow an unescaped '='.
> 
> (I won't spend time to implement/test this.)

I also wouldn't recommend it.
Comment 18 Jay Arthanareeswaran CLA 2014-03-13 09:49:24 EDT
Created attachment 240862 [details]
Fix to escape new delimiters

Fix as discussed. Even though we don't seem to allow " as part of many element names, just to be safe, I am treating that delimiter as well.

Have run the relevant tests and looks fine. But will run all tests and report back.
Comment 19 Jay Arthanareeswaran CLA 2014-03-13 12:41:24 EDT
Created attachment 240870 [details]
Updated patch

Updated patch after a pull. Nothing's changed since last.
Comment 20 Markus Keller CLA 2014-03-13 12:43:49 EDT
Please explain the problem and the solution in the code. The story is so hard to understand that this needs to be documented for readers of the code.

"appendLambdaDelimiter" and "JEM_LAMBDA_ESCAPE" are not good names. This is not about lambdas, but about escaping all future delimiters. E.g. use appendEscapedDelimiter and JEM_DELIMITER_ESCAPE.

MementoTests#testProjectMemento2() needs to be reverted to this:

public void testProjectMemento2() {
	IJavaProject project = getJavaProject("P (abc) ~");
	assertMemento(
		"=P \\(abc\\) \\~",
		project);
}

You probably forgot to remove the new token delimiters from JavaElement#escapeMementoName(StringBuffer, String)

Add a comment to that method, so that the next one sees that this list must not be extended again.
Comment 21 Srikanth Sankaran CLA 2014-03-13 13:02:37 EDT
These cases directly under the switch in MementoTokenizer.nextToken()
should not stay ? 

                        case JavaElement.JEM_LAMBDA_EXPRESSION:
				return LAMBDA_EXPRESSION;
			case JavaElement.JEM_LAMBDA_METHOD:
				return LAMBDA_METHOD;
			case JavaElement.JEM_STRING:
				return STRING;

Likewise these cases in the next switch in the same method should not stay :

                                case JavaElement.JEM_LAMBDA_EXPRESSION:
				case JavaElement.JEM_LAMBDA_METHOD:	
				case JavaElement.JEM_STRING:

Otherwise patch looks good.
Comment 22 Markus Keller CLA 2014-03-13 13:16:56 EDT
(In reply to Markus Keller from comment #20)
> MementoTests#testProjectMemento2() needs to be reverted to this:
> 
> public void testProjectMemento2() {
> 	IJavaProject project = getJavaProject("P (abc) ~");
> 	assertMemento(
> 		"=P \\(abc\\) \\~",
> 		project);
> }

Sorry, copied wrong side. The correct line is the old one with unescaped ')':

		"=P \\(abc) \\~",

To protect us from future fallout, please add a testProjectMemento4() that has a project name with all old JEM_* characters that are possible in a project name plus &')`,
Comment 23 Srikanth Sankaran CLA 2014-03-13 13:43:51 EDT
Let us first finalize this and release this before working on https://bugs.eclipse.org/bugs/show_bug.cgi?id=430307
Comment 24 Jay Arthanareeswaran CLA 2014-03-13 16:26:16 EDT
Thanks Markus and Srikanth. Fixed with changes here:

http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?h=BETA_JAVA8&id=8c2daa60afb7b30e300741be54521542b6ad58f3

> To protect us from future fallout, please add a testProjectMemento4() that
> has a project name with all old JEM_* characters that are possible in a
> project name plus &')`,

This will have to wait. Will try to squeeze in as part of the other bugs in this area, perhaps 430307.