497931 – Unresolved symbol due to failed ambiguity resolution

Bug 497931 - Unresolved symbol due to failed ambiguity resolution

Summary: Unresolved symbol due to failed ambiguity resolution

Status:	NEW

Alias:	None

Product:	CDT
Classification:	Tools
Component:	cdt-parser (show other bugs)
Version:	9.0.0
Hardware:	All All

Importance:	P3 normal (vote)
Target Milestone:	---
Assignee:	Project Inbox
QA Contact:	Jonah Graham

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2016-07-14 14:58 EDT by Sergey Prigogin
Modified:	2020-09-04 15:23 EDT (History)
CC List:	2 users (show)

See Also:	Gerrit Change Git Commit 519975

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Sergey Prigogin

2016-07-14 14:58:44 EDT

template <int I, typename T>
int waldo(T v) {
  return v < I ? 1 : 1 + waldo<I, T>(v / I); // problem on waldo
}

Comment 1 Sergey Prigogin

2016-07-14 18:39:30 EDT

The return statement gets parsed as:

IASTReturnStatement
  ICPPASTFunctionCallExpression
    IASTIdExpression
      ICPPASTTemplateId: v<I ? 1 : 1 + waldo<I, T>
  IASTBinaryExpression: v / I

Comment 2 Sergey Prigogin

2016-07-14 20:30:07 EDT

It's highly surprising but it seems that the parser doesn't even consider a possibility that "v < I" could be a binary expression.

Comment 3 Sergey Prigogin

2016-07-14 22:46:43 EDT

It looks like the optimization in TemplateIdStrategy.java line 123 is invalid. The code gets parsed correctly when the optimization is disabled. Here is what the comment for the optimization says:

"Optimization (bug 363609): if during the previous alternative, a name was parsed as a template-id with multiple template arguments, it's not going to be parsed differently in a subsequent alternative, so keep it as a template-id.
Of course, this optimization is only sound if the previous alternative was parsed successfully (bug 445177)."

I'm out of ideas on possible changes that would not break testNestedTemplateAmbiguity_363609().

Comment 4 Nathan Ridge

2016-07-15 00:51:16 EDT

I had a conversation with Richard Smith last year where he explained to me how clang disambiguates between the two uses of '<'. Basically, it's able to perform name lookup on the name preceding the '<', and see if it resolves to the name of a template or not.

I haven't spent much time thinking about how easy this would be to implement in CDT, but it's a possible avenue to explore. It certainly seems conceptually simpler and more efficient than what we're doing.

Comment 5 Eclipse Genie

2016-07-15 14:18:27 EDT

New Gerrit change created: https://git.eclipse.org/r/77411

Comment 6 Eclipse Genie

2016-07-15 18:29:55 EDT

Gerrit change https://git.eclipse.org/r/77411 was merged to [master].
Commit: http://git.eclipse.org/c/cdt/org.eclipse.cdt.git/commit/?id=2d4d6b3e1cbd857563e748b5acd7b643e35ef448

Comment 7 Sergey Prigogin

2016-07-15 18:30:40 EDT

(In reply to Nathan Ridge from comment #4)

Agree.

Comment 8 Nathan Ridge

2017-11-29 23:54:30 EST

(In reply to Nathan Ridge from comment #4)
> I had a conversation with Richard Smith last year where he explained to me
> how clang disambiguates between the two uses of '<'. Basically, it's able to
> perform name lookup on the name preceding the '<', and see if it resolves to
> the name of a template or not.
> 
> I haven't spent much time thinking about how easy this would be to implement
> in CDT, but it's a possible avenue to explore. It certainly seems
> conceptually simpler and more efficient than what we're doing.

I thought a bit about how this could be implemented in CDT, and unfortunately it does not seem like it would be straightforward at all.

Basically, name resolution can require arbitrary semantic analysis to be performed. To get an idea for the scope of the problem, consider this code:

  constexpr int factorial(int n) { ... }

  int f();

  template <int>
  class A {
    template <int> class Waldo { 
      static void f();
    };
  };

  template <>
  class A<120> {
    static int Waldo;
  };

  int main() {
    A<factorial(5)>::Waldo<0>::f();
  }

Here, the expression in the expression-statement in main() can be parsed as a comparsion expression:

  ((A<factorial(5)>::Waldo) < 0) > (::f())

or as a function call expression:

  (A<factorial(5)>::Waldo<0>::f) ()

To make the correct choice "as we parse", we need to be able to tell whether "A<factorial(5)>::Waldo" names a template or not. As it happens, A<N>::Waldo can be a template or not depending on the value of N, so we need to evaluate N, which in this case requires performing constant expression evaluation. It's not difficult to imagine how just about any other semantic analysis task (e.g. overload resolution) could similarly be a prerequisite for deciding whether a name is a template-name.

I believe this is how compilers like clang handle this - they perform all semantic analysis "as they parse", so results of the semantic analysis can inform parsing of subsequent tokens.

Unfortunately, CDT does not work like this. CDT's semantic analysis facilities assume that they have a structurally complete AST to operate on. That is why we have a clear separation between the initial parse, which produces an AST that may have some ambiguous nodes but is structurally complete, and an ambiguity resolution phase which performs enough semantic analysis to resolve the ambiguities. (Full semantic analysis is only performed when a consumer of the AST, like the indexer or the semantic highlighter, needs it.)

I would very much like for CDT's parsing and semantic analysis to work more like clang's; the fact that it doesn't has caused a lot of headaches (this bug being an example). But making it work like that would amount to a very significant rewrite.

Seeing as we are exploring using clang itself to provide CDT's semantics-aware capabilities via the LSP, I think it would be more sensible to just focus on that, rather than to spend time on a what would effectively be a rewrite of CDT's own parser.