Bug 497931 - Unresolved symbol due to failed ambiguity resolution
Summary: Unresolved symbol due to failed ambiguity resolution
Status: NEW
Alias: None
Product: CDT
Classification: Tools
Component: cdt-parser (show other bugs)
Version: 9.0.0   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact: Jonah Graham CLA
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-14 14:58 EDT by Sergey Prigogin CLA
Modified: 2020-09-04 15:23 EDT (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Sergey Prigogin CLA 2016-07-14 14:58:44 EDT
template <int I, typename T>
int waldo(T v) {
  return v < I ? 1 : 1 + waldo<I, T>(v / I); // problem on waldo
}
Comment 1 Sergey Prigogin CLA 2016-07-14 18:39:30 EDT
The return statement gets parsed as:

IASTReturnStatement
  ICPPASTFunctionCallExpression
    IASTIdExpression
      ICPPASTTemplateId: v<I ? 1 : 1 + waldo<I, T>
  IASTBinaryExpression: v / I
Comment 2 Sergey Prigogin CLA 2016-07-14 20:30:07 EDT
It's highly surprising but it seems that the parser doesn't even consider a possibility that "v < I" could be a binary expression.
Comment 3 Sergey Prigogin CLA 2016-07-14 22:46:43 EDT
It looks like the optimization in TemplateIdStrategy.java line 123 is invalid. The code gets parsed correctly when the optimization is disabled. Here is what the comment for the optimization says:

"Optimization (bug 363609): if during the previous alternative, a name was parsed as a template-id with multiple template arguments, it's not going to be parsed differently in a subsequent alternative, so keep it as a template-id.
Of course, this optimization is only sound if the previous alternative was parsed successfully (bug 445177)."

I'm out of ideas on possible changes that would not break testNestedTemplateAmbiguity_363609().
Comment 4 Nathan Ridge CLA 2016-07-15 00:51:16 EDT
I had a conversation with Richard Smith last year where he explained to me how clang disambiguates between the two uses of '<'. Basically, it's able to perform name lookup on the name preceding the '<', and see if it resolves to the name of a template or not.

I haven't spent much time thinking about how easy this would be to implement in CDT, but it's a possible avenue to explore. It certainly seems conceptually simpler and more efficient than what we're doing.
Comment 5 Eclipse Genie CLA 2016-07-15 14:18:27 EDT
New Gerrit change created: https://git.eclipse.org/r/77411
Comment 7 Sergey Prigogin CLA 2016-07-15 18:30:40 EDT
(In reply to Nathan Ridge from comment #4)

Agree.
Comment 8 Nathan Ridge CLA 2017-11-29 23:54:30 EST
(In reply to Nathan Ridge from comment #4)
> I had a conversation with Richard Smith last year where he explained to me
> how clang disambiguates between the two uses of '<'. Basically, it's able to
> perform name lookup on the name preceding the '<', and see if it resolves to
> the name of a template or not.
> 
> I haven't spent much time thinking about how easy this would be to implement
> in CDT, but it's a possible avenue to explore. It certainly seems
> conceptually simpler and more efficient than what we're doing.

I thought a bit about how this could be implemented in CDT, and unfortunately it does not seem like it would be straightforward at all.

Basically, name resolution can require arbitrary semantic analysis to be performed. To get an idea for the scope of the problem, consider this code:

  constexpr int factorial(int n) { ... }

  int f();

  template <int>
  class A {
    template <int> class Waldo { 
      static void f();
    };
  };

  template <>
  class A<120> {
    static int Waldo;
  };

  int main() {
    A<factorial(5)>::Waldo<0>::f();
  }

Here, the expression in the expression-statement in main() can be parsed as a comparsion expression:

  ((A<factorial(5)>::Waldo) < 0) > (::f())

or as a function call expression:

  (A<factorial(5)>::Waldo<0>::f) ()

To make the correct choice "as we parse", we need to be able to tell whether "A<factorial(5)>::Waldo" names a template or not. As it happens, A<N>::Waldo can be a template or not depending on the value of N, so we need to evaluate N, which in this case requires performing constant expression evaluation. It's not difficult to imagine how just about any other semantic analysis task (e.g. overload resolution) could similarly be a prerequisite for deciding whether a name is a template-name.

I believe this is how compilers like clang handle this - they perform all semantic analysis "as they parse", so results of the semantic analysis can inform parsing of subsequent tokens.

Unfortunately, CDT does not work like this. CDT's semantic analysis facilities assume that they have a structurally complete AST to operate on. That is why we have a clear separation between the initial parse, which produces an AST that may have some ambiguous nodes but is structurally complete, and an ambiguity resolution phase which performs enough semantic analysis to resolve the ambiguities. (Full semantic analysis is only performed when a consumer of the AST, like the indexer or the semantic highlighter, needs it.)

I would very much like for CDT's parsing and semantic analysis to work more like clang's; the fact that it doesn't has caused a lot of headaches (this bug being an example). But making it work like that would amount to a very significant rewrite.

Seeing as we are exploring using clang itself to provide CDT's semantics-aware capabilities via the LSP, I think it would be more sensible to just focus on that, rather than to spend time on a what would effectively be a rewrite of CDT's own parser.