Bug 419052

Summary: Selecting the menu Source / Format in a large source file takes forever
Product: [Tools] CDT Reporter: Serge Beauchamp <sergebeauchamp>
Component: cdt-coreAssignee: Project Inbox <cdt-core-inbox>
Status: NEW --- QA Contact: Jonah Graham <jonah>
Severity: normal    
Priority: P3 CC: eclipse.sprigogin
Version: NextKeywords: helpwanted
Target Milestone: ---   
Hardware: PC   
OS: Windows 8   
Whiteboard:
Attachments:
Description Flags
Illustration of a fix for this feature none

Description Serge Beauchamp CLA 2013-10-09 11:56:25 EDT
Selecting the menu Source / Format in a large source file takes forever.

Opening a large source file in the editor causes the CDT "Editor Scalability" dialog to be displayed.

Unfortunately, invoking the menu "Source / Format" in that source will takes seemingly infinite amount of time even though the editor scalability mode has been turned on for that file.

The menu command should instead be disabled for CEditor which have the scalability mode turned on.
Comment 1 Serge Beauchamp CLA 2013-10-09 12:06:13 EDT
Created attachment 236297 [details]
Illustration of a fix for this feature

A fix for this bug consists of making the 'Source / Format' menu for C/C++ editor enabled only when the editor scalability mode if off, and the new 'Disable format command in editor' setting is set to true (the default).
Comment 2 Serge Beauchamp CLA 2013-10-09 12:44:17 EDT
Fix pushed through gerrit:

https://git.eclipse.org/r/17226
Comment 3 Sergey Prigogin CLA 2013-10-09 13:00:23 EDT
(In reply to Serge Beauchamp from comment #1)

It makes sense to figure out where the slowness is coming from in this particular example. Formatting should be a linear operation with respect to the size of the file. From the bug description it looks like there is a superlinear behavior that should be fixed.
Comment 4 Serge Beauchamp CLA 2013-10-09 15:09:04 EDT
Yes, you are right, the ultimate problem is the limited scaling and performance of the CDT parsing tools.

This is why there's the scalability feature to begin with, and I don't think that the format feature differs from the other CDT features that are disabled as part of the CDT.  

I'll run a test with a profile to confirm this presumption.
Comment 5 Serge Beauchamp CLA 2013-10-09 16:23:20 EDT
Here's a profile report from formatting the large source file included in the test:

100.0% - 119 s org.eclipse.cdt.internal.formatter.CodeFormatterVisitor.formatList
56.7% - 67,633 ms org.eclipse.cdt.internal.core.dom.parser.cpp.CPPASTLiteralExpression.accept
56.7% - 67,549 ms org.eclipse.cdt.internal.formatter.CodeFormatterVisitor.visit
41.4% - 49,301 ms org.eclipse.cdt.internal.formatter.CodeFormatterVisitor.visit
22.6% - 26,905 ms org.eclipse.cdt.internal.formatter.Scribe.printNextToken
22.4% - 26,718 ms org.eclipse.cdt.internal.formatter.Scribe.printNextToken
...
18.7% - 22,328 ms org.eclipse.cdt.internal.formatter.CodeFormatterVisitor.peekNextToken

I don't think there's anything really wrong about the code, performance wise.

I verified that it scales exactly in a linear fashion, and that having a table with twice the amount of elements causes 'printNextToken' to be called twice as often, plus the initial amount of the surrounding code.

It's just that it's not efficient to create lots of objects for a simple data table: "{ 0,0,0,0,0}", when the file contains tens of thousands of elements.

The simple scanner could use a DFA, but I don't think it would make the code an order of magnitude faster.
Comment 6 Sergey Prigogin CLA 2013-10-14 16:52:17 EDT
(In reply to Serge Beauchamp from comment #5)

Would changing the behavior of Format without selection as described in bug 419399 be sufficient to prevent the situation described in this bug?
Comment 7 Serge Beauchamp CLA 2013-10-16 09:08:39 EDT
(In reply to Sergey Prigogin from comment #6)
> (In reply to Serge Beauchamp from comment #5)
> 
> Would changing the behavior of Format without selection as described in bug
> 419399 be sufficient to prevent the situation described in this bug?

I would certainly alleviate the usability problem from the user's perspective, yes.