Bug 203304 - Allow to format set of regions
Summary: Allow to format set of regions
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.3   Edit
Hardware: PC Windows XP
: P3 enhancement (vote)
Target Milestone: 3.4 M3   Edit
Assignee: Jerome Lanneluc CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 201063 208066
  Show dependency tree
 
Reported: 2007-09-13 12:17 EDT by Benno Baumgartner CLA
Modified: 2007-11-02 06:03 EDT (History)
6 users (show)

See Also:


Attachments
proposed fix (29.57 KB, patch)
2007-09-13 12:19 EDT, Benno Baumgartner CLA
no flags Details | Diff
New patch (27.82 KB, patch)
2007-09-13 14:25 EDT, Olivier Thomann CLA
no flags Details | Diff
proposed fix (29.11 KB, patch)
2007-09-14 04:58 EDT, Benno Baumgartner CLA
no flags Details | Diff
proposed fix (48.08 KB, patch)
2007-09-20 05:55 EDT, Benno Baumgartner CLA
no flags Details | Diff
proposed fix (48.20 KB, patch)
2007-09-20 09:26 EDT, Benno Baumgartner CLA
no flags Details | Diff
proposed fix (48.24 KB, patch)
2007-09-20 09:30 EDT, Benno Baumgartner CLA
no flags Details | Diff
Proposed fix (50.45 KB, patch)
2007-10-05 14:08 EDT, Olivier Thomann CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Benno Baumgartner CLA 2007-09-13 12:17:50 EDT
I20070911-0833

Fixing bug 201063 may lead in some situations to the requirement to format enormous amounts of regions (hundreds, maybe thousands). Add the moment each call to DefaultCodeFormatter.format(K_COMPILATION_UNIT, String, int, int, int, String) results in a light parse of the CU and in a complete visit of the cu with the CodeFormatterVisitor. This does not scale if a high amount of regions needs to be formatted.

The ability to pass a set of regions to format to the code formatter would solve the issue: The CU only needs to be parsed and visited once:

* @param regions the regions to format. 
*                Two disjunct regions in <code>regions</code> must not overlap.
*                Each region must be within <code>source</code>. 
*                There must be at least one region.
public abstract TextEdit format(int kind, String source, IRegion[] regions, int indentationLevel, String lineSeparator);

Unfortunately this is a blocker for bug 201063, I could not find a workaround.
Comment 1 Benno Baumgartner CLA 2007-09-13 12:19:59 EDT
Created attachment 78333 [details]
proposed fix

My knowledge about the code formatter is minimal, but it seams to be a trivial and natural enhancement.
Comment 2 Olivier Thomann CLA 2007-09-13 14:09:00 EDT
Most of the existing code formatter tests failed with this patch.
So this needs to be carefully reviewed. Might be an obvious problem.
Comment 3 Olivier Thomann CLA 2007-09-13 14:25:53 EDT
Created attachment 78366 [details]
New patch

This patch is passing all existing formatter tests.
One thing that needs to be improved if the regions management inside the scribe. Doing a linear search for each edit is definitely not good for performance.
We should at least use the fact that the regions are sorted.
Comment 4 Benno Baumgartner CLA 2007-09-14 04:58:09 EDT
Created attachment 78410 [details]
proposed fix

> Most of the existing code formatter tests failed with this patch.
> So this needs to be carefully reviewed. Might be an obvious problem.

Sorry for that, I did run the tests this time...

> We should at least use the fact that the regions are sorted.

Yep, did that. I think this is optimal under the assumption that there are usually much more edits then regions and that the edits are not sorted.
Comment 5 Dani Megert CLA 2007-09-17 11:14:22 EDT
Jerome, can we get this directly after M2? We'd like to self host / test bug 201063 after M2 is out. Thanks.
Comment 6 Jerome Lanneluc CLA 2007-09-18 01:51:25 EDT
We need tests for this new feature before we even consider looking at it.
Comment 7 Benno Baumgartner CLA 2007-09-20 05:55:32 EDT
Created attachment 78831 [details]
proposed fix

(In reply to comment #6)
> We need tests for this new feature before we even consider looking at it.

Sure, with pleasure;-) This does work for kinds K_EXPRESSION, K_STATEMENTS, K_CLASS_BODY_DECLARATIONS, K_COMPILATION_UNIT, K_UNKNOWN (if not comment region)

For comments see bug 204091
Comment 8 Jerome Lanneluc CLA 2007-09-20 09:07:48 EDT
(In reply to comment #7)
> Created an attachment (id=78831) [details]
> proposed fix
Sorry this was not explicitly written down, but you should try to follow the formatting style of the surrounding code when making a change (for example use a white space before the assignment operator).
I now have updated our FAQ with this rule: http://wiki.eclipse.org/JDT_Core_Committer_FAQ#How_should_I_format_my_code_.3F
Comment 9 Benno Baumgartner CLA 2007-09-20 09:26:59 EDT
Created attachment 78844 [details]
proposed fix

(In reply to comment #8)
> Sorry this was not explicitly written down, but you should try to follow the
> formatting style of the surrounding code when making a change (for example use
> a white space before the assignment operator).

OK, also using blocks. Btw: Once this bug is fixed contributers (or even jdt/core) can just enable 'format changed regions on save' and you will always get perfectly formatted patches, wouldn't that be cool;-)
Comment 10 Benno Baumgartner CLA 2007-09-20 09:30:40 EDT
Created attachment 78847 [details]
proposed fix

Missed some blocks.
Comment 11 Dani Megert CLA 2007-09-28 11:24:04 EDT
Can we get some feedback on the patch? Thanks.
Comment 12 Philipe Mulet CLA 2007-10-01 10:58:21 EDT
It is on Eric's list. 
Comment 13 Eric Jodet CLA 2007-10-01 11:33:46 EDT
(In reply to comment #12)
Correct: I'll do this ASAP this week and will keep you updated.
Comment 14 Eric Jodet CLA 2007-10-02 05:15:13 EDT
(In reply to comment #11)
I did not expect such big patch for a first formatter review( keeping in mind I'm new to the code formatter).
I'll let Olivier perform a more in depth review.
Comment 15 Olivier Thomann CLA 2007-10-04 15:00:14 EDT
Several comments:
1) I'd like the existing code not to be penalized by this new API. The actual patch converts all existing code to use the new API. This implies the creation of a lot of region objects that are not required in the case there is only one region (from offset to offset + length). One region is the most common case for existing APIs.
2) I don't like all the Assert.... calls. They slow down runtime.
3) The binary search should be used only if there is more than one region.

I can propose a new patch that will handle 2) and 3). I guess once 3) is handled, then the overhead for 1) is limited.

Any comment?
Comment 16 Benno Baumgartner CLA 2007-10-05 04:07:54 EDT
(In reply to comment #15)

Hi Oliver, thanks for the review.

> Several comments:
> 1) I'd like the existing code not to be penalized by this new API. The actual
> patch converts all existing code to use the new API. This implies the creation
> of a lot of region objects that are not required in the case there is only one
> region (from offset to offset + length). One region is the most common case for
> existing APIs.

Sure. I wanted to minimize the code copying and express that the old case is a special case of the new one.

> 2) I don't like all the Assert.... calls. They slow down runtime.

Sure, I just forgot to comment them out, I use them for testing.

> 3) The binary search should be used only if there is more than one region.

Ok

> I can propose a new patch that will handle 2) and 3). I guess once 3) is
> handled, then the overhead for 1) is limited.

Ok
Comment 17 Olivier Thomann CLA 2007-10-05 14:08:51 EDT
Created attachment 79823 [details]
Proposed fix

Benno,

Here is a new patch. Please let me know if it looks ok for you.
The last thing that worries me is the performance impact. It would be nice if we could get some profiling datas on a decent test case with and without the new API.
Could you collect some datas for this, please?
I'd like to estimate the potential slowdown if any.
Comment 18 Benno Baumgartner CLA 2007-10-08 08:16:10 EDT
(In reply to comment #17)
> Created an attachment (id=79823) [details]
> Proposed fix
> 
> Benno,
> 
> Here is a new patch. Please let me know if it looks ok for you.

Yes, the patch looks good

> The last thing that worries me is the performance impact. It would be nice if
> we could get some profiling datas on a decent test case with and without the
> new API.
> Could you collect some datas for this, please?
> I'd like to estimate the potential slowdown if any.
> 

You're not going to ask me to write a formatter performance test do you;-) Anyways, I've added one to the CleanUpPerfTest.

Hint that I don't have a dedicated performance test machine. Due to interfering processes the test results may not be very accurate:

Format JDT/UI HEAD with enabled comment formatting
      cold        warm (3 runs)
New:   46s        45s/55s/46s
Old:   50s        47s/51s/51s

CleanUpPerfTest#testCodeFormatCleanUp (formats JUnit without comment formatting)
      Elapsed Process
New:  967ms
Old:  970ms
Comment 19 Dani Megert CLA 2007-10-30 06:41:06 EDT
Ping.
Comment 20 Jerome Lanneluc CLA 2007-10-30 07:57:11 EDT
Olivier, what do you think of the performance numbers given by Benno ?
Comment 21 Olivier Thomann CLA 2007-10-30 08:45:56 EDT
Since perfs look ok, I don't see a problem of releasing this patch.
I would release the last patch I submitted though.
Comment 22 Jerome Lanneluc CLA 2007-10-31 07:28:37 EDT
During the test pass, JDT/Text found that format on save can take minutes without this functionality. So this needs to be in for 3.4M3. I will take care of releasing Olivier's patch.
Comment 23 Jerome Lanneluc CLA 2007-10-31 08:05:28 EDT
Why are FormatterRegressionTests#test675-677 disabled ?
Comment 24 Jerome Lanneluc CLA 2007-10-31 08:22:47 EDT
(In reply to comment #23)
> Why are FormatterRegressionTests#test675-677 disabled ?
Sorry I wasn't clear. This question would be for Benno.

Comment 25 Jerome Lanneluc CLA 2007-10-31 08:49:10 EDT
Released Olivier's path (comment 17) as is.
Released for 3.4M3.
Comment 26 Benno Baumgartner CLA 2007-10-31 10:00:49 EDT
(In reply to comment #23)
> Why are FormatterRegressionTests#test675-677 disabled ?

See comment #7

(In reply to comment #25)
> Released Olivier's path (comment 17) as is.
> Released for 3.4M3.

Thank you!
Comment 27 Jerome Lanneluc CLA 2007-10-31 10:17:50 EDT
(In reply to comment #26)
> (In reply to comment #23)
> > Why are FormatterRegressionTests#test675-677 disabled ?
> 
> See comment #7
> 
Talking with Benno, it appears that the tests are disabled because they fail due to bug 204091.
Comment 28 Jerome Lanneluc CLA 2007-11-02 06:03:14 EDT
Verified for 3.4M3 using I20071101-2000