Bug 68506 - Java code formatter strips newline, leaving final line unterminated!
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.0
Hardware: PC Windows 2000
Importance: P3 normal with 1 vote
Target Milestone: 3.0.2
Assignee: Olivier Thomann CLA
Duplicates: 69129 72069 72698
Reported: 2004-06-24 13:32 EDT by Deven T. Corzine CLA
Modified: 2005-03-12 15:08 EST

Attachments
Formatter profile (18.59 KB, text/xml)
2004-06-30 13:53 EDT, Deven T. Corzine CLA

Description Deven T. Corzine CLA 2004-06-24 13:32:46 EDT
I upgraded from around 3.0M6 or 3.0M7 to 3.0RC2 and noticed a new bug -- the
Java code formatter is now eating the final newline in the file, leaving a final
brace as the last line of the file with NO newline at the end of the file.  (I
can upload a copy of my formatter settings if you have any trouble reproducing
this.)
Comment 1 Deven T. Corzine CLA 2004-06-24 13:33:34 EDT
I forgot to mention, I upgraded to 3.0RC3 today -- the bug is still there.
Comment 2 Olivier Thomann CLA 2004-06-29 21:45:48 EDT
This is on purpose. All whitespace that is not used for indentation is
removed. All other whitespace that is not explicitly required is
controlled through options.
We might add an option to add blank lines at the end of the compilation unit.
Comment 3 Deven T. Corzine CLA 2004-06-30 13:07:52 EDT
I'm not talking about BLANK lines at the end of the compilation unit.  I'm
talking about UNTERMINATED lines.  The last line contains a closing brace and no
newline, so the file now ends IN THE MIDDLE OF A LINE.  Every time I terminate
the line again, the formatter strips it out again.

Is this REALLY what was intended?  Previous versions did NOT have this bug.  If
I format in 3.0M6, it will FIX the corruption caused by 3.0RC3, restoring the
missing newline to properly terminate the last line of the file.  This was the
correct behavior -- I assume the current behavior is an oversight.

Again, this has nothing to do with blank lines.  The corruption is very obvious
if you open a text console and print the file to the screen -- the closing brace
and the next console prompt print on the same line because of the missing
newline on that final line...
Comment 4 Olivier Thomann CLA 2004-06-30 13:14:43 EDT
Relax! I misunderstood what you said. Could you please attach your test case and
the result you are getting + export your code formatter preferences?
I will have a look. This would be a good candidate for a 3.0.1 update.
Comment 5 Deven T. Corzine CLA 2004-06-30 13:51:56 EDT
Sorry, I was just flabbergasted that this could be described as intentional! :)

A trivial test case suffices:

public class test
{
}

Start with an uncorrupted file (as New->Class will create), where you can put
the cursor on the blank line below the closing brace.  Run the formatter, and
the last line is now unterminated.  With the older code, it would leave that
final newline alone, and even put it back in if it's missing.

I'll upload my formatter profile; it's probably specific to the options I have
selected.

Why 3.0.1?  Was 3.0 final frozen already?
Comment 6 Deven T. Corzine CLA 2004-06-30 13:53:04 EDT
Created attachment 12922
Formatter profile
Comment 7 Olivier Thomann CLA 2004-07-04 13:03:47 EDT
ok, I checked what we are doing. We remove the blank line at the end of the
compilation unit. This makes sense. The formatter removes all whitespace that
is not required. The last line is not required.
Why do you want this empty line at the end of the compilation unit?

Note that if you have another class defined after the first one, you can already
set a new line between these two classes.

The only way to add this extra line would be to add a new option.
Comment 8 Trevor Robinson CLA 2004-07-04 13:50:48 EDT
The problem is that you're thinking of the "newline" as a line separator and not
a line terminator. All we're asking is that the last line (containing the "}"
for the last class) be terminated, primarily because there are other tools out
there that expect it. Most editors behave as if newline were a separator, and
let you put the cursor on the "line" after the last newline, but other tools treat
it as a terminator. For instance, GNU diff prints "No newline at end of file" if
the newline is missing. TextPad, my favorite Windows text editor, has a specific
configuration option, "Automatically terminate the last line of the file". This
may be a situation that requires a special case in your code, though I don't
think it necessarily requires a new configuration option, as terminating the
last line has always been standard practice.
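
The separator/terminator distinction can be made concrete with a small Java sketch. The class and method names below are made up for illustration, and only "\n" is treated as a line ending (CR LF files would need extra handling):

```java
import java.util.Arrays;

// Illustrative only: hypothetical names, not part of Eclipse or the JDT.
final class LineModelDemo {

    // Separator view: "\n" sits BETWEEN lines, so text ending in "\n"
    // is followed by one more (empty) line.
    static String[] linesAsSeparated(String text) {
        return text.split("\n", -1); // limit -1 keeps trailing empty strings
    }

    // Terminator view: every line must END with "\n"; text that does not
    // end with "\n" has an incomplete final line.
    static boolean lastLineTerminated(String text) {
        return text.isEmpty() || text.endsWith("\n");
    }

    public static void main(String[] args) {
        String terminated = "public class test\n{\n}\n";
        String unterminated = "public class test\n{\n}";

        // Separator view: 4 "lines" (the last one empty) versus 3.
        System.out.println(Arrays.toString(linesAsSeparated(terminated)));
        System.out.println(Arrays.toString(linesAsSeparated(unterminated)));

        // Terminator view: complete file versus truncated last line.
        System.out.println(lastLineTerminated(terminated));   // true
        System.out.println(lastLineTerminated(unterminated)); // false
    }
}
```

Under the separator view the formatter sees an empty fourth "line" it can strip; under the terminator view the same character is part of the third line.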
Comment 9 Olivier Thomann CLA 2004-07-04 15:46:55 EDT
This would have to be a new option since it is changing the default behavior of
the code formatter.
I will check the Java coding standard to see if this is a bug when the default
formatter settings are used.
Comment 10 Deven T. Corzine CLA 2004-07-07 13:04:47 EDT
In the general case, text files ending with an unterminated line are usually
viewed as corrupted, and text editors will typically repair such corruption
automatically (and often silently).  Eclipse should probably do the same with
files _known_ to be text files, such as Java source files.

With regard to Java syntax in particular:

     http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html

Section 3.4 (Line Terminators) of the Java Language Specification states that
"Implementations next divide the sequence of Unicode input characters into lines
by recognizing line terminators."  Note that it refers to line TERMINATORS, not
line SEPARATORS.  This strongly implies that ALL lines in the file should be
terminated, as does the grammar production listed later for a single-line
comment, which EXPECTS the LineTerminator at the end of the comment.  While the
standard seems to be a bit ambiguous about whether a compilation unit ending in
a closing brace without a line terminator is technically acceptable, it's quite
clearly not what the language designers intended.

Obviously, the final line in a Java compilation unit SHOULD be terminated with a
newline, whether or not any given Java compiler accepts an unterminated line. 
This is the correct behavior for dealing with line terminators, and for text
files in general.  This is also the historical behavior of the code formatter in
Eclipse, and there was no good reason for this behavior to change.  Previous
versions of Eclipse (including the old formatter in 2.1 and the new formatter in
3.0M6) did NOT strip the final newline as the 3.0RC2 and 3.0 versions do.  This
is a bug that was recently introduced, and should be fixed.  Unless someone is
REQUESTING this broken behavior (which is highly doubtful), there's really no
reason to create a configuration option for this; it should just be fixed.

Note that nobody is arguing against stripping trailing BLANK lines here.  That
seems like a perfectly reasonable and desirable thing to do, but the newline
which TERMINATES the last NON-EMPTY line must not be mistaken for a BLANK line,
as appears to be happening now.

Moreover, with my particular settings (which I uploaded earlier), I have just
discovered that if I add a BLANK line after the line with the closing brace,
that blank line will be PRESERVED even though it is at the end of the file. 
(All other trailing whitespace will be removed.)  So it appears that I can have
either TWO final newlines in the file (leaving a trailing blank line) or ZERO
(leaving an unterminated line), but the happy medium of ONE to terminate my
final line with a closing brace is not an option.

Also, if I end the file with a single-line comment (with or without a blank
line), the final newline terminating the comment will not be stripped -- but
neither will a missing newline be added to terminate the comment if necessary.

The logic involving whitespace at the end of the compilation unit should
probably be removed entirely, and replaced with new (and more robust) code which
executes LAST in the formatting algorithm, which replaces all trailing
whitespace in the file with a single newline.  (Since a Java compilation unit
cannot be an empty file, there is no need to check for that corner case.)

Although I'm sure the buffer isn't stored as a String, for illustrative purposes
I'll pretend it is, and offer this one-line implementation of the algorithm:

     file = file.replaceFirst("\\s*\\z", "\n");

I don't think there's a need to control this with a configuration option, but I
guess you could if anyone complains.  (Who would?)  Certainly, this should be
the default algorithm applied to Java code after all other formatting is done.
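
For anyone who wants to try the one-liner, here is a self-contained sketch of it. The class and method names are hypothetical and this is not the formatter's actual code. Note that in Java source the regex backslashes have to be doubled, and that \s also matches "\r", so a CR LF ending is normalized to a bare "\n" here:

```java
// Illustrative only: hypothetical names, not the JDT formatter implementation.
final class TrailingWhitespace {

    // Replace whatever whitespace ends the text (possibly none at all)
    // with exactly one "\n". The pattern \s*\z matches a run of
    // whitespace that extends to the very end of the input.
    static String endWithSingleNewline(String source) {
        return source.replaceFirst("\\s*\\z", "\n");
    }

    public static void main(String[] args) {
        String[] inputs = {
            "public class test\n{\n}",          // zero final newlines
            "public class test\n{\n}\n",        // one final newline
            "public class test\n{\n}\n\n \t\n"  // blank lines and stray spaces
        };
        for (String input : inputs) {
            String result = endWithSingleNewline(input);
            // Every input ends up as the same text with a single "\n".
            System.out.println(result.equals("public class test\n{\n}\n")); // true
        }
    }
}
```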
Comment 11 Olivier Thomann CLA 2004-07-07 16:41:25 EDT
Your statement "as does the grammar production listed later for a single-line
comment, which EXPECTS the LineTerminator at the end of the comment." is not
true anymore with the upcoming JLS3.
The EndOfLineComment rule is now:
/ / CharactersInLine_opt

There is no longer a LineTerminator after it.

I will check the behavior of different text editors with regard to the end of
the last line.
Comment 12 Deven T. Corzine CLA 2004-07-12 10:22:21 EDT
What have you found with regard to text editors?

In any event, even if the final newline isn't strictly required, what possible
value is there to removing it by default?  Most people do NOT want text files to
end in the middle of a line -- if you really want to support that in Eclipse
(which seems quite useless), make it an option that defaults OFF so that most
people don't end up with text files ending in the middle of a line...

Note that this is also a problem for people who do joint development in CVS --
if one person is using 2.1 (or even 3.0M6) and another is using 3.0, they'll
keep checking in changes to that final newline, generating unnecessary diffs. 
Eclipse never used to strip this final newline -- this is recent behavior.
Comment 13 Olivier Thomann CLA 2004-07-26 21:22:25 EDT
I checked with different text editors and none of them adds a line
separator at the end of such source files.
We can add an option to add this extra line separator. I will check whether this
needs to be the default or not.
Comment 14 Jon Nall CLA 2004-07-27 08:53:52 EDT
i checked vim and it does indeed add a terminating newline if one is not
present. my test was as follows:

- create a file with a terminating newline
- edit the file in a hex editor to remove the newline
- open the file in vim
- immediately save and quit
- the resulting file has a terminating newline

is this similar to the procedure you used in testing editors?
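
The last step of a procedure like this can be checked without a hex editor. The following is a minimal sketch with made-up class and method names; it assumes LF or CR LF line endings (both end in the byte 0x0A) and does not handle CR-only files:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: hypothetical helper for checking an editor's output.
final class FinalNewlineCheck {

    // True if the content is non-empty and its last byte is LF.
    static boolean endsWithNewline(byte[] content) {
        return content.length > 0 && content[content.length - 1] == '\n';
    }

    // Convenience overload that reads the whole file into memory.
    static boolean endsWithNewline(Path file) throws IOException {
        return endsWithNewline(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("test", ".java");
        try {
            // Unterminated last line, as left behind by the 3.0 formatter.
            Files.write(file, "public class test\n{\n}".getBytes(StandardCharsets.UTF_8));
            System.out.println(endsWithNewline(file)); // false

            // Properly terminated last line.
            Files.write(file, "public class test\n{\n}\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(endsWithNewline(file)); // true
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```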
Comment 15 Deven T. Corzine CLA 2004-08-03 15:12:38 EDT
I just did a little testing of my own, but only under Windows for the moment,
since that's what I'm running on my desktop at work right now.

It does appear to be more common for text editors to ignore the missing final
newline, but I did find a number of editors which WILL automatically terminate
the final line by adding the missing newline (in the default configuration):

Boxer
EDIT.COM
Elvis
GNU nano
Microsoft Word
NotesPad
Prolix
Vim 
Zeus

So that's nine different examples of editors which will automatically add a
final newline if it's missing.  So it's certainly not unheard of.  (And EDIT.COM
and Microsoft Word are both pretty standard programs.)

I'm sure I could find more examples under Unix/Linux if I were to try.

However, this isn't really the issue -- this is just to illustrate that it's not
unreasonable to assume that a text file should not end with a partial line.  I'm
not asking for Eclipse to automatically add the final newline whenever saving
from the text editor, though I think it would be good to have as an option. 
(For that, I'd default the option off, probably, since some text files might end
in a partial line for a good reason, or someone might open a binary file in the
text editor...)

I'm asking for the code formatter to return to its previous behavior of adding
that final newline when it's missing and NOT stripping it when it's there.  This
was the historical behavior in 2.1 and most of the 3.0 test releases -- only
recently did this behavior change.  If you want to make it optional, I guess you
can, but it seems unlikely anyone would want different behavior from the code
formatter, which is already manipulating whitespace heavily, after all.

Make it an option if you must, but please default it on, for consistency with
the historical Eclipse behavior and common sense.  (Who really wants that
newline stripped in the first place?  It seems to have happened by accident.)
Comment 16 Olivier Thomann CLA 2004-08-18 13:27:08 EDT
*** Bug 69129 has been marked as a duplicate of this bug. ***
Comment 17 Olivier Thomann CLA 2004-08-18 13:35:48 EDT
I am investigating how to add this extra line terminator. In Eclipse 2.1, no
line terminator is added if there is none. It would keep it, if there is one,
but it didn't add one for free.
Eclipse 3.0 always removes it. So this is the behavior that I will try to get
back. It won't be optional. The final line termination will be preserved if
present, but won't be added if absent. This could however be a new option.
Is this fine for you?
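
The proposed behavior can be sketched roughly as follows. This is only an illustration of the policy described above, not the code released in the formatter: the class, method, and parameter names are hypothetical, and the sketch assumes that all other trailing whitespace is stripped and that "\n" is the line terminator.

```java
// Illustrative only: hypothetical names, not the JDT formatter implementation.
final class FinalNewlinePolicy {

    // Drop every whitespace character at the end of the text.
    static String stripTrailingWhitespace(String text) {
        int end = text.length();
        while (end > 0 && Character.isWhitespace(text.charAt(end - 1))) {
            end--;
        }
        return text.substring(0, end);
    }

    // Keep the final line terminator if the original source had one.
    // Add one to an unterminated source only if insertIfMissing (the
    // possible new option) is set.
    static String apply(String original, String formatted, boolean insertIfMissing) {
        String body = stripTrailingWhitespace(formatted);
        boolean hadTerminator = original.endsWith("\n");
        return (hadTerminator || insertIfMissing) ? body + "\n" : body;
    }

    public static void main(String[] args) {
        String body = "public class test\n{\n}";
        // Terminator present in the original: preserved.
        System.out.println(apply(body + "\n", body, false).endsWith("\n")); // true
        // Terminator absent, option off: not added.
        System.out.println(apply(body, body, false).endsWith("\n"));        // false
        // Terminator absent, option on: added.
        System.out.println(apply(body, body, true).endsWith("\n"));         // true
    }
}
```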
Comment 18 Boris Boehlen CLA 2004-08-19 02:09:58 EDT
For me the described behaviour would be ok. An option for adding a newline at
the end if not present would be great.

However, I think bug #69129 is only partially a duplicate. Trailing spaces
are beyond the scope of this bug report, I think. Thus, I'd like to see it reopened.
Comment 19 Olivier Thomann CLA 2004-08-19 16:06:26 EDT
The behavior described in comment 17 is now released.
Fixed and released in HEAD.
Regression tests added and updated existing tests.
Comment 20 Olivier Thomann CLA 2004-08-20 11:59:51 EDT
*** Bug 69129 has been marked as a duplicate of this bug. ***
Comment 21 Olivier Thomann CLA 2004-08-21 09:56:10 EDT
*** Bug 72069 has been marked as a duplicate of this bug. ***
Comment 22 Olivier Thomann CLA 2004-08-26 15:22:31 EDT
*** Bug 72698 has been marked as a duplicate of this bug. ***
Comment 23 Nick Crossley CLA 2004-09-20 12:13:29 EDT
I do not see this bug listed in the release notes for 3.0.1 - was it included?  If not, why not?
Comment 24 Olivier Thomann CLA 2004-09-20 12:19:00 EDT
No, it has not been backported to 3.0.1, because it adds a new code formatter
option, which cannot be added in 3.0.1. 3.0.x cannot have any API change.
What could be backported is the 2.1 behavior without the new option.
Comment 25 Per Bothner CLA 2004-09-20 13:32:37 EDT
Most of us don't care about a new UI option to change a default, as long as the
default behavior is correct.  And what makes most sense for the default is to
remove all trailing whitespace and then add a final newline.  I really don't see
why anyone would want anything else, but at the very least the default must not
remove an existing final newline.
Comment 26 Olivier Thomann CLA 2004-09-20 13:38:28 EDT
This is a bug because it is a regression compared to what was done in 2.x.x. In
2.x.x, no final line break was added when none was present in the source code. But
it was not removed when one was present.
This is what will be done in 3.1, except that a new option is provided to add
one if none is present.
What could be backported to the 3.0.x maintenance stream is a behavior like
2.x.x. Nothing else. It would be a regression to add a line break if none is
present.
I am reopening it for consideration in 3.0.2. I won't change what has been done in
the 3.1 stream.
Comment 27 Trevor Robinson CLA 2004-09-20 13:53:12 EDT
I'd like to second fixing this bug for 3.0.2 (2.x behavior; no UI change). This
bug is _really_ annoying, and having to wait for 3.1 for a fix would be really
painful. I use the code formatting feature a lot, but don't want to lose the
final line terminator, so I've made it a habit to follow Ctrl-Shift-F with
Ctrl-End/Return, but then I have to manually scroll back to wherever I was
editing if I want to make more changes.
Comment 28 Olivier Thomann CLA 2004-09-21 13:51:42 EDT
Will be backported to 3.0.2. The milestone will be changed when 3.0.2 tag is
available.
Comment 29 Olivier Thomann CLA 2005-01-26 12:49:17 EST
Candidate fix for the 3.0.2 release (not committed yet).
Comment 30 Philipe Mulet CLA 2005-01-26 16:15:38 EST
Fix is not available yet.
Comment 31 Olivier Thomann CLA 2005-01-27 17:44:06 EST
Fixed and released in maintenance branch.
All tests have been updated and they all passed.
Comment 32 David Audel CLA 2005-03-10 06:31:52 EST
Verified with M20050216
Comment 33 ryenus ' CLA 2005-03-12 15:08:15 EST
I should have spotted this ticket earlier.
Anyway, many thanks to Deven T. Corzine for a battle that lasted over half a
year. I was also annoyed by this unwelcome behavior.
Thanks to you all for bringing the good behavior back!