Bug 276257 - [relengtool][regression] "Fix copyrights" can destroy XML files by not closing comment properly and introducing wrong line delimiters
Summary: [relengtool][regression] "Fix copyrights" can destroy XML files by not closin...
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Releng (show other bugs)
Version: 3.5   Edit
Hardware: All All
: P3 major with 1 vote (vote)
Target Milestone: ---   Edit
Assignee: Platform-Releng-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: helpwanted
: 329062 (view as bug list)
Depends on: 276255
Blocks: 381147
  Show dependency tree
 
Reported: 2009-05-14 03:41 EDT by Martin Oberhuber CLA
Modified: 2014-07-24 12:37 EDT (History)
9 users (show)

See Also:


Attachments
Patch to disable copyright update in xml (2.00 KB, patch)
2010-11-02 16:32 EDT, John Arthorne CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Oberhuber CLA 2009-05-14 03:41:34 EDT
Build ID: 3.5M7

I had 3 existing plugin.xml files with existing IBM Copyright headers. Here is one of them before being processed by "Fix Copyrights":

http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.tm.rse/plugins/org.eclipse.rse.subsystems.files.ftp/plugin.xml?revision=1.36&root=DSDP_Project&view=markup

When running the "Fix Copyrights" action, the Copyright year was reved up to 2009 (good!), the entire Copyright comment was indented by 4 characters (not so good but OK), and while indenting the comment close marker (-->) was inserted a second time (BAD!!).

As a result, the file looked like this:

<?xml version="1.0" encoding="UTF-8"?>
<!--
     Copyright (c) 2006, 2009 IBM Corporation and others.
     (... yada yada ...)
     -->
 -->


Now this is incorrect XML, which was fortunately detected in my case by the builder since it automatically processes plugin.xml files; but in the case of XML files not directly processed by a builder, it is very easy to overlook this incorrect syntax and run into bad trouble.

This issue is probably related to bug 276255 - the general change / regression which led to inserting space characters into existing Copyright headers since recently.
Comment 1 Gunnar Wagenknecht CLA 2009-05-14 03:52:40 EDT
Note, as a workaround you can disable XML file processing in the preferences.
Comment 2 Gunnar Wagenknecht CLA 2009-05-14 03:58:55 EDT
Martin, I tried with the file you mentioned in a fresh 3.5M7 with the releng plug-in from HEAD. I'm unable to reproduce the issue.

Are you running on Windows or Linux?
Do you have special line wrapping settings?
Comment 3 Martin Oberhuber CLA 2009-05-14 04:09:51 EDT
I'm on Windows.
org.eclipse.releng.tools_3.4.100.v20090305 from Eclipse 3.5m7.
No special line end settings that I would be aware of.

Note that it's only an issue with "Fix Copyrights" but not with "Advanced Fix Copyrights". In your testing, did you have a dummy CVS History? Did the Copyright tool even touch your file? It may only want to touch the file when you commit it to (some dummy) CVS in 2009.

I'll try again with releng.tools HEAD in a fresh workspace...
Comment 4 Gunnar Wagenknecht CLA 2009-05-14 04:23:31 EDT
I tried different things. I checked out the file from CVS using a tag/version from March this year but with the old copyright year. I also tried with forcing a new revision year. 

Note, I always right clicked on the file and selected "Fix Copyright". 
Comment 5 Gunnar Wagenknecht CLA 2009-05-14 04:25:30 EDT
Ahh.. so I was able to reproduce using the "Eclipse/IBM Fix Copyrights..." action. However, I did not touch this part because I thought it's legacy. Are there any benefits in using this actions compared to the other?
Comment 6 Martin Oberhuber CLA 2009-05-14 04:28:56 EDT
(In reply to comment #5)
> Ahh.. so I was able to reproduce using the "Eclipse/IBM Fix Copyrights..."

In my version of the Releng plugin, I don't have this action.
As said, this is releng.tools from 3.5m7
I'm in the JDT perspective, Package Explorer, selecting a project.

I only see "Fix Copyrights..." as well as "Advanced Fix Copyrights". Same in Resource Perspective / Project Explorer.

Looks like there's a couple of unreleased changes in the releng.tools plugin that I'd really want to have? Are there any docs / release notes what has been changed?
Comment 7 Gunnar Wagenknecht CLA 2009-05-14 04:38:28 EDT
Only the CVS history. It seems that it was renamed recently (6 days ago) by Michael.
Comment 8 Martin Oberhuber CLA 2009-05-14 05:08:42 EDT
(In reply to comment #5)
> Are there any benefits in using this actions compared to the other?

I haven't looked at this for a while, but from the top of my head...
  * The Eclipse/IBM one has more CVS integration, e.g. if the last commit
    comment includes "copyright" the file is not considered so plain 
    copyright updates don't count

  * The Eclipse/IBM one always uses hardcoded IBM copyright owner
    -- which is beneficial sometimes because the other one only considers
    ONE copyright owner from the Prefs, so I use the other one for 
    Wind River stuff and the Eclipse/IBM one for Eclipse/IBM stuff

  * At one point in the past, the Eclipse/IBM one considered more file types

I think what I'd really like to see is a generic "Fix Copyrights" action
that updates the Copyright year independent of the Copyright owner. In the
TM/RSE codebase we have IBM, Symbian, PalmSource, MontaVista, Wind River
and more contributors as initial Copyright owners... and all of them are 
under active development so they need rev up of copyright year occasionally.

For the case where *no* existing copyright is found, such an action could
do a no-op and only report the file / project in question. Then the currently
existing actions can be applied in a second pass.

Such a generic action that only matches the (comment) Copyright (year), (year)
pattern and *not* the actual license would also apply to non-EPL contents, 
such as the EDL we have in TCF.

But that should probably be discussed in another bug...
Comment 9 John Arthorne CLA 2010-11-02 15:42:45 EDT
*** Bug 329062 has been marked as a duplicate of this bug. ***
Comment 10 David Carver CLA 2010-11-02 15:49:12 EDT
(In reply to comment #9)
> *** Bug 329062 has been marked as a duplicate of this bug. ***

So any time frame when this will be fixed.  Ideally it should have been fixed as soon as the regression was detected.
Comment 11 John Arthorne CLA 2010-11-02 16:27:34 EDT
(In reply to comment #10)
> So any time frame when this will be fixed.  Ideally it should have been fixed
> as soon as the regression was detected.

Yes, and ideally the other 20,000 open bugs/enhancements in Eclipse would also be fixed instantly.

In all seriousness I think we should just disable support for fixing copyrights in xml files if we can't do it safely.
Comment 12 John Arthorne CLA 2010-11-02 16:32:12 EDT
Created attachment 182253 [details]
Patch to disable copyright update in xml

Here is a patch to avoid corruption of XML files until this can be fixed properly.
Comment 13 Kim Moir CLA 2010-11-02 16:42:49 EDT
Thanks John, patch released.
Comment 14 Dani Megert CLA 2011-05-05 08:45:55 EDT
See also bug 215224.
Comment 15 Dani Megert CLA 2012-05-31 07:21:09 EDT
Note that the fix for bug 215224 is not enough to fix the wrong line delimiter issue for XML files.
Comment 16 Leo Ufimtsev CLA 2014-07-15 12:56:06 EDT
I have reproduced and fixed a range of xml issues including the original problem Martin Oberhuber reported. 

To my knowledge, I could not reproduce any further issues. 

Gerrit review:
https://git.eclipse.org/r/#/c/29921/

Could someone please kindly review and let me know if I can improve the code further. 

--------- Test Details 
I tested with 200 xml files that I picked out from various eclipse-plugins. 
Test notes:
 * [x] Empty Document
 * [x] Empty Document with XML header
 * [x] Document with content, no XML header.
 * [x] Document with XML header and content on 2nd line
 * [x] Document with XML header, copyright on first line with content on first line.
 * [x] Document with XML header, content on the first line that doesn't close properly.
 * Example:
 *      <?xml version="1.0" encoding="UTF-8"?><fragment><extension
 *     //Copy-right comment is inserted correctly, '<fragment ..' is put onto new line.
 *
 * [x] Document with XML header, copyright on 2nd line, stuff.
 * [x] test with non-IBM header.


--------- Fix Details 

. Fixed Breackage on XML files with xml header
	- fixed insertion offset bug. (+2 instead of +1)
 (see 
https://git.eclipse.org/r/#/c/29921/3/bundles/org.eclipse.releng.tools/src/org/eclipse/releng/tools/XmlFile.java 
 Line #171)

. Style fix. Copyright comment is now inserted on a new line, not on the same line
as xml header (if header is present).

. Fixed Breackage on XML files with existing copyright notice where copyright comment is on
the first line with an xml header.
  e.g: <?xml ... ?> <---

. Fixed Incorrect 'comment not on first line' warning for xml files.
      - XML copyright comments can start on non-first line as XML files can have meta head(ers).
. Removed'BUGS_FIXED' variable from xml_file and it's references.

. Added some comments here and there
Comment 17 David Williams CLA 2014-07-24 12:37:52 EDT
Gerrit patch merged into master.