Bug 38332 - Add hooks for additional compiler diagnostics such as internationalization warnings
Summary: Add hooks for additional compiler diagnostics such as internationalization wa...
Status: RESOLVED WONTFIX
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 2.1   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: JDT-Core-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-06-02 11:46 EDT by Ed Burnette CLA
Modified: 2009-08-30 02:12 EDT (History)
2 users (show)

See Also:


Attachments
Patch to add new I18n messages to the compiler (6.27 KB, patch)
2003-06-06 14:28 EDT, Ed Burnette CLA
no flags Details | Diff
Updated, more general patch for extending compiler warnings and changing the batch output format (14.24 KB, patch)
2003-06-13 13:47 EDT, Ed Burnette CLA
no flags Details | Diff
Repatched for latest HEAD sources (~3.0m2). (14.23 KB, patch)
2003-07-21 16:15 EDT, Ed Burnette CLA
no flags Details | Diff
Repatched for latest HEAD sources (~3.0m3) (10.51 KB, patch)
2003-09-18 12:08 EDT, Ed Burnette CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ed Burnette CLA 2003-06-02 11:46:21 EDT
1. Internationalization warnings

The Java tools help you to develop applications that can be run on 
international platforms. An important facet of designing a program for use in 
different countries is the localization, or externalization, of text that is 
displayed by the program.

1.1 String externalization

By externalizing strings, the text can be translated for different countries 
and languages without rebuilding the Java program. The JDT provides the 
following support for internationalization and string externalization:

[see online help for current support]

1.2 Character literals

Character literals are sometimes found in Java code and inhibit localization. 
For example, consider this code:

   // find next token
   for (i = 0; s.charAt(i) != ' '; i++) /* skip */ ; // wrong

The use of a literal blank is suspicious here because different languages may 
use many different characters for white space. Better would be:

   for (i = 0; !Character.isWhitespace(s.charAt(i)); i++) /* skip */ ;

The character literal warning is set to "Ignore" by default.


1.3 String concatenation

Strings are often built up from smaller strings in ways that may be natural to 
the programmer but don't match the order the user is used to in their 
localized language. For example, suppose you are trying to internationalize 
this line:

   System.out.println("Choose the " + (choice == RED ? "red" : "blue") + " 
pill"); // wrong


A naïve attempt to do this might be:

   String blue = bundle.getString("blue.txt"); // defined as "blue"
   String red = bundle.getString("red.txt"); // defined as "red"
   String neo1 = bundle.getString("choose.txt"); // defined as "Choose the"
   String neo2 = bundle.getString("pill.txt"); // defined as "pill"
   System.out.println(neo1 + (choice == RED ? red : blue) + neo2); // wrong


But that doesn't work because you're assuming a certain word order of 
adjectives before nouns and building up a string using concatenation in that 
order. Better would be to use two entirely different messages, or if that's 
not possible then use the Message.Format method with positional substitutions 
like this:

   String blue = bundle.getString("blue.txt"); // defined as "blue"
   String red = bundle.getString("red.txt"); // defined as "red"
   String neo = bundle.getString("choose.pill.txt"); // defined as "Choose the 
{0} pill"
   System.out.println(Message.Format(neo, { (choice == RED ? red : blue) }));

One more rule about concatenated strings. If either or both sides of the 
concatenation are string literals that have been marked as ok, then the 
concatenation is ok too. For example, this will not be considered a violation:

   Class.forName("org.eclipse.swt.internal."+names[i]+".OS"); //$NON-NLS-1$ //
$NON-NLS-2$

The string concatenation warning is set to "Ignore" by default.


1.4 Locale sensitive methods

Certain Java methods were developed before internationalization issues and 
locales were fully understood. Some of them have been deprecated but some 
remain. For example, Date.toString() always returns dates in the form "dow mon 
dd hh:mm:ss zzz yyyy". For example,

   myString = myDate.toString(); // wrong

Better would be to use the DateFormat.format() method like this:

   myString = DateFormat.getDateInstance().format(myDate); 

This will format the date as appropriate for the current locale.

The list of local sensitive methods is configurable. Details TBD.

The locale sensitive method warning is set to "Ignore" by default.


2. Supressing i18n warnings

Some uses of character and string literals, concatenations, and locale 
sensitive methods are ok and some are not. There is no way for the compiler to 
tell which is which. So it allows the programmer to mark the valid uses 
through comments and filters.


2.1 Comments

Comments may be placed in the source code to disable warnings and errors about 
specific code the compiler would otherwise treat as i18n violations. Spaces 
and case are significant in the comments. The /* */ style comments can appear 
anywhere on the line.

    /*I18nOK*/ or //$I18nOK$  - supress all I18n warnings on the line
    /*I18nOK:xx*/ or //$I18nOK:xx$ - same as /*I18nOK*/ - for OneRealm compat.
    //$NON-NLS-n$  - supress the nth non-externalized string warning on the 
line, backward compatible with Eclipse 2.1

2.2 String Filters

Using filters you can reduce the number of non-externalized string warnings 
reported by marking strings that match certain patterns to be ok. The 1.4 (?)
regex syntax is used. This is just a convenience - you could also mark them as 
ok with comments. For example, if you wanted to treat any strings that look 
like XML as ok you might use a filter like:

   <[A-ZA-z][A-Za-z0-9]*>

and then the following line of code would not be marked as an i18n violation 
whether or not it had a supression comment

   s.append("<html>");

Details TBD

2.3 Safe methods

Designating some methods as safe methods will also reduce the number of 
warnings you get. A safe method is one where string literals may be passed 
safely. The classic example of this is resource access like:

   s = resources.getString("Search.Error.title");

It would be better to put a supression comment on each line where this occurs 
so that the reader of the code will not have to wonder about whether this is 
ok or not. However, if that's not practical, then you can add the getString 
method to the safe methods list.

Details TBD
Comment 1 Philipe Mulet CLA 2003-06-03 06:49:56 EDT
Feels like the complexity is quite big, and would almost need to be a companion 
of the compiler, rather than the current simple check performed during compile.

However, these are nice suggestions indeed.
Comment 2 Ed Burnette CLA 2003-06-03 12:13:19 EDT
Well, we thought about that, and tried several programs that were "companions" 
assuming by that you mean "separate programs". However I didn't like any of 
them for several reasons. One was that they weren't integrated as well as the 
2.1 non-externalized string warning. When you turn that one on, you get wavy 
lines under the string in the JDT UI, along with your standard problem 
indicator (with quick fix hook, task, resource decorations, annotations, and 
all that for free). Also you always get it - there's no separate builder to 
either run by hand or enable or configure.

Another thing we wanted to avoid was the overhead of separate programs re-
parsing the code, and the nightmare of getting the classpath right. JDT's 
compiler already parses everything, and it already does the classpath, and it 
already has a mechanism for reporting problems, so it was a natural to add the 
warnings to the compiler itself.

The actual changes to the compiler are minimal. Here's a summary of what we 
did:

   org/eclipse/jdt/internal/compiler/Compiler.java - added field for the 
visitor and a conditional call to unit.traverse
   org/eclipse/jdt/internal/compiler/batch/Main.java - added -checker option 
to define the class of the i18n checking visitor
   org/eclipse/jdt/core/compiler/IProblem.java - added Ids for 4 NLS warnings
   org.orig/eclipse/jdt/internal/compiler/problem/ProblemReporter.java - added 
the 4 NLS warnings
   org/eclipse/jdt/internal/compiler/problem/messages.properties - added 
English text for the 4 warnings

The 'Hunter' class that checks for the i18n warnings is about 800 lines 
currently.

I'm open to better ideas as long as the tight integration with other compiler 
problems is maintained.
Comment 3 Philipe Mulet CLA 2003-06-03 13:36:37 EDT
This would go along the line I had in mind. What I would like is a way to 
abstract these advanced features in some other component, though it is 
desirable that you *just* contribute more problems during standard compile 
process.

Ideally, most of our advanced style warnings could be promoted elsewhere, with 
appropriate callbacks. Though extracting the information could become expensive.

Could you attach your changes as a patch for consideration ?
Comment 4 Ed Burnette CLA 2003-06-06 14:28:44 EDT
Created attachment 5100 [details]
Patch to add new I18n messages to the compiler

In addition, in org/eclipse/jdt/internal/compiler/problem/messages.properties ,
added these lines...

423 = I18N:COS String concatenation 
424 = I18N:LSM Locale sensitive method use
426 = I18N:EMS Embedded String Literal
427 = I18N:EMC Embedded Character Literal
Comment 5 Ed Burnette CLA 2003-06-13 13:47:17 EDT
Created attachment 5188 [details]
Updated, more general patch for extending compiler warnings and changing the batch output format

This patch is against the current HEAD (3.0), and adds a -checker and a
-requestor option, both of which take class names. The -checker class gets a
chance to walk the parse tree and add new problems, and the -requestor class,
if specified, replaces the internal batch requestor class (we use it to
generate xml which is transformed into I18n reports).
Comment 6 Ed Burnette CLA 2003-07-21 16:15:55 EDT
Created attachment 5523 [details]
Repatched for latest HEAD sources (~3.0m2).

Updated with latest source changes and addressed any conflicts.
Comment 7 Ed Burnette CLA 2003-09-18 12:08:12 EDT
Created attachment 6147 [details]
Repatched for latest HEAD sources (~3.0m3)
Comment 8 Ed Burnette CLA 2003-09-18 12:11:16 EDT
I've refreshed the patch and simplified it a little. Changes are minor.

Since the patch we provided is just for the hooks and not the checks 
themselves I've updated the title.
Comment 9 Philipe Mulet CLA 2003-09-25 12:11:11 EDT
The hooks could make sense, however I am afraid of polluting the message 
catalogs/problem IDs with new entries.

You could work around this by using your own problem reporter to deal with 
these... 
Comment 10 Ed Burnette CLA 2003-09-25 12:36:15 EDT
I'm not sure how that would work: How would I hook into DefaultProblemFactory 
to get the right messages (it hardcodes the bundle 
name "org.eclipse.jdt.internal.compiler.problem.messages"), and how would I 
get a superclass of ProblemReporter inserted?
Comment 11 Philipe Mulet CLA 2003-09-25 12:51:48 EDT
I did not mean it was doable at this moment, but it should go into this 
direction. Allowing other to plug in our compiler makes sense from a tool 
standpoint, but having to host their error messages is somewhat trouble. So it 
should be more open to such additions.
Comment 12 Ed Burnette CLA 2003-09-25 12:59:50 EDT
+1 Agreed. If you'd like to create a design for that I'd be happy to modify 
the patch to conform to it.
Comment 13 Philipe Mulet CLA 2004-03-25 06:40:48 EST
Will reconsider post 3.0
Comment 14 Ed Burnette CLA 2004-09-18 22:37:47 EDT
It's post 3.0, can this be reopened? Thx.
Comment 15 Denis Roy CLA 2009-08-30 02:12:46 EDT
As of now 'LATER' and 'REMIND' resolutions are no longer supported.
Please reopen this bug if it is still valid for you.